Comparative analysis of canonical DNA and tRNA stems¶
Task 1 (mandatory)¶
Build models of A-, B-, and Z-form DNA structures using the tools of the 3DNA package. 3DNA is one of the popular software packages for the analysis and simple modeling of nucleic acid structures. It runs on Linux. Those interested can read the detailed description of the package (in PDF format).
Using the fiber program from the 3DNA package, build the A-, B-, and Z-forms of a DNA duplex in which one strand consists of the sequence "gatc" repeated 5 times (you may use another sequence of 4 different nucleotides). Save the A-form duplex in the file gatc-a.pdb, the B-form duplex in gatc-b.pdb, and the Z-form duplex in gatc-z.pdb.
See the hints. Assessment: links to the files from the HTML page and the final test.
Example of running fiber; the environment variable values will persist¶
%set_env PATH=/home/preps/golovin/progs/x3dna-v2.4/bin
%set_env X3DNA=/home/preps/golovin/progs/x3dna-v2.4
#fiber -h
env: PATH=/home/preps/golovin/progs/x3dna-v2.4/bin
env: X3DNA=/home/preps/golovin/progs/x3dna-v2.4
! fiber
===========================================================================
NAME
fiber - generate 56 fiber models based on Arnott and other's work
SYNOPSIS
fiber [OPTION] PDBFILE
DESCRIPTION
generate 56 fiber models based on the repeating unit from Arnott's
work, including the canonical A-, B-, C- and Z-DNA, triplex, etc.
-cif output structure coordinates in mmCIF format
-num a structure identification number in the range (1-56)
-m, -l brief description of the 56 fiber structures
-a, -1 A-DNA model (calf thymus)
-b, -4 B-DNA (calf thymus, default)
-c, -47 C-DNA (BII-type nucleotides)
-d, -48 D(A)-DNA poly d(AT) : poly d(AT) (right-handed)
-z, -15 Z-DNA poly d(GC) : poly d(GC)
-pauling triplex RNA model, based on Pauling and Corey
-rna RNA duplex with arbitrary base sequence
-seq=string specifying an arbitrary base sequence
-rep=number no. of repeats of the sequence specified via -seq
-single output a single-stranded structure
-h this help message (any non-recognized options will do)
INPUT
An structure identification number (or symbol)
EXAMPLES
fiber fiber-BDNA.pdb
# fiber -4 fiber-BDNA.pdb
# fiber -b fiber-BDNA.pdb
fiber -a fiber-ADNA.pdb
fiber -seq=AAAGGUUU -rna fiber-RNA.pdb
fiber -seq=AAAGGUUU -rna -single fiber-ssRNA.pdb
fiber -seq=AATTG -rep=5 B-AATTG-repeat5.pdb
# triplex model (RNA), based on Pauling and Corey
fiber -pauling triplex-C10C10C10.pdb # default: 10 Cs per strand
fiber -pauling -seq=AAA triplex-A3A3A3.pdb # 3 As per strand
fiber -pauling -seq=AAAA:CCCC:GGGG Pauling-triplex-A4C4G4.pdb
# triplex DNA model, with O2' atoms removed
fiber -pauling-dna -seq=ACTT -repeat=6 Pauling-DNA-triplex.pdb
OUTPUT
PDB file
SEE ALSO
analyze, anyhelix, find_pair
AUTHOR
3DNA v2.4.5-2021may19, created and maintained by xiangjun@x3dna.org
LICENSE
Creative Commons Attribution Non-Commercial License (CC BY-NC 4.0)
See http://creativecommons.org/licenses/by-nc/4.0/ for details
NOTE
*** 3DNA HAS BEEN SUPERSEDED BY DSSR ***
Please post questions/comments on the 3DNA Forum: http://forum.x3dna.org/
===========================================================================
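The options listed in the help text above can be combined to produce the three files requested in Task 1. The following is a minimal sketch, not a definitive recipe: the helper names `fiber_command` and `build_all` are hypothetical, and it assumes that `fiber` is on `PATH` and that `X3DNA` is set as in the cells above. Note that the help describes `-z` as a poly d(GC) model, so it is not confirmed here that the Z-form accepts an arbitrary sequence such as GATC; if `fiber` rejects it, a GC repeat may be needed.

```python
import shutil
import subprocess

# Flags for the three forms, taken from the fiber help text above.
FORM_FLAGS = {"a": "-a", "b": "-b", "z": "-z"}


def fiber_command(form, seq="GATC", repeats=5, prefix="gatc"):
    """Return the argument list for one fiber run (nothing is executed)."""
    form = form.lower()
    outfile = f"{prefix}-{form}.pdb"
    return ["fiber", FORM_FLAGS[form], f"-seq={seq}", f"-rep={repeats}", outfile]


def build_all(seq="GATC", repeats=5, prefix="gatc"):
    """Run fiber for the A, B and Z forms, or print the commands if fiber is missing."""
    have_fiber = shutil.which("fiber") is not None
    for form in ("a", "b", "z"):
        cmd = fiber_command(form, seq, repeats, prefix)
        if not have_fiber:
            # Assumption: fiber may be unavailable outside the course server.
            print("would run:", " ".join(cmd))
            continue
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print("fiber failed for form", form, "with code", result.returncode)
```

Calling `build_all()` in a notebook cell either runs the three commands or, when `fiber` is not found, prints them so they can be pasted after `!`.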
Task 2 (mandatory)¶
Compare the generated structure of one of the three DNA forms with an experimentally determined structure of the same form.
Exercise 1. Learn to locate the major and minor grooves. In NGLview, open the file with the generated structure of your choice from Task 1, together with one PDB structure of the same form: 1tne (Z-form), 1bna (B-form), or 3v9d (A-form).
Examine the experimental structure and visually identify the major and minor grooves. Select the nitrogenous base assigned to you at any convenient position in the structure. Determine which atoms of the base clearly face the major groove and which face the minor groove.
Using PyMOL, obtain an image of the base; color the atoms facing the major groove red and those facing the minor groove blue. Check what these atoms are called in the PDB file and give a summary of the following form on the HTML page:
- Atoms facing the major groove: c13.n4, ... (13 is the number of the selected position)
- Atoms facing the minor groove: ...
Copy the following table into your report. Study the structures and enter the data you obtain into the table.
| | A-form | B-form | Z-form |
|---|---|---|---|
| Helix type (right- or left-handed) | | | |
| Helical pitch (Å) | | | |
| Number of bases per turn | | | |
| Major groove width | | | |
| Minor groove width | | | |
When filling in the bottom two rows, state from the phosphate of which nucleotide the groove width was measured.
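Groove widths can be measured interactively in a viewer, but a phosphate-to-phosphate distance can also be checked directly from the coordinates. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the file is assumed to follow the fixed-column PDB layout for ATOM records, and the choice of which two phosphates span a groove is left to you. The raw P-P distance includes the phosphate groups themselves; some conventions subtract about 5.8 Å for their van der Waals radii, so state which value you report.

```python
import math


def phosphate_coords(pdb_text):
    """Map (chain, residue number) to the coordinates of that residue's P atom."""
    coords = {}
    for line in pdb_text.splitlines():
        # Fixed PDB columns: atom name 13-16, chain 22, residue number 23-26,
        # x 31-38, y 39-46, z 47-54 (1-based).
        if line.startswith(("ATOM", "HETATM")) and line[12:16].strip() == "P":
            key = (line[21], int(line[22:26]))
            coords[key] = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return coords


def pp_distance(coords, first, second):
    """Distance in Å between the P atoms of two residues given as (chain, number)."""
    return math.dist(coords[first], coords[second])
```

For example, after reading a file with `open("gatc-b.pdb").read()`, `pp_distance(coords, ("A", 5), ("B", 30))` would give the separation of those two phosphates; the residue choices here are placeholders, not a recommendation.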
Task 3 (mandatory; exercises marked with * must have their results included in the report)¶
Determine the parameters of nucleic acid structures using the programs of the 3DNA package.
Note: the 3DNA package currently works only with the old PDB format. To convert files to the old format, use the remediator program installed on kodomo. Syntax:
! remediator --old XXXX.pdb > XXXX_old.pdb
To analyze nucleic acid structures we will use the programs find_pair and analyze. See the hints.
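The two programs are normally chained: find_pair detects the base pairs and writes an input file, and analyze reads that file and writes its report. The sketch below shows that sequence under stated assumptions: the helper names are hypothetical, the file names are placeholders, and the exact name of the report file produced by analyze should be checked in your working directory rather than taken from this example.

```python
import shutil
import subprocess


def analysis_commands(pdb_old, inp_file):
    """Return the two commands: find_pair PDBFILE OUTFILE, then analyze INPFILE."""
    return [
        ["find_pair", pdb_old, inp_file],
        ["analyze", inp_file],
    ]


def run_analysis(pdb_old, inp_file):
    """Run both steps in order; return True only if every step succeeded."""
    commands = analysis_commands(pdb_old, inp_file)
    if shutil.which("find_pair") is None or shutil.which("analyze") is None:
        # Assumption: 3DNA may be unavailable outside the course server.
        for cmd in commands:
            print("would run:", " ".join(cmd))
        return False
    for cmd in commands:
        if subprocess.run(cmd).returncode != 0:
            return False
    return True
```

In a notebook the same two steps can be typed directly after `!`, one per cell.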
Exercise 1.¶
- Learn to determine the torsion angles of nucleotides.
- Determine the torsion angle values in the assigned tRNA structure; determine which DNA form the strands of this structure most resemble.
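The analyze program reports the torsion angles for you, but it helps to see what is being computed. A torsion (dihedral) angle is defined by four consecutive atoms; for example, the backbone angle delta is defined by C5'-C4'-C3'-O3'. The following is a minimal sketch of the geometry only, with a hypothetical function name `dihedral`; it takes four (x, y, z) points and returns the angle in degrees in the range -180 to 180.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle (degrees) about the p1-p2 bond for atoms p0-p1-p2-p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

Feeding it the coordinates of the four atoms of one nucleotide lets you cross-check a single value from the analyze output.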
Exercise 2. *¶
- Learn to determine the pattern of hydrogen bonds between bases; the report must give the coordinates of the stems.
- Determine the numbers of the nucleotides that form stems in the secondary structure of the assigned tRNA.
- Identify the non-canonical base pairs in the tRNA structure.
- Determine whether the tRNA has additional hydrogen bonds that stabilize its tertiary structure (to do this, examine the complementary pairs that do not belong to stems).
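Once the base pairs are known, grouping them into stems is a mechanical step. The sketch below assumes you already have the secondary structure as a dot-bracket string, however it was obtained; the function name `stems_from_dotbracket` is hypothetical. It handles only nested `(` and `)` pairs, not pseudoknot brackets, and it reports 1-based positions in the string, which may differ from the residue numbers in the PDB file.

```python
def stems_from_dotbracket(structure):
    """Group base pairs into stems; each stem is a list of (i, j) pairs, 1-based."""
    stack, pairs = [], []
    for pos, char in enumerate(structure, start=1):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.append((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])

    pairs.sort()
    stems = []
    for i, j in pairs:
        # A pair continues the current stem if it stacks directly inside
        # the previous pair: (i - 1, j + 1) followed by (i, j).
        if stems and stems[-1][-1] == (i - 1, j + 1):
            stems[-1].append((i, j))
        else:
            stems.append([(i, j)])
    return stems
```

Each returned stem lists its pairs from the outermost to the innermost, so the first and last entries give the stem boundaries.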
Exercise 3.¶
Learn to find possible stacking interactions:
- Open the file XXXX.out with the description of the tRNA structure.
- Find the data on the overlap area of two consecutive base pairs. For the pairs with the largest values, obtain a standard image of the stacking interaction.
It is also useful to:
- compare the images with the maximum and minimum overlap area;
- check the relative orientation of the bases.
Some examples¶
from IPython.display import Image
Image("/home/preps/golovin/test/pic1.png")
Or in Markdown

Adding a new path to conda environments¶
Open a new terminal in Jupyter:
Add the path to the configuration
conda config --append envs_dirs /home/preps/golovin/miniconda3/envs
Add a new ipykernel to the Jupyter list
conda activate na2
python -m ipykernel install --user --name na2 --display-name "Python NA2"
import forgi
Help on function get_parser in module rna_tools.tools.rna_x3dna.rna_x3dna: get_parser()
import matplotlib.pyplot as plt
import forgi.visual.mplotlib as fvm
import forgi
cg = forgi.load_rna("1tra.pdb", allow_many=False)
fvm.plot_rna(cg, text_kwargs={"fontweight": "black"}, lighten=0.7,
             backbone_kwargs={"linewidth": 3})
plt.show()
help(cg)
Help on CoarseGrainRNA in module forgi.threedee.model.coarse_grain object:
class CoarseGrainRNA(forgi.graph.bulge_graph.BulgeGraph)
| CoarseGrainRNA(graph_construction, sequence, name=None, infos=None, _dont_split=False)
|
| A coarse grain model of RNA structure based on the
| bulge graph representation.
|
| Each stem is represented by four parameters (two endpoints)
| and two twist vetors pointing towards the centers of the base
| pairs at each end of the helix.
|
| Method resolution order:
| CoarseGrainRNA
| forgi.graph.bulge_graph.BulgeGraph
| forgi.graph._basegraph.BaseGraph
| builtins.object
|
| Methods defined here:
|
| __init__(self, graph_construction, sequence, name=None, infos=None, _dont_split=False)
| Initialize the new structure.
|
| add_all_virtual_residues(self)
| Calls ftug.add_virtual_residues() for all stems of this RNA.
|
| .. note::
| Don't forget to call this again if you changed the structure of the RNA,
| to avoid leaving it in an inconsistent state.
|
| .. warning::
| Virtual residues are only added to stems, not to loop regions.
| The position of residues in loops is much more flexible, which is why virtual
| residue positions for loops usually do not make sense. If the file
| was loaded from the PDB, residue positions from the PDB file are
| stored already.
|
| add_bulge_coords_from_stems(self)
| Add the information about the starts and ends of the bulges (i and m elements).
| The stems have to be created beforehand.
|
| This is called during loading of the RNA structure from pdb and from cg files.
|
| after_coordinates_changed(self)
|
| coords_from_directions(self, directions)
| Generate coordinates from direction vectors (using also their lengths)
|
| Currently ignores the twists!
|
| :param directions: An array of vectors from the side of a cg-element with lower nucleotide number to the side with higher number
| The array is sorted by the corresponding element names alphabetically (`sorted(defines.keys()`)
|
| coords_to_directions(self)
| The directions of each coarse grain element. One line per cg-element.
|
| The array is sorted by the corresponding element names alphabetically (`sorted(defines.keys()`)
| The directions usually point away from the elemnt's lowest nucleotide.
| However h,t and f elements always point away from the connected stem.
|
| element_physical_distance(self, element1, element2)
| Calculate the physical distance between two coarse grain elements.
|
| :param element1: The name of the first element (e.g. 's1')
| :param element2: The name of the first element (e.g. 's2')
| :return: The closest distance between the two elements.
|
| get_bulge_angle_stats(self, bulge)
| Return the angle stats for a particular bulge. These stats describe
| the relative orientation of the two stems that it connects.
|
| :param bulge: The name of the bulge.
| :param connections: The two stems that are connected by it.
| :return: The angle statistics in one direction and angle statistics in
| the other direction
|
| get_bulge_angle_stats_core(self, elem, forward=True)
| Return the angle stats for a particular bulge. These stats describe the
| relative orientation of the two stems that it connects.
|
| :param elem: The name of the bulge.
| :param connections: The two stems that are connected by it.
| :return: ftms.AngleStat object
|
| get_coordinates_array(self)
| Get all of the coordinates in one large array.
|
| The coordinates are sorted in the order of the keys
| in coordinates dictionary.
|
| :return: A 2D numpy array containing all coordinates
|
| get_loop_stat(self, d)
| Return the statistics for this loop.
|
| These stats describe the relative orientation of the loop to the stem
| to which it is attached.
|
| :param d: The name of the loop
|
| get_ordered_stem_poss(self)
|
| get_ordered_virtual_residue_poss(self, return_elements=False)
| Get the coordinates of all stem's virtual residues in a consistent order.
|
| This is used for RMSD calculation.
| If no virtual_residue_positions are known, self.add_all_virtual_residues() is called
| automatically.
|
| :param return_elements: In addition to the positions, return a list with
| the cg-elements these coordinates belong to
| :returns: A numpy array.
|
| get_poss_for_domain(self, elements, mode='vres')
| Get an array of coordinates only for the elements specified.
|
| ..note::
|
| This code is still experimental in the current version of forgi.
|
| :param elements: A list of coarse grain element names.
|
| get_stacking_helices(self, method='Tyagi')
| EXPERIMENTAL
|
| Return all helices (longer stacking regions) as sets.
|
| Two stems and one bulge are in a stacking relation, if self.is_stacking(bulge) is true and the stems are connected to the bulge.
| Further more, a stem is in a stacking relation with itself.
| A helix is the transitive closure this stacking relation.
|
| :returns: A list of sets of element names.
|
| get_stats(self, d)
| Calls get_loop_stat/ get_bulge_angle_stats or get_stem_stats, depending on the element d.
|
| :returns: A 1- or 2 tuple of stats (2 in case of bulges. One for each direction)
|
| get_stem_stats(self, stem)
| Calculate the statistics for a stem and return them. These statistics will describe the
| length of the stem as well as how much it twists.
|
| :param stem: The name of the stem.
|
| :return: A StemStat structure containing the above information.
|
| get_twists(self, node)
| Get the array of twists for this node. If the node is a stem,
| then the twists will simply those stored in the array.
| If the node is an interior loop or a junction segment,
| then the twists will be the ones that are adjacent to it,
| projected to the plane normal to the element vector.
| If the node is a hairpin loop or a free end, then the same twist
| will be duplicated and returned twice.
|
| :param node: The name of the node
|
| get_virtual_residue(self, pos, allow_single_stranded=False)
| Get the virtual residue position in the global coordinate system
| for the nucleotide at position pos (1-based)
|
| :param pos: A 1-based nucleotide number
| :param allow_single_stranded: If True and pos is not in a stem, return a
| rough estimate for the residue position instead of raising an error.
| Currenly, for non-stem elements, these positions are on the axis of the cg-element.
|
| is_stacking(self, bulge, method='Tyagi', verbose=False)
| EXPERIMENTAL
|
| Reports, whether the stems connected by the given bulge are coaxially stacking.
|
| :param bulge: STRING. Name of a interior loop or multiloop (e.g. "m3")
| :param method": STRING. "Tyagi": Use cutoffs from doi:10.1261/rna.305307, PMCID: PMC1894924.
| :returns: A BOOLEAN.
|
| load_coordinates_array(self, coords)
| Read in an array of coordinates (as may be produced by get_coordinates_array)
| and replace the coordinates of this structure with it.
|
| :param coords: A 2D array of coordinates
| :return: self
|
| longrange_iterator(self, filter_connected=False)
| Iterate over all long range interactions in this molecule.
|
| :param filter_connected: Filter interactions that are between elements
| which are connected (mostly meaning multiloops
| which connect to the same end of the same stem)
| :return: A generator yielding long-range interaction tuples (i.e. ('s7', 'i2'))
|
| radius_of_gyration(self, method='vres')
| Calculate the radius of gyration of this structure.
|
| :param method: A STRING. one of
| "fast" (use only coordinates of coarse grained stems) or
| "vres" (use virtual residue coordinates of stems)
|
| :return: A number with the radius of gyration of this structure.
|
| reset_vatom_cache(self, key)
| Delete all cached information about virtual residues and virtual atoms.
| Used as on_call function for the observing of the self.coords dictionary.
|
| :param key: A coarse grain element name, e.g. "s1" or "m15"
|
| rotate(self, angle, axis='x', unit='radians')
|
| rotate_translate(self, offset, rotation_matrix)
| First translate the RNA by offset, then rotate by rotation matrix
|
| sorted_edges_for_mst(self)
| Keep track of all linked nodes. Used for the generation of the minimal spanning tree.
|
| This overrides the function in bulge graph and adds an additional sorting criterion
| with lowest priority.
| Elements that have no entry in self.sampled should be preferedly broken.
| This should ensure that the minimal spanning tree is the same after saving
| and loading an RNA to/from a file, if changes of the minimal spanning tree
| were performed by ernwin.
|
| stem_angle(self, stem1, stem2)
| Returns the angle between two stems.
|
| If they are connected via a single element,
| use the direction pointing away from this element for both stems.
| Otherwise, use the direction from start to end.
|
| stem_offset(self, ref_stem, stem2)
| How much is the offset between the start of stem 2
| and the axis of stem1.
|
| Assumes that stem1 and stem 2 are connected by a single bulge.
| Then the start of stem2 is defined to be the stem side
| closer to the bulge.
|
| steric_value(self, elements, method='r**-2')
| Estimate, how difficult a set of elements was to build,
| by counting the atom density around the center of these elements
|
| to_cg_string(self)
| Output this structure in string form.
|
| to_file(self, filename)
|
| total_length(self)
| Calculate the combined length of all the elements.
|
| virtual_atoms(self, key)
| Get virtual atoms for a key.
|
| :param key: An INTEGER: The number of the base in the RNA.
|
| :returns: A dict {atom:coords}, e.g. {"C8":np.array([x,y,z]), ...}
|
| ----------------------------------------------------------------------
| Class methods defined here:
|
| from_bg_string(cg_string) from builtins.type
| Populate this structure from the string
| representation of a graph.
|
| from_pdb(pdb_filename, load_chains=None, remove_pseudoknots=False, dissolve_length_one_stems=True, secondary_structure=None, filetype='pdb', annotation_tool=None, query_PDBeChem=False) from builtins.type
| :param load_chains: A list of chain_ids or None (all chains)
| :param secondary_structure: Only useful if we load only 1 component
| :param filetype: One of 'pdb' or 'cif'
| :param query_PDBeChem: If true, query the PDBeChem database whenever a
| modified residue with unknown 3-letter code
| is encountered.
| :param annotation_tool: One of "DSSR", "MC-Annotate", "forgi" or None.
| If this is None, we take the value of the configuration
| file (run forgi_config.py to create a config file).
| If no config file is given either, we see what tools are
| installed, preferring the newer DSSR over MC-Annotate and
| falling back to the fogi implementation if neither is in
| the PATH variable.
| If a string is given or the configuration file set,
| we never fall back to a different option but raise an
| error, if the requested tool is unavailable.
|
| ----------------------------------------------------------------------
| Readonly properties defined here:
|
| incomplete_elements
|
| interacting_elements
|
| transformed
|
| ----------------------------------------------------------------------
| Methods inherited from forgi.graph.bulge_graph.BulgeGraph:
|
| add_info(self, key, value)
|
| adjacent_stem_pairs_iterator(self)
| Iterate over all pairs of stems which are separated by some element.
|
| This will always yield triples of the form (s1, e1, s2) where s1 and
| s2 are the stem identifiers and e1 denotes the element that separates
| them.
|
| are_adjacent_stems(self, s1, s2, multiloops_count=True)
| Are two stems separated by only one element. If multiloops should not
| count as edges, then the appropriate parameter should be set.
|
| :param s1: The name of the first stem
| :param s2: The name of the second stem
| :param multiloops_count: Whether to count multiloops as an edge linking
| two stems
|
| buildorder_of(self, element)
| Returns the index into build_order where the element FIRST appears.
|
| :param element: Element name, a string. e.g. "m0" or "s0"
| :returns: An index into self.build_order or None, if the element is not
| part of the build_order (e.g. hairpin loops)
|
| connected(self, n1, n2)
| Are the nucleotides n1 and n2 connected?
|
| :param n1: A node in the BulgeGraph
| :param n2: Another node in the BulgeGraph
| :return: True or False indicating whether they are connected.
|
| connected_stem_iterator(self)
| Iterate over all pairs of connected stems.
|
| connection_ends(self, connection_type)
| Find out which ends of the stems are connected by a particular angle
| type.
|
| :param connection_type: The angle type, as determined by which corners
| of a stem are connected
| :return: (s1e, s2b) 0 means the side of the stem with the lowest nucleotide, 1 the other side
|
| connection_type(self, define, connections)
| Classify the way that two stems are connected according to the type
| of bulge that separates them.
|
| Potential angle types for single stranded segments, and the ends of
| the stems they connect:
|
| = = ====== ===========
| 1 2 (1, 1) #pseudoknot
| 1 0 (1, 0)
| 3 2 (0, 1)
| 3 0 (0, 0)
| = = ====== ===========
|
| :param define: The name of the bulge separating the two stems
| :param connections: The two stems and their separation
|
| :returns: INT connection type
|
| = ======================================================================
| + positive values mean forward (from the connected stem starting at the
| lower nucleotide number to the one starting at the higher nuc. number)
| - negative values mean backwards.
| 1 interior loop
| 2 first multi-loop segment of normal multiloops and most pseudoknots
| 3 middle segment of a normal multiloop
| 4 last segment of normal multiloops and most pseudoknots
| 5 middle segments of pseudoknots
| = ======================================================================
|
| define_a(self, elem)
| Returns the element define using the adjacent nucleotides (if present).
|
| If there is no adjacent nucleotide due to a chain break or chain end,
| then the nucleotide that is part of the element will be used instead.
|
| For stems, this always returns a list of length 4,
| for interior loops a list of length 2 or 4 and
| for other elements always a list of length 2.
|
|
| :param elem: An element name
| :returns: A list of integers
|
| define_residue_num_iterator(self, node, adjacent=False, seq_ids=False)
| Iterate over the residue numbers that belong to this node.
|
| :param node: The name of the node
|
| describe_multiloop(self, multiloop)
| :param multiloop: An iterable of nodes (only "m", "t" and "f" elements)
|
| element_length(self, key)
| Get the number of residues that are contained within this element.
|
| :param key: The name of the element.
|
| elements_to_nucleotides(self, elements)
| Convert a list of element names to a list of nucleotide numbers.
|
| Remove redundant entries.
|
| find_bulge_loop(self, vertex, max_length=4)
| Find a set of nodes that form a loop containing the
| given vertex and being no greater than max_length nodes long.
|
| :param vertex: The vertex to start the search from.
| :param max_length: Only fond loops that contain no more then this many elements
| :returns: A list of the nodes in the loop.
|
| find_mlonly_multiloops(self)
|
| flanking_nuc_at_stem_side(self, s, side)
| Return the nucleotide number that is next to the stem at the given stem side.
|
| :param side: 0, 1, 2 or 3, as returned by self._get_sides_plus
| :returns: The nucleotide position. If the stem has no neighbor at that side,
| 0 or self.seq_length+1 is returned instead.
|
| floop_iterator(self)
| Yield the name of the 5' prime unpaired region if it is
| present in the structure.
|
| get_angle_type(self, bulge, allow_broken=False)
| Return what type of angle this bulge is, based on the way this
| would be built using a breadth-first traversal along the minimum
| spanning tree.
|
| :param allow_broken: How to treat broken multiloop segments.
|
| * False (default): Return None
| * True: Return the angle type according to the build-order
| (i.e. from the first built stem to the last-built stem)
|
| get_bulge_dimensions(self, bulge, with_missing=False)
| Return the dimensions of the bulge.
|
| If it is single stranded it will be (x, -1) for h,t,f or (x, 1000) for m.
| Otherwise it will be (x, y).
|
| :param bulge: The name of the bulge.
| :return: A pair containing its dimensions
|
| get_connected_residues(self, s1, s2, bulge=None)
| Get the nucleotides which are connected by the element separating
| s1 and s2. They should be adjacent stems.
|
| :param s1, s2: 2 adjacent stems
| :param bulge: Optional: The bulge seperating the two stems.
| If s1 and s2 are connected by more than one element,
| this has to be given, or a ValueError will be raised.
| (useful for pseudoknots)
|
| The connected nucleotides are those which are spanned by a single
| interior loop or multiloop. In the case of an interior loop, this
| function will return a list of two tuples and in the case of multiloops
| if it will be a list of one tuple.
|
| If the two stems are not separated by a single element, then return
| an empty list.
|
| get_define_seq_str(self, elem, adjacent=False)
| Get a list containing the sequences for the given define.
|
| :param d: The element name for which to get the sequences
| :param adjacent: Boolean. Include adjacent nucleotides (for single stranded RNA only)
| :return: A list containing the sequence(s) corresponding to the defines
|
| get_domains(self)
| Get secondary structure domains.
|
| Currently domains found are:
| * multiloops (without any connected stems)
| * rods: stretches of stems + interior loops (without branching), with trailing hairpins
| * pseudoknots
|
| get_elem(self, position)
| Get the secondary structure element from a nucleotide position
|
| :param position: An integer or a fgr.RESID instance, describing the nucleotide number.
|
| get_flanking_handles(self, bulge_name, side=0)
| Get the indices of the residues for fitting bulge regions.
|
| So if there is a loop like so (between residues 7 and 16)::
|
| (((...))))
| 7890123456
| ^ ^
|
| Then residues 9 and 13 will be used as the handles against which
| to align the fitted region.
|
| In the fitted region, the residues (2,6) will be the ones that will
| be aligned to the handles.
|
| :return: (orig_chain_res1, orig_chain_res1, flanking_res1, flanking_res2)
|
| get_flanking_region(self, bulge_name, side=0)
| If a bulge is flanked by stems, return the lowest residue number
| of the previous stem and the highest residue number of the next
| stem.
|
| :param bulge_name: The name of the bulge
| :param side: The side of the bulge (indicating the strand)
|
| get_flanking_sequence(self, bulge_name, side=0)
| Return the sequence of a bulge and the adjacent strand of the adjacent stems.
|
| :param bulge_name: The name of the bulge, e.g. 'h0'
| :param side: Used for interior loops: The strand of interest (0=forward, 1=backward)
|
| get_length(self, vertex)
| Get the minimum length of a vertex.
|
| If it's a stem, then the result is its length (in base pairs).
|
| If it's a bulge, then the length is the smaller of it's dimensions.
|
| :param vertex: The name of the vertex.
|
| get_link_direction(self, stem1, stem2, bulge=None)
| Get the direction in which stem1 and stem2 are linked (by the bulge)
|
| :returns: 1 if the bulge connects stem1 with stem2 in forward direction (5' to 3')
| -1 otherwise
|
| get_mst(self)
| Create a minimum spanning tree from this BulgeGraph. This is useful
| for constructing a structure where each section of a multiloop is
| sampled independently and we want to introduce a break at the largest
| multiloop section.
|
| get_multiloop_nucleotides(self, multiloop_loop)
| Return a list of nucleotides which make up a particular
| multiloop.
|
| :param multiloop_loop: The elements which make up this multiloop
| :return: A list of nucleotides
|
| get_multiloop_side(self, m)
| Find out which strand a multiloop is on. An example of a situation in
| which the loop can be on both sides can be seen in the three-stemmed
| structure below:
|
| (.().().)
|
| In this case, the first multiloop section comes off of the 5' strand of
| the first stem (the prior stem is always the one with a lower numbered
| first residue). The second multiloop section comess of the 3' strand of
| the second stem and the third loop comes off the 3' strand of the third
| stem.
|
| get_next_ml_segment(self, ml_segment)
| Get the adjacent multiloop-segment (or 3' loop) next to the 3' side of ml_segment.
|
| If there is no other single stranded RNA after the stem, the backbone must end there.
| In that case return None.
|
| get_node_dimensions(self, node, with_missing=False)
| Return the dimensions of a node.
|
| If the node is a stem, then the dimensions will be l where l is
| the length of the stem.
|
| Otherwise, see get_bulge_dimensions(node)
|
| :param node: The name of the node
| :return: A pair containing its dimensions
|
| get_node_from_residue_num(self, base_num)
| USE get_elem instead.
|
| get_position_in_element(self, resnum)
| Return the position of the residue in the cg-element and the length of the element.
|
| :param resnum: An integer. The 1-based position in the total sequence.
| :returns: A tuple (p,l) where p is the position of the residue in the cg-element
| (0-based for stems, 1-based for loops) and p/l gives a measure for the position
| of the residue along the cg-element's axis (0 means at cg.coords[elem][0],
| 1 at cg.coords[elem][1] and 0.5 exactely in the middle of these two. )
|
| get_resseqs(self, define, seq_ids=True)
| Return the pdb ids of the nucleotides in this define.
|
| :param define: The name of this element.
| :param: Return a tuple of two arrays containing the residue ids
| on each strand
|
| get_side_nucleotides(self, stem, side)
| Get the nucleotide numbers on the given side of
| them stem. Side 0 corresponds to the 5' end of the
| stem whereas as side 1 corresponds to the 3' side
| of the stem.
|
| :param stem: The name of the stem
| :param side: Either 0 or 1, indicating the 5' or 3' end of the stem
| :return: A tuple of the nucleotide numbers on the given side of
| the stem.
|
| get_sides(self, s1, b)
| Get the side of s1 that is next to b.
|
| s1e -> s1b -> b
|
| :param s1: The stem.
| :param b: The bulge.
| :return: A tuple indicating which side is the one next to the bulge
| and which is away from the bulge.
|
| get_stem_edge(self, stem, pos)
| Returns the side (strand) of the stem that position is on.
|
| Side 0 corresponds to the 5' pairing residues in the
| stem whereas as side 1 corresponds to the 3' pairing
| residues in the stem.
| :param stem: The name of the stem
| :param pos: A position in the stem
| :return: 0 if pos on 5' edge of stem
|
| get_strand(self, multiloop)
| Get the strand on which this multiloop is located.
|
| :param multiloop: The name of the multiloop
| :return: 0 for being on the lower numbered strand and 1 for
| being on the higher numbered strand.
|
| has_connection(self, v1, v2)
| Is there an edge between these two nodes
|
| hloop_iterator(self)
| Iterator over all of the hairpin in the structure.
|
| iloop_iterator(self)
| Iterator over all of the interior loops in the structure.
|
| is_loop_pseudoknot(self, loop)
| Is a particular loop a pseudoknot?
|
| :param loop: A list of elements that are part of the loop (only m,f and t elements).
|
| :return: Either True or false
|
| is_single_stranded(self, node)
| Does this node represent a single-stranded region?
|
| Single stranded regions are five-prime and three-prime unpaired
| regions, multiloops, and hairpins
|
| .. warning::
| Interior loops are never considered single stranded by this function.
|
| :param node: The name of the node
| :return: True if yes, False if no
|
| iter_elements_along_backbone(self, startpos=1)
| Iterate all coarse grained elements along the backbone.
|
| Note that stems are yielded twice (for forward and backward strand).
| Interior loops may be yielded twice or once (if one side has no nucleotide)
|
| 0-length multiloop-segments are correctly yielded.
|
| :param startpos: The nucleotide position at which to start
| :yields: Coarse grained element names, like "s0", "i0"
|
| iterate_over_seqid_range(self, start_id, end_id)
| Iterate over the seq_ids between the start_id and end_id.
|
| length_one_stem_basepairs(self)
| Return a list of basepairs that correspond to length-1 stems.
|
| log(self, level=10)
|
| min_max_bp_distance(self, e1, e2)
| Get the minimum and maximum base pair distance between
| these two elements.
|
| If they are connected, the minimum distance will be 1.
| The maximum will be 1 + length(e1) + length(e1)
|
| :param e1: The name of the first element
| :param e2: The name of the second element
| :return: A tuple containing the minimum and maximum distance between
| the two elements.
|
| mloop_iterator(self)
| Iterator over all of the multiloops in the structure.
|
| nucleotides_to_elements(self, nucleotides)
| Convert a list of nucleotides (nucleotide numbers) to element names.
|
| Remove redundant entries and return a set.
|
| .. note::
| Use `self.get_node_from_residue_num` if you have only a single nucleotide number.
|
| pairing_partner(self, nucleotide_number)
| Return the base pairing partner of the nucleotide at position
| nucleotide_number. If this nucleotide is unpaired, return None.
|
| :param nucleotide_number: The position of the query nucleotide in the
| sequence or a RESID instance.
| :return: The number of the nucleotide base paired with the one at
| position nucleotide_number.
|
| pseudoknotted_basepairs(self, ignore_basepairs=())
| Return a list of base-pairs that will be removed to
| remove pseudoknots using the knotted2nested.py script.
|
| :param ignore_basepairs: An optional list of basepairs that
| knotted2nested will not consider present
| in the structure.
| :return: A list of base-pairs that can be removed.
|
| random_subgraph(self, subgraph_length=None)
| Return a random subgraph of this graph.
|
| :return: A list containing the nodes comprising a random subgraph
|
| seq_id_to_pos(self, seq_id)
| Convert a pdb seq_id to a 1-based nucleotide position
|
| :param seq_id: An instance of RESID
|
| set_angle_types(self)
| Fill in the angle types based on the build order
|
| shortest_bg_loop(self, vertex)
| Find a shortest loop containing this node. The vertex should
| be a multiloop.
|
| :param vertex: The name of the vertex to find the loop.
| :return: A list containing the elements in the shortest cycle.
|
| shortest_mlonly_multiloop(self, ml_segment)
|
| shortest_path(self, e1, e2)
| Determine the shortest path between two elements (e1, e2)
| along the secondary structure.
|
| :param e1: The name of the first element
| :param e2: The name of the second element
| :return: A list of the element names along the shortest path
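The idea of a shortest path between two coarse-grained elements can be illustrated without forgi. The sketch below runs a breadth-first search over a hand-written adjacency dictionary. The function name `shortest_element_path` and the toy graph are hypothetical, and breadth-first search is used here only for illustration; the docstring does not say which algorithm forgi itself uses.

```python
from collections import deque


def shortest_element_path(adjacency, start, goal):
    """Return the shortest list of element names from start to goal, or None."""
    previous = {start: None}  # element -> element it was reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency[node]:
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical element graph: 5' end, stem, interior loop, stem, hairpin, 3' end.
toy_graph = {
    "f0": ["s0"],
    "s0": ["f0", "i0", "t0"],
    "i0": ["s0", "s1"],
    "s1": ["i0", "h0"],
    "h0": ["s1"],
    "t0": ["s0"],
}

print(shortest_element_path(toy_graph, "f0", "h0"))
```

With a real BulgeGraph the element names would come from the graph itself rather than from a hand-written dictionary.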
|
| sorted_element_iterator(self)
| Iterate over a list of the coarse grained elements sorted by the lowest numbered
| nucleotide in each stem. Multiloops with no nucleotide coordinates come last.
|
| sorted_stem_iterator(self)
| Iterate over a list of the stems sorted by the lowest numbered
| nucleotide in each stem.
|
| ss_distance(self, e1, e2)
| Calculate the distance between two elements (e1, e2)
| along the secondary structure. The distance only starts
| at the edge of each element, and is the closest distance
| between the two elements.
|
| :param e1: The name or nucleotide number of the first element
| :param e2: The name or nucleotide number of the second element
| :return: The integer distance between the two elements / residues along the secondary
| structure. (If an element is given, we use its corner for the distance, otherwise the exact nucleotide.)
|
| stem_bp_iterator(self, stem, seq_ids=False)
| Iterate over all the base pairs in the stem.
|
| stem_iterator(self)
| Iterator over all of the stems in the structure.
|
| stem_length(self, key)
| Get the length of a particular element. If it's a stem, it's equal to
| the number of paired bases. If it's an interior loop, it's equal to the
| number of unpaired bases on the strand with fewer unpaired bases. If
| it's a multiloop, then it's the number of unpaired bases.
|
| stem_resn_to_stem_vres_side(self, stem, res)
|
| stem_side_vres_to_resn(self, stem, side, vres)
| Return the residue number given the stem name, the strand (side) it's on
| and the virtual residue number.
|
| tloop_iterator(self)
| Yield the name of the 3' unpaired region if it is
| present in the structure.
|
| to_bg_string(self)
| Output a string representation that can be stored and reloaded.
|
| to_bpseq_string(self)
| Create a bpseq string from this structure.
|
| to_dotbracket_string(self, include_missing=False)
| Convert the BulgeGraph representation to a dot-bracket string
| and return it.
|
| :return: A dot-bracket representation of this BulgeGraph
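To make the dot-bracket output concrete, here is a minimal sketch that builds a dot-bracket string from a pair table (a list whose element 0 is the length and whose element i is the partner of nucleotide i, or 0 if unpaired). The function `pair_table_to_dotbracket` is a hypothetical helper, not part of forgi, and it assumes a nested structure without pseudoknots.

```python
def pair_table_to_dotbracket(pair_table):
    """Convert a pair table (index 0 holds the length) to dot-bracket notation."""
    length = pair_table[0]
    chars = []
    for pos in range(1, length + 1):
        partner = pair_table[pos]
        if partner == 0:
            chars.append(".")  # unpaired nucleotide
        elif partner > pos:
            chars.append("(")  # partner lies downstream: opening bracket
        else:
            chars.append(")")  # partner lies upstream: closing bracket
    return "".join(chars)


print(pair_table_to_dotbracket([5, 5, 4, 0, 2, 1]))
```

Pseudoknotted structures would need additional bracket types, which this sketch does not attempt.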
|
| to_element_string(self, with_numbers=False)
| Create a string similar to dotbracket notation that identifies what
| type of element is present at each location.
|
| For example the following dotbracket:
|
| ..((..))..
|
| Should yield the following element string:
|
| ffsshhsstt
|
| Indicating that it begins with a fiveprime region, continues with a
| stem, has a hairpin after the stem, the stem continues and it is terminated
| by a threeprime region.
|
| :param with_numbers: show the last digit of the element id in a second line.::
|
| (((.(((...))))))
|
| Could result in::
|
| sssissshhhssssss
| 0000111000111000
|
| Indicating that the first stem is named 's0', followed by 'i0',
| 's1', 'h0', the second strand of 's1' and the second strand of 's0'
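The mapping from dot-bracket to element letters can be reproduced for simple cases with a short stand-alone sketch. The function `element_string` below is hypothetical and is not forgi's algorithm: it assumes balanced, pseudoknot-free input and classifies an unpaired nucleotide by how many base pairs are directly enclosed by the pair around it (none: hairpin, one: interior loop, more: multiloop).

```python
def element_string(dotbracket):
    """Label each position as f, t, s, h, i or m (simplified sketch)."""
    stack = []      # indices of currently open brackets
    children = {}   # opening index -> number of directly enclosed pairs
    enclosing = {}  # unpaired index -> opening index of the enclosing pair
    paired = []     # indices of all paired positions, in order
    for k, ch in enumerate(dotbracket):
        if ch == "(":
            if stack:
                children[stack[-1]] += 1
            children[k] = 0
            stack.append(k)
            paired.append(k)
        elif ch == ")":
            stack.pop()
            paired.append(k)
        else:
            enclosing[k] = stack[-1] if stack else None

    labels = []
    for k, ch in enumerate(dotbracket):
        if ch in "()":
            labels.append("s")
        elif enclosing[k] is None:
            # Exterior nucleotides: 5' end, 3' end, or between two stems.
            if not paired or k < paired[0]:
                labels.append("f")
            elif k > paired[-1]:
                labels.append("t")
            else:
                labels.append("m")
        else:
            n = children[enclosing[k]]
            labels.append("h" if n == 0 else "i" if n == 1 else "m")
    return "".join(labels)


print(element_string("..((..)).."))
print(element_string("(((.(((...))))))"))
```

Both docstring examples are reproduced by this sketch; structures with pseudoknots or other special cases may be labelled differently by forgi.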
|
| to_fasta_string(self, include_missing=False)
| Output the BulgeGraph representation as a FASTA string of the
| format::
|
| >id
| AACCCAA
| ((...))
|
| :param include_missing: Whether or not residues for which no structure
| information is present should be included in the output.
|
| to_neato_string(self)
|
| to_networkx(self)
| Convert this graph to a networkx representation. This representation
| will contain all of the nucleotides as nodes and all of the base pairs
| as edges as well as the adjacent nucleotides.
|
| to_pair_table(self)
| Create a pair table from the list of elements.
|
| The first element in the returned list indicates the number of
| nucleotides in the structure.
|
| e.g. [5,5,4,0,2,1]
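The pair table layout is easy to check by hand. The following sketch builds one from a dot-bracket string using a stack of open brackets. `dotbracket_to_pair_table` is a hypothetical helper written for illustration, not a forgi function, and it handles only round brackets.

```python
def dotbracket_to_pair_table(dotbracket):
    """Return [n, p1, ..., pn], where pi is the partner of nucleotide i (0 if unpaired)."""
    n = len(dotbracket)
    table = [n] + [0] * n
    stack = []  # 1-based positions of unmatched opening brackets
    for pos, ch in enumerate(dotbracket, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            opener = stack.pop()
            table[opener] = pos
            table[pos] = opener
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return table


print(dotbracket_to_pair_table("((.))"))
```

For the five-nucleotide structure `((.))` this gives the same list as the example in the docstring.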
|
| to_pair_tuples(self, remove_basepairs=None)
| Create a list of tuples corresponding to all of the base pairs in the
| structure. Unpaired bases will be shown as being paired with a
| nucleotide numbered 0.
|
| i.e. [(1,5),(2,4),(3,0),(4,2),(5,1)]
|
| :param remove_basepairs: A list of 2-tuples containing
| basepairs that should be removed
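A pair table and the list of pair tuples carry the same information. The sketch below converts one into the other and shows one plausible reading of `remove_basepairs`: both partners of a removed pair are reported as unpaired. `pair_table_to_tuples` is a hypothetical helper, and that treatment of removed pairs is an assumption rather than documented forgi behaviour.

```python
def pair_table_to_tuples(pair_table, remove_basepairs=None):
    """Return [(position, partner), ...] with partner 0 for unpaired nucleotides."""
    removed = set()
    for a, b in (remove_basepairs or []):
        # Store both orientations so either partner can be looked up.
        removed.add((a, b))
        removed.add((b, a))
    tuples = []
    for pos in range(1, pair_table[0] + 1):
        partner = pair_table[pos]
        if (pos, partner) in removed:
            partner = 0  # assumption: a removed pair becomes two unpaired bases
        tuples.append((pos, partner))
    return tuples


print(pair_table_to_tuples([5, 5, 4, 0, 2, 1]))
print(pair_table_to_tuples([5, 5, 4, 0, 2, 1], remove_basepairs=[(2, 4)]))
```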
|
| traverse_graph(self)
| Traverse the graph to get the angle types. The angle type depends on
| which corners of the stem are connected by the multiloop or internal
| loop.
|
| :returns: A list of triples (stem, loop, stem)
|
| ----------------------------------------------------------------------
| Class methods inherited from forgi.graph.bulge_graph.BulgeGraph:
|
| from_bg_file(bg_file) from builtins.type
| Load a BulgeGraph from a file containing a text-based representation.
|
| :param bg_file: The filename.
| :return: A bulge Graph.
|
| from_bpseq_str(bpseq_str, breakpoints=(), name=None, dissolve_length_one_stems=False, remove_pseudoknots=False) from builtins.type
| Create the graph from a string listing the base pairs.
|
| The string should be formatted like so:
|
| 1 G 115
| 2 A 0
| 3 A 0
| 4 U 0
| 5 U 112
| 6 G 111
|
| :param bpseq_str: The string, containing newline characters.
| :param breakpoints: A list of positions, after which there is a backbone break.
| :return: A new BulgeGraph object.
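The bpseq layout (position, base, partner position, with 0 meaning unpaired) can be read with a few lines of plain Python. The parser below is a hypothetical sketch and not the forgi implementation. It uses a complete toy record, because the docstring fragment above refers to partners (111 to 115) that lie outside the lines shown.

```python
def parse_bpseq(bpseq_str):
    """Return (sequence, partners), where partners maps position -> partner (0 if unpaired)."""
    sequence = []
    partners = {}
    for line in bpseq_str.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        pos, base, partner = line.split()
        sequence.append(base)
        partners[int(pos)] = int(partner)
    # Every pair must be listed from both sides.
    for pos, partner in partners.items():
        if partner != 0 and partners.get(partner) != pos:
            raise ValueError("pair %d-%d is not symmetric" % (pos, partner))
    return "".join(sequence), partners


toy_bpseq = "1 G 5\n2 A 0\n3 A 0\n4 U 0\n5 C 1\n"
print(parse_bpseq(toy_bpseq))
```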
|
| from_ct_string(ct_string, dissolve_length_one_stems=False, remove_pseudoknots=False) from builtins.type
| Create the graph from a string holding a connectivity table.
| See http://x3dna.org/highlights/dssr-derived-secondary-structure-in-ct-format
|
| from_dotbracket(dotbracket_str, seq=None, name=None, dissolve_length_one_stems=False, remove_pseudoknots=False) from builtins.type
| Create a BulgeGraph object from a dotbracket string.
|
| :param dotbracket_str: A string
| :param seq: A string, with the same length as the dotbracket string,
| a forgi.graph.sequence.Sequence instance or None.
| If it is None, the sequence will be all 'N's
| :param name: Optional string to use as molecule name.
|
| from_fasta(filename, dissolve_length_one_stems=False) from builtins.type
| Return a list of BulgeGraphs from a fasta file.
|
| from_fasta_text(fasta_text, dissolve_length_one_stems=False, remove_pseudoknots=False) from builtins.type
| Create one or more Bulge Graphs from some fasta text.
|
| :returns: A list of BulgeGraphs
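The three-line record format shown under `to_fasta_string` (header, sequence, dot-bracket) can be split into records as follows. `parse_fasta_text` is a hypothetical helper that returns plain tuples instead of BulgeGraph objects, and it assumes each record occupies exactly three non-empty lines.

```python
def parse_fasta_text(fasta_text):
    """Return a list of (name, sequence, dotbracket) tuples."""
    lines = [ln.strip() for ln in fasta_text.splitlines() if ln.strip()]
    records = []
    for i in range(0, len(lines), 3):
        chunk = lines[i:i + 3]
        if len(chunk) != 3 or not chunk[0].startswith(">"):
            raise ValueError("malformed record starting at line %d" % (i + 1))
        name, seq, dotbracket = chunk[0][1:], chunk[1], chunk[2]
        if len(seq) != len(dotbracket):
            raise ValueError("sequence and structure lengths differ in %r" % name)
        records.append((name, seq, dotbracket))
    return records


print(parse_fasta_text(">id\nAACCCAA\n((...))\n"))
```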
|
| ----------------------------------------------------------------------
| Readonly properties inherited from forgi.graph.bulge_graph.BulgeGraph:
|
| backbone_breaks_after
|
| junctions
| Get all regular multiloops of this structure.
|
| :return: A list of tuples of multiloop segments.
| Each tuple contains the segments of one regular
| (i.e. not pseudoknotted) multiloop.
|
| rods
|
| seq
|
| seq_length
|
| ----------------------------------------------------------------------
| Methods inherited from forgi.graph._basegraph.BaseGraph:
|
| connections(self, bulge)
| :param g: Graph-like: A BulgeGraph or BulgeGraphConstruction.
|
| define_range_iterator(self, node, adjacent=False)
| Return the ranges of the nucleotides in the define.
|
| In other words, if a define contains the following: [1,2,7,8]
| The ranges will be [1,2] and [7,8].
|
| :param adjacent: Use the nucleotides in the neighboring element which
| connect to this element as the range starts and ends.
| :return: A list of two-element lists
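The example in the docstring amounts to cutting a flat list into consecutive start/end pairs. A minimal sketch, assuming a define is always a flat list of even length as the example suggests (`define_ranges` is a hypothetical name, not a forgi function):

```python
def define_ranges(define):
    """Split a flat define such as [1, 2, 7, 8] into [[1, 2], [7, 8]]."""
    if len(define) % 2 != 0:
        raise ValueError("a define must contain start/end pairs")
    return [list(define[i:i + 2]) for i in range(0, len(define), 2)]


print(define_ranges([1, 2, 7, 8]))
```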
|
| flanking_nucleotides(self, d)
| Return the nucleotides directly flanking an element.
|
| :param d: the name of the element
| :return: a list of nucleotides
|
| ----------------------------------------------------------------------
| Data descriptors inherited from forgi.graph._basegraph.BaseGraph:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)