Comparative analysis of canonical DNA and tRNA stems¶

Task 1 (required)¶

  • Build models of the A-, B-, and Z-form DNA structures using the tools of the 3DNA package. 3DNA is one of the popular software packages for the analysis and simple modeling of nucleic acid structures. It runs under Linux. Those interested can read the detailed description of the package (PDF).

  • Using the fiber program from the 3DNA package, build the A-, B-, and Z-forms of a DNA duplex in which the sequence of one strand is the sequence "gatc" repeated 5 times (another sequence of 4 different nucleotides is also acceptable). Save the A-form duplex in gatc-a.pdb, the B-form duplex in gatc-b.pdb, and the Z-form duplex in gatc-z.pdb.

  • See the hints. Assessment: links to the files from the HTML page and the final test.

Example of running fiber; the environment variable values set here persist in later cells¶

In [1]:
%set_env PATH=/home/preps/golovin/progs/x3dna-v2.4/bin
%set_env X3DNA=/home/preps/golovin/progs/x3dna-v2.4
#fiber -h
env: PATH=/home/preps/golovin/progs/x3dna-v2.4/bin
env: X3DNA=/home/preps/golovin/progs/x3dna-v2.4
In [2]:
! fiber
===========================================================================
NAME
        fiber - generate 56 fiber models based on Arnott and other's work
SYNOPSIS
        fiber [OPTION] PDBFILE
DESCRIPTION
        generate 56 fiber models based on the repeating unit from Arnott's
        work, including the canonical A-, B-, C- and Z-DNA, triplex, etc.
        -cif     output structure coordinates in mmCIF format
        -num     a structure identification number in the range (1-56)
        -m, -l   brief description of the 56 fiber structures
        -a, -1   A-DNA model (calf thymus)
        -b, -4   B-DNA (calf thymus, default)
        -c, -47  C-DNA (BII-type nucleotides)
        -d, -48  D(A)-DNA  poly d(AT) : poly d(AT) (right-handed)
        -z, -15  Z-DNA poly d(GC) : poly d(GC)
        -pauling triplex RNA model, based on Pauling and Corey
        -rna     RNA duplex with arbitrary base sequence
        -seq=string specifying an arbitrary base sequence
        -rep=number no. of repeats of the sequence specified via -seq
        -single  output a single-stranded structure
        -h       this help message (any non-recognized options will do)
INPUT
        An structure identification number (or symbol)
EXAMPLES
        fiber fiber-BDNA.pdb
            # fiber -4 fiber-BDNA.pdb
            # fiber -b fiber-BDNA.pdb
        fiber -a fiber-ADNA.pdb
        fiber -seq=AAAGGUUU -rna fiber-RNA.pdb
        fiber -seq=AAAGGUUU -rna -single fiber-ssRNA.pdb
        fiber -seq=AATTG -rep=5 B-AATTG-repeat5.pdb
            # triplex model (RNA), based on Pauling and Corey
        fiber -pauling triplex-C10C10C10.pdb  # default: 10 Cs per strand
        fiber -pauling -seq=AAA triplex-A3A3A3.pdb  # 3 As per strand
        fiber -pauling -seq=AAAA:CCCC:GGGG Pauling-triplex-A4C4G4.pdb
            # triplex DNA model, with O2' atoms removed
        fiber -pauling-dna -seq=ACTT -repeat=6 Pauling-DNA-triplex.pdb
OUTPUT
        PDB file
SEE ALSO
        analyze, anyhelix, find_pair
AUTHOR
        3DNA v2.4.5-2021may19, created and maintained by xiangjun@x3dna.org
LICENSE
        Creative Commons Attribution Non-Commercial License (CC BY-NC 4.0)
        See http://creativecommons.org/licenses/by-nc/4.0/ for details
NOTE
        *** 3DNA HAS BEEN SUPERSEDED BY DSSR ***

Please post questions/comments on the 3DNA Forum: http://forum.x3dna.org/
===========================================================================
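
The three runs required in Task 1 could be scripted roughly as follows. This is a minimal sketch that uses only flags listed in the help text above (-a, -b, -z, -seq, -rep). The helper name fiber_command is hypothetical, and it assumes PATH and X3DNA are set as in the first cell. The help describes -z as a poly d(GC) model, so whether fiber honors an arbitrary -seq for the Z-form is an assumption to verify against the course hints and the generated file.

```python
import shutil
import subprocess

# Form flags copied from the fiber help text.
FORM_FLAGS = {"a": "-a", "b": "-b", "z": "-z"}


def fiber_command(form, sequence, repeats, outfile):
    """Return the argument list for one fiber run (nothing is executed here)."""
    # The sequence is uppercased to match the examples in the help text.
    return ["fiber", FORM_FLAGS[form], f"-seq={sequence.upper()}",
            f"-rep={repeats}", outfile]


commands = [fiber_command(f, "gatc", 5, f"gatc-{f}.pdb") for f in ("a", "b", "z")]

for cmd in commands:
    print(" ".join(cmd))
    # Execute only where fiber is actually available on PATH.
    if shutil.which("fiber"):
        result = subprocess.run(cmd, check=False)
        print("exit code:", result.returncode)
```

If the Z-form run rejects the sequence, the commands are still printed, so they can be adjusted by hand.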

Task 2 (required)¶

  • Compare a generated structure of one of the three DNA forms with an experimentally determined structure of the same form. Ex. 1. Learn to find the major and minor grooves. Open in NGLview the generated structure you chose from Task 1 together with one of the following PDB structures of the same form: 1tne (Z-form); 1bna (B-form); 3v9d (A-form).

  • Examine the experimental structure and visually identify the major and minor grooves. Select the nitrogenous base assigned to you at any convenient position in the structure. Determine which atoms of the base clearly face the major groove and which face the minor groove.

  • Using PyMol, produce an image of the base, coloring the atoms that face the major groove red and those that face the minor groove blue. Look up how these atoms are named in the PDB file and give a summary of the following form on the HTML page:

    • Atoms facing the major groove: c13.n4, ... (13 is the number of the chosen position)
    • Atoms facing the minor groove: ...
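
Looking up the atom names of the chosen base can be done by reading the fixed columns of the ATOM records. The sketch below assumes the standard PDB column layout (atom name in columns 13-16, residue name in 18-20, chain in 22, residue number in 23-26). The function names and the toy records are made up for illustration, and deciding which atoms face which groove remains a visual judgment.

```python
def residue_atoms(pdb_lines, chain, resseq):
    """Return (residue name, [atom names]) for one residue of a PDB file."""
    resname, atoms = None, []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        if line[21] == chain and int(line[22:26]) == resseq:
            resname = line[17:20].strip()
            atoms.append(line[12:16].strip())
    return resname, atoms


def summary_line(groove, label, atoms):
    """Format the report line, e.g. label 'c13' and atoms ['N4']."""
    names = ", ".join(f"{label}.{a.lower()}" for a in atoms)
    return f"Atoms facing the {groove} groove: {names}"


# Toy records (truncated after the residue number), not taken from a real file.
sample = [
    "ATOM    250  N4   DC A  13",
    "ATOM    251  O2   DC A  13",
    "ATOM    270  N1   DG A  14",
]

print(residue_atoms(sample, "A", 13))
print(summary_line("major", "c13", ["N4"]))
```

For a real file, pass `open("1bna.pdb").read().splitlines()` (or your own structure) instead of the toy list.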

Copy the following table into your report. Study the structures and enter the data you obtain into the table.

Parameter                             A-form    B-form    Z-form
Helix type (right- or left-handed)
Helical pitch (Å)
Number of bases per turn
Major groove width (Å)
Minor groove width (Å)

When filling in the two bottom rows, state from which nucleotide's phosphate the groove width was measured.
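
One way to get a number for the two bottom rows is to measure the distance between the phosphorus atoms of two nucleotides on opposite strands across the groove. The sketch below assumes the standard PDB coordinate columns (x, y, z in columns 31-54). The helper names and the toy coordinates are invented for illustration, and the choice of which two phosphates to pair is left to you.

```python
import math


def atom_coords(pdb_lines, chain, resseq, atom_name):
    """Return (x, y, z) of the first matching atom in the PDB records."""
    for line in pdb_lines:
        if (line.startswith(("ATOM", "HETATM"))
                and line[12:16].strip() == atom_name
                and line[21] == chain
                and int(line[22:26]) == resseq):
            return (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    raise KeyError(f"{atom_name} of {chain}{resseq} not found")


def pp_distance(pdb_lines, first, second):
    """Distance in Å between the P atoms of two (chain, resseq) nucleotides."""
    return math.dist(atom_coords(pdb_lines, *first, "P"),
                     atom_coords(pdb_lines, *second, "P"))


def pdb_atom_line(serial, name, resname, chain, resseq, x, y, z):
    """Build a toy ATOM record with fields in the standard columns."""
    return (f"ATOM  {serial:>5} {' ' + name:<4} {resname:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Made-up coordinates chosen so that the distance is easy to check by hand.
toy = [
    pdb_atom_line(1, "P", "DG", "A", 5, 0.0, 0.0, 0.0),
    pdb_atom_line(2, "P", "DC", "B", 20, 3.0, 4.0, 12.0),
]

print(round(pp_distance(toy, ("A", 5), ("B", 20)), 2))
```

Recording the two (chain, residue number) pairs alongside the distance also answers the request to state which phosphate the width was measured from.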

Task 3 (required; results of the exercises marked * must be included in the report)¶

Determining the parameters of nucleic acid structures with the programs of the 3DNA package.

Note! The 3DNA package currently works only with the old PDB format. To convert files to the old format, use the remediator program installed on kodomo. Syntax:

   ! remediator --old XXXX.pdb > XXXX_old.pdb

To analyze nucleic acid structures we will use the find_pair and analyze programs. See the hints.
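
The two steps can be assembled into shell commands like this. The remediator line repeats the syntax given above. The find_pair line, with its `stdout` argument piped into analyze, follows the usage commonly shown in the 3DNA documentation and should be treated as an assumption to check against the course hints. The function names are hypothetical.

```python
import shlex


def remediator_command(pdb_in, pdb_old):
    """Shell command that converts a PDB file to the old format."""
    return f"remediator --old {shlex.quote(pdb_in)} > {shlex.quote(pdb_old)}"


def find_pair_analyze_command(pdb_old):
    """Shell command that pipes the base pairs found by find_pair into analyze."""
    return f"find_pair {shlex.quote(pdb_old)} stdout | analyze"


print(remediator_command("XXXX.pdb", "XXXX_old.pdb"))
print(find_pair_analyze_command("XXXX_old.pdb"))
```

In a notebook cell each printed command can be run by prefixing it with `!`. The exercises below refer to the resulting .out file.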

Ex. 1.¶

  • Learn to determine the torsion angles of nucleotides.
  • Determine the torsion angle values in the assigned tRNA structure; determine which DNA form the strands of this structure resemble most.
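
A torsion angle is defined by four consecutive atoms; for example, the backbone angle δ is defined by C5'-C4'-C3'-O3'. The analyze program reports these angles, and the sketch below only illustrates the underlying geometry with plain Python. The function name and the test points are made up for illustration.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p0, p1, p2, p3):
    """Torsion angle in degrees (-180..180) around the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Toy points: the outer bonds are perpendicular when viewed along the z axis.
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Feeding it the coordinates of the four atoms of a given angle, read from the PDB file, should reproduce the value in the analyze output up to rounding.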

Ex. 2. *¶

  • Learn to determine the pattern of hydrogen bonds between bases; the report must give the coordinates of the stems.
  • Determine the numbers of the nucleotides that form stems in the secondary structure of the assigned tRNA.
  • Identify the non-canonical base pairs in the tRNA structure.
  • Determine whether the tRNA has additional hydrogen bonds that stabilize its tertiary structure (for this, examine the complementary pairs that do not belong to stems).
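
If the secondary structure is available as a dot-bracket string, the nucleotide numbers of the stems can be extracted by grouping consecutive stacked pairs. The sketch below is a plain-Python illustration, not the method used by find_pair or forgi. The function names and the toy string are made up, and numbering is 1-based along the string, so it has to be mapped onto PDB residue numbers separately.

```python
def pair_table(dot_bracket):
    """Map each 5' partner to its 3' partner (1-based positions)."""
    stack, pairs = [], {}
    for i, ch in enumerate(dot_bracket, start=1):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced closing bracket")
            pairs[stack.pop()] = i
    if stack:
        raise ValueError("unbalanced opening bracket")
    return pairs


def find_stems(dot_bracket):
    """Return stems as (5' start, 5' end, 3' start, 3' end) tuples."""
    pairs = pair_table(dot_bracket)
    groups = []
    for i in sorted(pairs):
        j = pairs[i]
        # A pair extends the current stem if it stacks directly on the previous one.
        if groups and groups[-1][-1] == (i - 1, j + 1):
            groups[-1].append((i, j))
        else:
            groups.append([(i, j)])
    return [(g[0][0], g[-1][0], g[-1][1], g[0][1]) for g in groups]


print(find_stems("((..((...))..))"))
```

Pairs reported by find_pair that fall outside every stem found this way are candidates for the tertiary interactions asked about in the last bullet.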

Ex. 3.¶

Learn to find possible stacking interactions:

  • Open the file XXXX.out with the characteristics of the tRNA structure.
  • Find the data on the overlap area of two consecutive base pairs. For the pairs with the largest values, produce a standard image of the stacking interaction.

It is also useful to:

  • compare the images with the maximum and minimum overlap area;
  • check the mutual orientation of the bases.

Some examples¶

In [2]:
from IPython.display import Image
Image("/home/preps/golovin/test/pic1.png")
Out[2]:
[image: pic1.png]

Or with Markdown:

![mymegapic](pic1.png)

Adding a new path to the conda environments¶

  • Open a new terminal in Jupyter.

  • Add the path to the conda settings:

       conda config --append envs_dirs /home/preps/golovin/miniconda3/envs

  • Add the new ipykernel to the Jupyter kernel list:

       conda activate na2
       python -m ipykernel install --user --name na2 --display-name "Python NA2"

In [2]:
import forgi
In [10]:
 
Help on function get_parser in module rna_tools.tools.rna_x3dna.rna_x3dna:

get_parser()

In [7]:
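import matplotlib.pyplot as plt
import forgi
import forgi.visual.mplotlib as fvm

# Load the tRNA structure; allow_many=False requests a single RNA object
cg = forgi.load_rna("1tra.pdb", allow_many=False)
# Draw the secondary structure with bold labels and a thicker backbone line
fvm.plot_rna(cg, text_kwargs={"fontweight": "black"}, lighten=0.7,
             backbone_kwargs={"linewidth": 3})
plt.show()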
[image: secondary structure plot of 1tra.pdb]
In [5]:
help(cg)
Help on CoarseGrainRNA in module forgi.threedee.model.coarse_grain object:

class CoarseGrainRNA(forgi.graph.bulge_graph.BulgeGraph)
 |  CoarseGrainRNA(graph_construction, sequence, name=None, infos=None, _dont_split=False)
 |  
 |  A coarse grain model of RNA structure based on the
 |  bulge graph representation.
 |  
 |  Each stem is represented by four parameters (two endpoints)
 |  and two twist vetors pointing towards the centers of the base
 |  pairs at each end of the helix.
 |  
 |  Method resolution order:
 |      CoarseGrainRNA
 |      forgi.graph.bulge_graph.BulgeGraph
 |      forgi.graph._basegraph.BaseGraph
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, graph_construction, sequence, name=None, infos=None, _dont_split=False)
 |      Initialize the new structure.
 |  
 |  add_all_virtual_residues(self)
 |      Calls ftug.add_virtual_residues() for all stems of this RNA.
 |      
 |      .. note::
 |         Don't forget to call this again if you changed the structure of the RNA,
 |         to avoid leaving it in an inconsistent state.
 |      
 |      .. warning::
 |         Virtual residues are only added to stems, not to loop regions.
 |         The position of residues in loops is much more flexible, which is why virtual
 |         residue positions for loops usually do not make sense. If the file
 |         was loaded from the PDB, residue positions from the PDB file are
 |         stored already.
 |  
 |  add_bulge_coords_from_stems(self)
 |      Add the information about the starts and ends of the bulges (i and m elements).
 |      The stems have to be created beforehand.
 |      
 |      This is called during loading of the RNA structure from pdb and from cg files.
 |  
 |  after_coordinates_changed(self)
 |  
 |  coords_from_directions(self, directions)
 |      Generate coordinates from direction vectors (using also their lengths)
 |      
 |      Currently ignores the twists!
 |      
 |      :param directions: An array of vectors from the side of a cg-element with lower nucleotide number to the side with higher number
 |                         The array is sorted by the corresponding element names alphabetically (`sorted(defines.keys()`)
 |  
 |  coords_to_directions(self)
 |      The directions of each coarse grain element. One line per cg-element.
 |      
 |      The array is sorted by the corresponding element names alphabetically (`sorted(defines.keys()`)
 |      The directions usually point away from the elemnt's lowest nucleotide.
 |      However h,t and f elements always point away from the connected stem.
 |  
 |  element_physical_distance(self, element1, element2)
 |      Calculate the physical distance between two coarse grain elements.
 |      
 |      :param element1: The name of the first element (e.g. 's1')
 |      :param element2: The name of the first element (e.g. 's2')
 |      :return: The closest distance between the two elements.
 |  
 |  get_bulge_angle_stats(self, bulge)
 |      Return the angle stats for a particular bulge. These stats describe
 |      the relative orientation of the two stems that it connects.
 |      
 |      :param bulge: The name of the bulge.
 |      :param connections: The two stems that are connected by it.
 |      :return: The angle statistics in one direction and angle statistics in
 |               the other direction
 |  
 |  get_bulge_angle_stats_core(self, elem, forward=True)
 |      Return the angle stats for a particular bulge. These stats describe the
 |      relative orientation of the two stems that it connects.
 |      
 |      :param elem: The name of the bulge.
 |      :param connections: The two stems that are connected by it.
 |      :return: ftms.AngleStat object
 |  
 |  get_coordinates_array(self)
 |      Get all of the coordinates in one large array.
 |      
 |      The coordinates are sorted in the order of the keys
 |      in coordinates dictionary.
 |      
 |      :return: A 2D numpy array containing all coordinates
 |  
 |  get_loop_stat(self, d)
 |      Return the statistics for this loop.
 |      
 |      These stats describe the relative orientation of the loop to the stem
 |      to which it is attached.
 |      
 |      :param d: The name of the loop
 |  
 |  get_ordered_stem_poss(self)
 |  
 |  get_ordered_virtual_residue_poss(self, return_elements=False)
 |      Get the coordinates of all stem's virtual residues in a consistent order.
 |      
 |      This is used for RMSD calculation.
 |      If no virtual_residue_positions are known, self.add_all_virtual_residues() is called
 |      automatically.
 |      
 |      :param return_elements: In addition to the positions, return a list with
 |                              the cg-elements these coordinates belong to
 |      :returns: A numpy array.
 |  
 |  get_poss_for_domain(self, elements, mode='vres')
 |      Get an array of coordinates only for the elements specified.
 |      
 |      ..note::
 |      
 |          This code is still experimental in the current version of forgi.
 |      
 |      :param elements: A list of coarse grain element names.
 |  
 |  get_stacking_helices(self, method='Tyagi')
 |      EXPERIMENTAL
 |      
 |      Return all helices (longer stacking regions) as sets.
 |      
 |      Two stems and one bulge are in a stacking relation, if self.is_stacking(bulge) is true and the stems are connected to the bulge.
 |      Further more, a stem is in a stacking relation with itself.
 |      A helix is the transitive closure this stacking relation.
 |      
 |      :returns: A list of sets of element names.
 |  
 |  get_stats(self, d)
 |      Calls get_loop_stat/ get_bulge_angle_stats or get_stem_stats, depending on the element d.
 |      
 |      :returns: A 1- or 2 tuple of stats (2 in case of bulges. One for each direction)
 |  
 |  get_stem_stats(self, stem)
 |      Calculate the statistics for a stem and return them. These statistics will describe the
 |      length of the stem as well as how much it twists.
 |      
 |      :param stem: The name of the stem.
 |      
 |      :return: A StemStat structure containing the above information.
 |  
 |  get_twists(self, node)
 |      Get the array of twists for this node. If the node is a stem,
 |      then the twists will simply those stored in the array.
 |      If the node is an interior loop or a junction segment,
 |      then the twists will be the ones that are adjacent to it,
 |      projected to the plane normal to the element vector.
 |      If the node is a hairpin loop or a free end, then the same twist
 |      will be duplicated and returned twice.
 |      
 |      :param node: The name of the node
 |  
 |  get_virtual_residue(self, pos, allow_single_stranded=False)
 |      Get the virtual residue position in the global coordinate system
 |      for the nucleotide at position pos (1-based)
 |      
 |      :param pos: A 1-based nucleotide number
 |      :param allow_single_stranded: If True and pos is not in a stem, return a
 |            rough estimate for the residue position instead of raising an error.
 |            Currenly, for non-stem elements, these positions are on the axis of the cg-element.
 |  
 |  is_stacking(self, bulge, method='Tyagi', verbose=False)
 |      EXPERIMENTAL
 |      
 |      Reports, whether the stems connected by the given bulge are coaxially stacking.
 |      
 |      :param bulge: STRING. Name of a interior loop or multiloop (e.g. "m3")
 |      :param method": STRING. "Tyagi": Use cutoffs from doi:10.1261/rna.305307, PMCID: PMC1894924.
 |      :returns: A BOOLEAN.
 |  
 |  load_coordinates_array(self, coords)
 |      Read in an array of coordinates (as may be produced by get_coordinates_array)
 |      and replace the coordinates of this structure with it.
 |      
 |      :param coords: A 2D array of coordinates
 |      :return: self
 |  
 |  longrange_iterator(self, filter_connected=False)
 |      Iterate over all long range interactions in this molecule.
 |      
 |      :param filter_connected: Filter interactions that are between elements
 |                               which are connected (mostly meaning multiloops
 |                               which connect to the same end of the same stem)
 |      :return: A generator yielding long-range interaction tuples (i.e. ('s7', 'i2'))
 |  
 |  radius_of_gyration(self, method='vres')
 |      Calculate the radius of gyration of this structure.
 |      
 |      :param method: A STRING. one of
 |                     "fast" (use only coordinates of coarse grained stems) or
 |                     "vres" (use virtual residue coordinates of stems)
 |      
 |      :return: A number with the radius of gyration of this structure.
 |  
 |  reset_vatom_cache(self, key)
 |      Delete all cached information about virtual residues and virtual atoms.
 |      Used as on_call function for the observing of the self.coords dictionary.
 |      
 |      :param key: A coarse grain element name, e.g. "s1" or "m15"
 |  
 |  rotate(self, angle, axis='x', unit='radians')
 |  
 |  rotate_translate(self, offset, rotation_matrix)
 |      First translate the RNA by offset, then rotate by rotation matrix
 |  
 |  sorted_edges_for_mst(self)
 |      Keep track of all linked nodes. Used for the generation of the minimal spanning tree.
 |      
 |      This overrides the function in bulge graph and adds an additional sorting criterion
 |      with lowest priority.
 |      Elements that have no entry in self.sampled should be preferedly broken.
 |      This should ensure that the minimal spanning tree is the same after saving
 |      and loading an RNA to/from a file, if changes of the minimal spanning tree
 |      were performed by ernwin.
 |  
 |  stem_angle(self, stem1, stem2)
 |      Returns the angle between two stems.
 |      
 |      If they are connected via a single element,
 |      use the direction pointing away from this element for both stems.
 |      Otherwise, use the direction from start to end.
 |  
 |  stem_offset(self, ref_stem, stem2)
 |      How much is the offset between the start of stem 2
 |      and the axis of stem1.
 |      
 |      Assumes that stem1 and stem 2 are connected by a single bulge.
 |      Then the start of stem2 is defined to be the stem side
 |      closer to the bulge.
 |  
 |  steric_value(self, elements, method='r**-2')
 |      Estimate, how difficult a set of elements was to build,
 |      by counting the atom density around the center of these elements
 |  
 |  to_cg_string(self)
 |      Output this structure in string form.
 |  
 |  to_file(self, filename)
 |  
 |  total_length(self)
 |      Calculate the combined length of all the elements.
 |  
 |  virtual_atoms(self, key)
 |      Get virtual atoms for a key.
 |      
 |      :param key: An INTEGER: The number of the base in the RNA.
 |      
 |      :returns: A dict {atom:coords}, e.g. {"C8":np.array([x,y,z]), ...}
 |  
 |  ----------------------------------------------------------------------
 |  Class methods defined here:
 |  
 |  from_bg_string(cg_string) from builtins.type
 |      Populate this structure from the string
 |      representation of a graph.
 |  
 |  from_pdb(pdb_filename, load_chains=None, remove_pseudoknots=False, dissolve_length_one_stems=True, secondary_structure=None, filetype='pdb', annotation_tool=None, query_PDBeChem=False) from builtins.type
 |      :param load_chains: A list of chain_ids or None (all chains)
 |      :param secondary_structure: Only useful if we load only 1 component
 |      :param filetype: One of 'pdb' or 'cif'
 |      :param query_PDBeChem: If true, query the PDBeChem database whenever a
 |                      modified residue with unknown 3-letter code
 |                      is encountered.
 |      :param annotation_tool: One of "DSSR", "MC-Annotate", "forgi" or None.
 |                      If this is None, we take the value of the configuration
 |                      file (run forgi_config.py to create a config file).
 |                      If no config file is given either, we see what tools are
 |                      installed, preferring the newer DSSR over MC-Annotate and
 |                      falling back to the fogi implementation if neither is in
 |                      the PATH variable.
 |                      If a string is given or the configuration file set,
 |                      we never fall back to a different option but raise an
 |                      error, if the requested tool is unavailable.
 |  
 |  ----------------------------------------------------------------------
 |  Readonly properties defined here:
 |  
 |  incomplete_elements
 |  
 |  interacting_elements
 |  
 |  transformed
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from forgi.graph.bulge_graph.BulgeGraph:
 |  
 |  add_info(self, key, value)
 |  
 |  adjacent_stem_pairs_iterator(self)
 |      Iterate over all pairs of stems which are separated by some element.
 |      
 |      This will always yield triples of the form (s1, e1, s2) where s1 and
 |      s2 are the stem identifiers and e1 denotes the element that separates
 |      them.
 |  
 |  are_adjacent_stems(self, s1, s2, multiloops_count=True)
 |      Are two stems separated by only one element. If multiloops should not
 |      count as edges, then the appropriate parameter should be set.
 |      
 |      :param s1: The name of the first stem
 |      :param s2: The name of the second stem
 |      :param multiloops_count: Whether to count multiloops as an edge linking
 |                               two stems
 |  
 |  buildorder_of(self, element)
 |      Returns the index into build_order where the element FIRST appears.
 |      
 |      :param element: Element name, a string. e.g. "m0" or "s0"
 |      :returns: An index into self.build_order or None, if the element is not
 |                part of the build_order (e.g. hairpin loops)
 |  
 |  connected(self, n1, n2)
 |      Are the nucleotides n1 and n2 connected?
 |      
 |      :param n1: A node in the BulgeGraph
 |      :param n2: Another node in the BulgeGraph
 |      :return: True or False indicating whether they are connected.
 |  
 |  connected_stem_iterator(self)
 |      Iterate over all pairs of connected stems.
 |  
 |  connection_ends(self, connection_type)
 |      Find out which ends of the stems are connected by a particular angle
 |      type.
 |      
 |      :param connection_type: The angle type, as determined by which corners
 |                              of a stem are connected
 |      :return: (s1e, s2b) 0 means the side of the stem with the lowest nucleotide, 1 the other side
 |  
 |  connection_type(self, define, connections)
 |      Classify the way that two stems are connected according to the type
 |      of bulge that separates them.
 |      
 |      Potential angle types for single stranded segments, and the ends of
 |      the stems they connect:
 |      
 |      =   = ======  ===========
 |      1   2 (1, 1)  #pseudoknot
 |      1   0 (1, 0)
 |      3   2 (0, 1)
 |      3   0 (0, 0)
 |      =   = ======  ===========
 |      
 |      :param define: The name of the bulge separating the two stems
 |      :param connections: The two stems and their separation
 |      
 |      :returns: INT connection type
 |      
 |                =   ======================================================================
 |                +   positive values mean forward (from the connected stem starting at the
 |                    lower nucleotide number to the one starting at the higher nuc. number)
 |                -   negative values mean backwards.
 |                1   interior loop
 |                2   first multi-loop segment of normal multiloops and most pseudoknots
 |                3   middle segment of a normal multiloop
 |                4   last segment of normal multiloops and most pseudoknots
 |                5   middle segments of pseudoknots
 |                =   ======================================================================
 |  
 |  define_a(self, elem)
 |      Returns the element define using the adjacent nucleotides (if present).
 |      
 |      If there is no adjacent nucleotide due to a chain break or chain end,
 |      then the nucleotide that is part of the element will be used instead.
 |      
 |      For stems, this always returns a list of length 4,
 |      for interior loops a list of length 2 or 4 and
 |      for other elements always a list of length 2.
 |      
 |      
 |      :param elem: An element name
 |      :returns: A list of integers
 |  
 |  define_residue_num_iterator(self, node, adjacent=False, seq_ids=False)
 |      Iterate over the residue numbers that belong to this node.
 |      
 |      :param node: The name of the node
 |  
 |  describe_multiloop(self, multiloop)
 |      :param multiloop: An iterable of nodes (only "m", "t" and "f" elements)
 |  
 |  element_length(self, key)
 |      Get the number of residues that are contained within this element.
 |      
 |      :param key: The name of the element.
 |  
 |  elements_to_nucleotides(self, elements)
 |      Convert a list of element names to a list of nucleotide numbers.
 |      
 |      Remove redundant entries.
 |  
 |  find_bulge_loop(self, vertex, max_length=4)
 |      Find a set of nodes that form a loop containing the
 |      given vertex and being no greater than max_length nodes long.
 |      
 |      :param vertex: The vertex to start the search from.
 |      :param max_length: Only fond loops that contain no more then this many elements
 |      :returns: A list of the nodes in the loop.
 |  
 |  find_mlonly_multiloops(self)
 |  
 |  flanking_nuc_at_stem_side(self, s, side)
 |      Return the nucleotide number that is next to the stem at the given stem side.
 |      
 |      :param side: 0, 1, 2 or 3, as returned by self._get_sides_plus
 |      :returns: The nucleotide position. If the stem has no neighbor at that side,
 |                0 or self.seq_length+1 is returned instead.
 |  
 |  floop_iterator(self)
 |      Yield the name of the 5' prime unpaired region if it is
 |      present in the structure.
 |  
 |  get_angle_type(self, bulge, allow_broken=False)
 |      Return what type of angle this bulge is, based on the way this
 |      would be built using a breadth-first traversal along the minimum
 |      spanning tree.
 |      
 |      :param allow_broken: How to treat broken multiloop segments.
 |      
 |                           * False (default): Return None
 |                           * True: Return the angle type according to the build-order
 |                             (i.e. from the first built stem to the last-built stem)
 |  
 |  get_bulge_dimensions(self, bulge, with_missing=False)
 |      Return the dimensions of the bulge.
 |      
 |      If it is single stranded it will be (x, -1) for h,t,f or (x, 1000) for m.
 |      Otherwise it will be (x, y).
 |      
 |      :param bulge: The name of the bulge.
 |      :return: A pair containing its dimensions
 |  
 |  get_connected_residues(self, s1, s2, bulge=None)
 |      Get the nucleotides which are connected by the element separating
 |      s1 and s2. They should be adjacent stems.
 |      
 |      :param s1, s2: 2 adjacent stems
 |      :param bulge: Optional: The bulge seperating the two stems.
 |                    If s1 and s2 are connected by more than one element,
 |                    this has to be given, or a ValueError will be raised.
 |                    (useful for pseudoknots)
 |      
 |      The connected nucleotides are those which are spanned by a single
 |      interior loop or multiloop. In the case of an interior loop, this
 |      function will return a list of two tuples and in the case of multiloops
 |      if it will be a list of one tuple.
 |      
 |      If the two stems are not separated by a single element, then return
 |      an empty list.
 |  
 |  get_define_seq_str(self, elem, adjacent=False)
 |      Get a list containing the sequences for the given define.
 |      
 |      :param d: The element name for which to get the sequences
 |      :param adjacent: Boolean. Include adjacent nucleotides (for single stranded RNA only)
 |      :return: A list containing the sequence(s) corresponding to the defines
 |  
 |  get_domains(self)
 |      Get secondary structure domains.
 |      
 |      Currently domains found are:
 |        * multiloops (without any connected stems)
 |        * rods: stretches of stems + interior loops (without branching), with trailing hairpins
 |        * pseudoknots
 |  
 |  get_elem(self, position)
 |      Get the secondary structure element from a nucleotide position
 |      
 |      :param position: An integer or a fgr.RESID instance, describing the nucleotide number.
 |  
 |  get_flanking_handles(self, bulge_name, side=0)
 |      Get the indices of the residues for fitting bulge regions.
 |      
 |      So if there is a loop like so (between residues 7 and 16)::
 |      
 |        (((...))))
 |        7890123456
 |          ^   ^
 |      
 |      Then residues 9 and 13 will be used as the handles against which
 |      to align the fitted region.
 |      
 |      In the fitted region, the residues (2,6) will be the ones that will
 |      be aligned to the handles.
 |      
 |      :return: (orig_chain_res1, orig_chain_res1, flanking_res1, flanking_res2)
 |  
 |  get_flanking_region(self, bulge_name, side=0)
 |      If a bulge is flanked by stems, return the lowest residue number
 |      of the previous stem and the highest residue number of the next
 |      stem.
 |      
 |      :param bulge_name: The name of the bulge
 |      :param side: The side of the bulge (indicating the strand)
 |  
 |  get_flanking_sequence(self, bulge_name, side=0)
 |      Return the sequence of a bulge and the adjacent strand of the adjacent stems.
 |      
 |      :param bulge_name: The name of the bulge, e.g. 'h0'
 |      :param side: Used for interior loops: The strand of interest (0=forward, 1=backward)
 |  
 |  get_length(self, vertex)
 |      Get the minimum length of a vertex.
 |      
 |      If it's a stem, then the result is its length (in base pairs).
 |      
 |      If it's a bulge, then the length is the smaller of it's dimensions.
 |      
 |      :param vertex: The name of the vertex.
 |  
 |  get_link_direction(self, stem1, stem2, bulge=None)
 |      Get the direction in which stem1 and stem2 are linked (by the bulge)
 |      
 |      :returns: 1 if the bulge connects stem1 with stem2 in forward direction (5' to 3')
 |                -1 otherwise
 |  
 |  get_mst(self)
 |      Create a minimum spanning tree from this BulgeGraph. This is useful
 |      for constructing a structure where each section of a multiloop is
 |      sampled independently and we want to introduce a break at the largest
 |      multiloop section.
 |  
 |  get_multiloop_nucleotides(self, multiloop_loop)
 |      Return a list of nucleotides which make up a particular
 |      multiloop.
 |      
 |      :param multiloop_loop: The elements which make up this multiloop
 |      :return: A list of nucleotides
 |  
 |  get_multiloop_side(self, m)
 |      Find out which strand a multiloop is on. An example of a situation in
 |      which the loop can be on both sides can be seen in the three-stemmed
 |      structure below:
 |      
 |          (.().().)
 |      
 |      In this case, the first multiloop section comes off of the 5' strand of
 |      the first stem (the prior stem is always the one with a lower numbered
 |      first residue). The second multiloop section comess of the 3' strand of
 |      the second stem and the third loop comes off the 3' strand of the third
 |      stem.
 |  
 |  get_next_ml_segment(self, ml_segment)
 |      Get the adjacent multiloop-segment (or 3' loop) next to the 3' side of ml_segment.
 |      
 |      If there is no other single stranded RNA after the stem, the backbone must end there.
 |      In that case return None.
 |  
 |  get_node_dimensions(self, node, with_missing=False)
 |      Return the dimensions of a node.
 |      
 |      If the node is a stem, then the dimensions will be l where l is
 |      the length of the stem.
 |      
 |      Otherwise, see get_bulge_dimensions(node)
 |      
 |      :param node: The name of the node
 |      :return: A pair containing its dimensions
 |  
 |  get_node_from_residue_num(self, base_num)
 |      USE get_elem instead.
 |  
 |  get_position_in_element(self, resnum)
 |      Return the position of the residue in the cg-element and the length of the element.
 |      
 |      :param resnum: An integer. The 1-based position in the total sequence.
 |      :returns: A tuple (p,l) where p is the position of the residue in the cg-element
 |                (0-based for stems, 1-based for loops) and p/l gives a measure for the position
 |                of the residue along the cg-element's axis (0 means at cg.coords[elem][0],
 |                1 at cg.coords[elem][1] and 0.5 exactly in the middle of these two).
 |  
 |  get_resseqs(self, define, seq_ids=True)
 |      Return the pdb ids of the nucleotides in this define.
 |      
 |      :param define: The name of this element.
 |      :return: A tuple of two arrays containing the residue ids
 |              on each strand
 |  
 |  get_side_nucleotides(self, stem, side)
 |      Get the nucleotide numbers on the given side of
 |      the stem. Side 0 corresponds to the 5' end of the
 |      stem, whereas side 1 corresponds to the 3' side
 |      of the stem.
 |      
 |      :param stem: The name of the stem
 |      :param side: Either 0 or 1, indicating the 5' or 3' end of the stem
 |      :return: A tuple of the nucleotide numbers on the given side of
 |               the stem.
 |  
 |  get_sides(self, s1, b)
 |      Get the side of s1 that is next to b.
 |      
 |      s1e -> s1b -> b
 |      
 |      :param s1: The stem.
 |      :param b: The bulge.
 |      :return: A tuple indicating which side is the one next to the bulge
 |               and which is away from the bulge.
 |  
 |  get_stem_edge(self, stem, pos)
 |      Returns the side (strand) of the stem that position is on.
 |      
 |      Side 0 corresponds to the 5' pairing residues in the
 |      stem, whereas side 1 corresponds to the 3' pairing
 |      residues in the stem.
 |      :param stem: The name of the stem
 |      :param pos: A position in the stem
 |      :return: 0 if pos is on the 5' edge of the stem, 1 if it is on the 3' edge
 |  
 |  get_strand(self, multiloop)
 |      Get the strand on which this multiloop is located.
 |      
 |      :param multiloop: The name of the multiloop
 |      :return: 0 for being on the lower numbered strand and 1 for
 |               being on the higher numbered strand.
 |  
 |  has_connection(self, v1, v2)
 |      Is there an edge between these two nodes?
 |  
 |  hloop_iterator(self)
 |      Iterator over all of the hairpins in the structure.
 |  
 |  iloop_iterator(self)
 |      Iterator over all of the interior loops in the structure.
 |  
 |  is_loop_pseudoknot(self, loop)
 |      Is a particular loop a pseudoknot?
 |      
 |      :param loop: A list of elements that are part of the loop (only m,f and t elements).
 |      
 |      :return: Either True or False
 |  
 |  is_single_stranded(self, node)
 |      Does this node represent a single-stranded region?
 |      
 |      Single stranded regions are five-prime and three-prime unpaired
 |      regions, multiloops, and hairpins
 |      
 |      .. warning::
 |          Interior loops are never considered single stranded by this function.
 |      
 |      :param node: The name of the node
 |      :return: True if yes, False if no
 |  
 |  iter_elements_along_backbone(self, startpos=1)
 |      Iterate all coarse grained elements along the backbone.
 |      
 |      Note that stems are yielded twice (for forward and backward strand).
 |      Interior loops may be yielded twice or once (if one side has no nucleotide)
 |      
 |      0-length multiloop-segments are correctly yielded.
 |      
 |      :param startpos: The nucleotide position at which to start
 |      :yields: Coarse grained element names, like "s0", "i0"
 |  
 |  iterate_over_seqid_range(self, start_id, end_id)
 |      Iterate over the seq_ids between the start_id and end_id.
 |  
 |  length_one_stem_basepairs(self)
 |      Return a list of basepairs that correspond to length-1 stems.
 |  
 |  log(self, level=10)
 |  
 |  min_max_bp_distance(self, e1, e2)
 |      Get the minimum and maximum base pair distance between
 |      these two elements.
 |      
 |      If they are connected, the minimum distance will be 1.
 |      The maximum will be 1 + length(e1) + length(e2)
 |      
 |      :param e1: The name of the first element
 |      :param e2: The name of the second element
 |      :return:   A tuple containing the minimum and maximum distance between
 |                 the two elements.
 |  
 |  mloop_iterator(self)
 |      Iterator over all of the multiloops in the structure.
 |  
 |  nucleotides_to_elements(self, nucleotides)
 |      Convert a list of nucleotides (nucleotide numbers) to element names.
 |      
 |      Remove redundant entries and return a set.
 |      
 |      .. note::
 |          Use `self.get_node_from_residue_num` if you have only a single nucleotide number.
 |  
 |  pairing_partner(self, nucleotide_number)
 |      Return the base pairing partner of the nucleotide at position
 |      nucleotide_number. If this nucleotide is unpaired, return None.
 |      
 |      :param nucleotide_number: The position of the query nucleotide in the
 |                                sequence or a RESID instance.
 |      :return: The number of the nucleotide base paired with the one at
 |               position nucleotide_number.
 |  
 |  pseudoknotted_basepairs(self, ignore_basepairs=())
 |      Return a list of base-pairs that will be removed to
 |      remove pseudoknots using the knotted2nested.py script.
 |      
 |      :param ignore_basepairs: An optional list of basepairs that
 |                               knotted2nested will not consider present
 |                               in the structure.
 |      :return: A list of base-pairs that can be removed.
 |  
 |  random_subgraph(self, subgraph_length=None)
 |      Return a random subgraph of this graph.
 |      
 |      :return: A list containing the nodes comprising a random subgraph
 |  
 |  seq_id_to_pos(self, seq_id)
 |      Convert a pdb seq_id to a 1-based nucleotide position
 |      
 |      :param seq_id: An instance of RESID
 |  
 |  set_angle_types(self)
 |      Fill in the angle types based on the build order
 |  
 |  shortest_bg_loop(self, vertex)
 |      Find a shortest loop containing this node. The vertex should
 |      be a multiloop.
 |      
 |      :param vertex: The name of the vertex to find the loop.
 |      :return: A list containing the elements in the shortest cycle.
 |  
 |  shortest_mlonly_multiloop(self, ml_segment)
 |  
 |  shortest_path(self, e1, e2)
 |      Determine the shortest path between two elements (e1, e2)
 |      along the secondary structure.
 |      
 |      :param e1: The name of the first element
 |      :param e2: The name of the second element
 |      :return: A list of the element names along the shortest path
 |  
 |  sorted_element_iterator(self)
 |      Iterate over a list of the coarse grained elements sorted by the lowest numbered
 |      nucleotide in each stem. Multiloops with no nucleotide coordinates come last.
 |  
 |  sorted_stem_iterator(self)
 |      Iterate over a list of the stems sorted by the lowest numbered
 |      nucleotide in each stem.
 |  
 |  ss_distance(self, e1, e2)
 |      Calculate the distance between two elements (e1, e2)
 |      along the secondary structure. The distance only starts
 |      at the edge of each element, and is the closest distance
 |      between the two elements.
 |      
 |      :param e1: The name or nucleotide number of the first element
 |      :param e2: The name or nucleotide number of the second element
 |      :return: The integer distance between the two elements / residues along the secondary
 |               structure. (If an element is given, we use its corner for the distance, otherwise the exact nucleotide.)
 |  
 |  stem_bp_iterator(self, stem, seq_ids=False)
 |      Iterate over all the base pairs in the stem.
 |  
 |  stem_iterator(self)
 |      Iterator over all of the stems in the structure.
 |  
 |  stem_length(self, key)
 |      Get the length of a particular element. If it's a stem, it's equal to
 |      the number of paired bases. If it's an interior loop, it's equal to the
 |      number of unpaired bases on the strand with fewer unpaired bases. If
 |      it's a multiloop, then it's the number of unpaired bases.
 |  
 |  stem_resn_to_stem_vres_side(self, stem, res)
 |  
 |  stem_side_vres_to_resn(self, stem, side, vres)
 |      Return the residue number given the stem name, the strand (side) it's on
 |      and the virtual residue number.
 |  
 |  tloop_iterator(self)
 |      Yield the name of the 3' unpaired region if it is
 |      present in the structure.
 |  
 |  to_bg_string(self)
 |      Output a string representation that can be stored and reloaded.
 |  
 |  to_bpseq_string(self)
 |      Create a bpseq string from this structure.
 |  
 |  to_dotbracket_string(self, include_missing=False)
 |      Convert the BulgeGraph representation to a dot-bracket string
 |      and return it.
 |      
 |      :return: A dot-bracket representation of this BulgeGraph
 |  
 |  to_element_string(self, with_numbers=False)
 |      Create a string similar to dotbracket notation that identifies what
 |      type of element is present at each location.
 |      
 |      For example the following dotbracket:
 |      
 |      ..((..))..
 |      
 |      Should yield the following element string:
 |      
 |      ffsshhsstt
 |      
 |      Indicating that it begins with a fiveprime region, continues with a
 |      stem, has a hairpin after the stem, the stem continues and it is terminated
 |      by a threeprime region.
 |      
 |      :param with_numbers: show the last digit of the element id in a second line.::
 |      
 |                               (((.(((...))))))
 |      
 |                           Could result in::
 |      
 |                               sssissshhhssssss
 |                               0000111000111000
 |      
 |                           Indicating that the first stem is named 's0', followed by 'i0',
 |                           's1', 'h0', the second strand of 's1' and the second strand of 's0'
 |  
 |  to_fasta_string(self, include_missing=False)
 |      Output the BulgeGraph representation as a FASTA string of the
 |      format::
 |      
 |          >id
 |          AACCCAA
 |          ((...))
 |      
 |      :param include_missing: Whether or not residues for which no structure
 |                              information is present should be included in the output.
 |  
 |  to_neato_string(self)
 |  
 |  to_networkx(self)
 |      Convert this graph to a networkx representation. This representation
 |      will contain all of the nucleotides as nodes and all of the base pairs
 |      as edges as well as the adjacent nucleotides.
 |  
 |  to_pair_table(self)
 |      Create a pair table from the list of elements.
 |      
 |      The first element in the returned list indicates the number of
 |      nucleotides in the structure.
 |      
 |      i.e. [5,5,4,0,2,1]
 |  
 |  to_pair_tuples(self, remove_basepairs=None)
 |      Create a list of tuples corresponding to all of the base pairs in the
 |      structure. Unpaired bases will be shown as being paired with a
 |      nucleotide numbered 0.
 |      
 |      i.e. [(1,5),(2,4),(3,0),(4,2),(5,1)]
 |      
 |      :param remove_basepairs: A list of 2-tuples containing
 |                               basepairs that should be removed
 |  
 |  traverse_graph(self)
 |      Traverse the graph to get the angle types. The angle type depends on
 |      which corners of the stem are connected by the multiloop or internal
 |      loop.
 |      
 |      :returns: A list of triples (stem, loop, stem)
 |  
 |  ----------------------------------------------------------------------
 |  Class methods inherited from forgi.graph.bulge_graph.BulgeGraph:
 |  
 |  from_bg_file(bg_file) from builtins.type
 |      Load a BulgeGraph from a file containing a text-based representation.
 |      
 |      :param bg_file: The filename.
 |      :return: A BulgeGraph.
 |  
 |  from_bpseq_str(bpseq_str, breakpoints=(), name=None, dissolve_length_one_stems=False, remove_pseudoknots=False) from builtins.type
 |      Create the graph from a string listing the base pairs.
 |      
 |      The string should be formatted like so:
 |      
 |          1 G 115
 |          2 A 0
 |          3 A 0
 |          4 U 0
 |          5 U 112
 |          6 G 111
 |      
 |      :param bpseq_str: The string, containing newline characters.
 |      :param breakpoints: A list of positions, after which there is a backbone break.
 |      :return: A new BulgeGraph object.
 |  
 |  from_ct_string(ct_string, dissolve_length_one_stems=False, remove_pseudoknots=False) from builtins.type
 |      Create the graph from a string holding a connectivity table.
 |      See http://x3dna.org/highlights/dssr-derived-secondary-structure-in-ct-format
 |  
 |  from_dotbracket(dotbracket_str, seq=None, name=None, dissolve_length_one_stems=False, remove_pseudoknots=False) from builtins.type
 |      Create a BulgeGraph object from a dotbracket string.
 |      
 |      :param dotbracket_str: A string
 |      :param seq: A string, with the same length as the dotbracket string,
 |                  a forgi.graph.sequence.Sequence instance or None.
 |                  If it is None, the sequence will be all 'N's
 |      :param name: Optional string to use as molecule name.
 |  
 |  from_fasta(filename, dissolve_length_one_stems=False) from builtins.type
 |      Return a list of BulgeGraphs from a fasta file.
 |  
 |  from_fasta_text(fasta_text, dissolve_length_one_stems=False, remove_pseudoknots=False) from builtins.type
 |      Create one or more Bulge Graphs from some fasta text.
 |      
 |      :returns: A list of BulgeGraphs
 |  
 |  ----------------------------------------------------------------------
 |  Readonly properties inherited from forgi.graph.bulge_graph.BulgeGraph:
 |  
 |  backbone_breaks_after
 |  
 |  junctions
 |      Get all regular multiloops of this structure.
 |      
 |      :return: A list of tuples of multiloop segments.
 |               Each tuple contains the segments of one regular
 |               (i.e. not pseudoknotted) multiloop.
 |  
 |  rods
 |  
 |  seq
 |  
 |  seq_length
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from forgi.graph._basegraph.BaseGraph:
 |  
 |  connections(self, bulge)
 |      :param g: Graph-like: A BulgeGraph or BulgeGraphConstruction.
 |  
 |  define_range_iterator(self, node, adjacent=False)
 |      Return the ranges of the nucleotides in the define.
 |      
 |      In other words, if a define contains the following: [1,2,7,8]
 |      The ranges will be [1,2] and [7,8].
 |      
 |      :param adjacent: Use the nucleotides in the neighboring element which
 |                       connect to this element as the range starts and ends.
 |      :return: A list of two-element lists
 |  
 |  flanking_nucleotides(self, d)
 |      Return the nucleotides directly flanking an element.
 |      
 |      :param d: the name of the element
 |      :return: a list of nucleotides
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from forgi.graph._basegraph.BaseGraph:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
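
The pair table and pair tuple formats mentioned in the help text for to_pair_table and to_pair_tuples can be illustrated without forgi. The sketch below is plain Python and is not the library's implementation; the function names dotbracket_to_pair_table and pair_table_to_tuples are hypothetical helpers introduced only for this example. It assumes a dot-bracket string without pseudoknots and reproduces the two example lists from the docstrings for the structure "((.))".

```python
def dotbracket_to_pair_table(dotbracket):
    """Build a pair table from a pseudoknot-free dot-bracket string.

    Index 0 holds the number of nucleotides; index i (1-based) holds the
    position of the pairing partner of nucleotide i, or 0 if it is unpaired.
    """
    table = [len(dotbracket)] + [0] * len(dotbracket)
    open_positions = []  # stack of positions of unmatched '('
    for pos, char in enumerate(dotbracket, start=1):
        if char == "(":
            open_positions.append(pos)
        elif char == ")":
            if not open_positions:
                raise ValueError("unbalanced ')' at position %d" % pos)
            partner = open_positions.pop()
            table[pos] = partner
            table[partner] = pos
        elif char != ".":
            raise ValueError("unexpected character %r" % char)
    if open_positions:
        raise ValueError("unbalanced '(' at position %d" % open_positions[-1])
    return table


def pair_table_to_tuples(table):
    """List (position, partner) for every nucleotide; partner 0 means unpaired."""
    return [(i, table[i]) for i in range(1, table[0] + 1)]


pair_table = dotbracket_to_pair_table("((.))")
print(pair_table)                        # [5, 5, 4, 0, 2, 1]
print(pair_table_to_tuples(pair_table))  # [(1, 5), (2, 4), (3, 0), (4, 2), (5, 1)]
```

Note that every base pair appears twice in the tuple list, once from each partner's point of view, which matches the example given for to_pair_tuples.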
