Task 1. Predicting the secondary structure of a given tRNA (PDB ID 1gts): GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAUUCCGAGGUUCGAAUCCUCGUACCCCAGCCA
einverted
In [82]:
! echo "GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAUUCCGAGGUUCGAAUCCUCGUACCCCAGCCA" > rna.seq
! einverted -sequence rna.seq -gap 12 -threshold 2 -match 3 -mismatch -3 -outfile outfile -outseq seqout
Find inverted repeats in nucleotide sequences
After trying more than 50 different einverted parameter combinations, we obtained a single stem: positions 1-6 paired with 64-69 (the acceptor stem).
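The stem reported by einverted can be cross-checked directly against the sequence. The following is a minimal sketch and not part of the original notebook: the helper `stem_pairs` and the `CANONICAL` set are names introduced here for illustration, and treating the G-U wobble as canonical is an assumption.

```python
seq = "GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAUUCCGAGGUUCGAAUCCUCGUACCCCAGCCA"

# Assumption: "canonical" means Watson-Crick pairs plus the G-U wobble pair.
CANONICAL = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def stem_pairs(seq, start5, end5, start3, end3):
    """Pair the 5' strand start5..end5 with the 3' strand start3..end3.

    Positions are 1-based and inclusive; the strands are paired antiparallel,
    so start5 pairs with end3, start5 + 1 with end3 - 1, and so on.
    """
    five = range(start5, end5 + 1)
    three = range(end3, start3 - 1, -1)
    return [(i, j, seq[i - 1], seq[j - 1]) for i, j in zip(five, three)]


# Acceptor stem as reported by einverted: 1-6 and 64-69
acceptor = stem_pairs(seq, 1, 6, 64, 69)
for i, j, a, b in acceptor:
    status = "canonical" if (a, b) in CANONICAL else "mismatch"
    print(i, j, a + "-" + b, status)
```

For this sequence all six positions form Watson-Crick pairs (four G-C pairs followed by U-A and A-U), which is consistent with the count of 6 canonical pairs given for einverted in the summary table.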
ViennaRNA
In [1]:
import RNA
seq = "GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAUUCCGAGGUUCGAAUCCUCGUACCCCAGCCA"
# create fold_compound data structure (required for all subsequently applied algorithms)
fc = RNA.fold_compound(seq)
# compute MFE and MFE structure
(mfe_struct, mfe) = fc.mfe()
# rescale Boltzmann factors for partition function computation
fc.exp_params_rescale(mfe)
# compute partition function
(pp, pf) = fc.pf()
# compute MEA structure
(MEA_struct, MEA) = fc.MEA()
# compute free energy of MEA structure
MEA_en = fc.eval_structure(MEA_struct)
# print everything like RNAfold -p --MEA
print("%s\n%s (%6.2f)" % (seq, mfe_struct, mfe))
print("%s [%6.2f]" % (pp, pf))
print("%s {%6.2f MEA=%.2f}" % (MEA_struct, MEA_en, MEA))
print(" frequency of mfe structure in ensemble %g; ensemble diversity %-6.2f" % (fc.pr_structure(mfe_struct), fc.mean_bp_distance()))
GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAUUCCGAGGUUCGAAUCCUCGUACCCCAGCCA
((((((..(((.........))).(((((.......))))).....(((((.......)))))))))))..... (-27.30)
(((((({,(({..,,,,...}}}.(((((.......))))).....|((((.......)))))))))))..... [-28.32]
((((((..(((.........))).(((((.......))))).....(((((.......)))))))))))..... {-27.30 MEA=59.93}
frequency of mfe structure in ensemble 0.190896; ensemble diversity 14.06
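The reported frequency of the MFE structure can be related to the two energies printed above: the MFE in parentheses and the ensemble free energy in square brackets. A rough sketch of that relation, assuming energies in kcal/mol and a folding temperature of 37 °C (the values of `R` and `T` below are assumptions and are not taken from the notebook):

```python
import math

mfe = -27.30        # kcal/mol, MFE printed in parentheses
ensemble = -28.32   # kcal/mol, ensemble free energy printed in square brackets

R = 1.98717e-3      # gas constant in kcal/(mol*K) (assumed value)
T = 273.15 + 37.0   # assumed folding temperature in kelvin

# Boltzmann weight of the MFE structure divided by the partition function:
# exp(-mfe / RT) / exp(-ensemble / RT) = exp((ensemble - mfe) / RT)
freq = math.exp((ensemble - mfe) / (R * T))
print(freq)
```

The result is close to the printed 0.190896; the small difference comes from using energies rounded to two decimals.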
In [2]:
RNA.svg_rna_plot(seq, MEA_struct, ssfile='ggg.svg' )
Out[2]:
1
In [3]:
from IPython.display import SVG
SVG('ggg.svg')
Out[3]:
In [4]:
# secondary structure in dot-bracket notation
#GGGGUAUCGCCAAGCGGUAAGGCACCGGAUUCUGAUUCCGGCAUUCCGAGGUUCGAAUCCUCGUACCCCAGCCA
#((((((..(((.........))).(((((.......))))).....(((((.......))))))))))).....
seq="((((((..(((.........))).(((((.......))))).....(((((.......)))))))))))....."
seq_s=""   # brackets only, with the dots removed
seq_n=[]   # 0-based positions of those brackets in the full string
for i in range(len(seq)):
    if seq[i]=="(" or seq[i]==")":
        seq_s+=seq[i]
        seq_n.append(i)
# repeatedly remove the leftmost adjacent "()" and print its 1-based positions
while len(seq_s)!=2:
    for i in range(len(seq_s)):
        if seq_s[i]=="(" and seq_s[i+1]==")":
            print(seq_n[i]+1, seq_n[i+1]+1)
            seq_n.pop(i)
            seq_n.pop(i)
            seq_s=seq_s[:i]+seq_s[i+2:]
            break
# print the last remaining pair
print(seq_n[0]+1, seq_n[1]+1)
11 21
10 22
9 23
29 37
28 38
27 39
26 40
25 41
51 59
50 60
49 61
48 62
47 63
6 64
5 65
4 66
3 67
2 68
1 69
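The same pair list can also be obtained in a single pass with a stack. This is an alternative sketch, not the method used in the cell above; the function name `dot_bracket_pairs` is introduced here for illustration. Pairs are emitted in the order in which closing brackets are encountered, which is the same order as the output above.

```python
def dot_bracket_pairs(structure):
    """Return 1-based (i, j) base pairs encoded in a dot-bracket string."""
    stack = []   # positions of opening brackets that are not yet closed
    pairs = []
    for pos, ch in enumerate(structure, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.append((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


structure = "((((((..(((.........))).(((((.......))))).....(((((.......)))))))))))....."
pairs = dot_bracket_pairs(structure)
for i, j in pairs:
    print(i, j)
```

Unlike the repeated string slicing, this variant visits each character once and reports unbalanced brackets explicitly instead of failing with an index error.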
The table below summarizes the structures of 1gts predicted by the different programs.
| Structural element | find_pair | einverted | Zuker algorithm |
|---|---|---|---|
| Acceptor stem | 1-6 and 65-74 | 1-6 and 64-69 | 1-6 and 64-69 |
| D stem | 9-24 | - | 9-23 |
| T stem | 48-64 | - | 47-63 |
| Anticodon stem | 25-43 | - | 25-41 |
| Total number of canonical base pairs | 28 | 6 | 19 |
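The Zuker-algorithm entries of the table can be derived from the dot-bracket string by grouping stacked pairs into stems. The following is a sketch under that assumption; `dot_bracket_pairs` and `group_stems` are illustrative names, not functions from ViennaRNA or from the notebook.

```python
def dot_bracket_pairs(structure):
    """Return 1-based (i, j) base pairs encoded in a dot-bracket string."""
    stack, pairs = [], []
    for pos, ch in enumerate(structure, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.append((stack.pop(), pos))
    return pairs


def group_stems(pairs):
    """Group directly stacked pairs (i, j), (i + 1, j - 1), ... into stems."""
    stems = []
    for i, j in sorted(pairs):
        if stems and stems[-1][-1] == (i - 1, j + 1):
            stems[-1].append((i, j))   # continues the current stem
        else:
            stems.append([(i, j)])     # starts a new stem
    return stems


structure = "((((((..(((.........))).(((((.......))))).....(((((.......)))))))))))....."
stems = group_stems(dot_bracket_pairs(structure))

# Each stem is summarized as (outer 5' position, outer 3' position, number of pairs)
summary = [(stem[0][0], stem[0][1], len(stem)) for stem in stems]
for first, last, n in summary:
    print("%d-%d: %d pairs" % (first, last, n))
print("total pairs:", sum(n for _, _, n in summary))
```

The outer positions of the four stems correspond to the table entries: 1-69 for the acceptor stem (listed there as the strands 1-6 and 64-69), 9-23 for the D stem, 25-41 for the anticodon stem, and 47-63 for the T stem, for a total of 19 pairs.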