********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.3.0 (Release date: Sat Sep 26 01:51:56 PDT 2009)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= meme_pr5/meme.fasta
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
purL                     1.0000    101  purT                     1.0000    101
purA                     1.0000    101  folD                     1.0000    101
guaA                     1.0000    101  purH                     1.0000    101
purR                     1.0000    101  purM                     1.0000    101
purC                     1.0000    101
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme meme_pr5/meme.fasta -mod zoops -nmotifs 3 -prior dirichlet -revcomp -nostatus -dna -oc meme_pr5/

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=            8    maxw=           50    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=             909    N=               9
strands: + -
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.279 C 0.221 G 0.221 T 0.279
Background letter frequencies (from dataset with add-one prior applied):
A 0.279 C 0.221 G 0.221 T 0.279
********************************************************************************


********************************************************************************
MOTIF  1        width =   10   sites =   9   llr = 76   E-value = 1.7e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1
Description
--------------------------------------------------------------------------------
Simplified        A  1:41a12:::
pos.-specific     C  ::26::4a19
probability       G  :312:92:91
matrix            T  9721::1:::

         bits    2.2        *
                 2.0        *
                 1.7     ** ***
                 1.5     ** ***
Relative         1.3 *   ** ***
Entropy          1.1 **  ** ***
(12.2 bits)      0.9 **  ** ***
                 0.7 **  ** ***
                 0.4 ** *** ***
                 0.2 **********
                 0.0 ----------

Multilevel           TTACAGCCGC
consensus             GCG  A
sequence               T   G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value                 Site
-------------            ------  ----- ---------            ----------
purL                         +     40  1.82e-06 TGAAGTGCTG TTCCAGCCGC TTCGAAGACG
purT                         +     11  5.28e-06 CCCTTTAACA TTACAGACGC AATCGTTTTC
folD                         -     27  1.69e-05 CGGGCCGCGC TGTCAGGCGC ATAATGACGA
purA                         +     44  3.31e-05 TAAAAAGTAC TGAAAGCCGC CGTATGAGAT
purC                         +     15  4.47e-05 TAACAGAGCC TTATAGGCGC ATATGAAAAA
purR                         +     69  5.68e-05 TACTGACCTG TTTCAGCCGG TCAGTTTAGG
purH                         -     75  1.18e-04 TCCCTTGGAT TTGGAGTCGC AGTTTTCTTC
guaA                         -     26  2.14e-04 AATCACTGAC TTCGAGACCC GTTTATTTGC
purM                         +     77  2.91e-04 AAGCTCAAAC AGACAACCGC CGTGGGGACC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
purL                              1.8e-06  39_[+1]_52
purT                              5.3e-06  10_[+1]_81
folD                              1.7e-05  26_[-1]_65
purA                              3.3e-05  43_[+1]_48
purC                              4.5e-05  14_[+1]_77
purR                              5.7e-05  68_[+1]_23
purH                              0.00012  74_[-1]_17
guaA                              0.00021  25_[-1]_66
purM                              0.00029  76_[+1]_15
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS
format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=10 seqs=9
purL                     (   40) TTCCAGCCGC  1
purT                     (   11) TTACAGACGC  1
folD                     (   27) TGTCAGGCGC  1
purA                     (   44) TGAAAGCCGC  1
purC                     (   15) TTATAGGCGC  1
purR                     (   69) TTTCAGCCGG  1
purH                     (   75) TTGGAGTCGC  1
guaA                     (   26) TTCGAGACCC  1
purM                     (   77) AGACAACCGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 10 n= 828 bayes= 6.50779 E= 1.7e+002
  -133   -982   -982    167
  -982   -982     59    125
    67      1    -99    -33
  -133    133      1   -133
   184   -982   -982   -982
  -133   -982    201   -982
   -33    101      1   -133
  -982    218   -982   -982
  -982    -99    201   -982
  -982    201    -99   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 10 nsites= 9 E= 1.7e+002
 0.111111  0.000000  0.000000  0.888889
 0.000000  0.000000  0.333333  0.666667
 0.444444  0.222222  0.111111  0.222222
 0.111111  0.555556  0.222222  0.111111
 1.000000  0.000000  0.000000  0.000000
 0.111111  0.000000  0.888889  0.000000
 0.222222  0.444444  0.222222  0.111111
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.111111  0.888889  0.000000
 0.000000  0.888889  0.111111  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
T[TG][ACT][CG]AG[CAG]CGC
--------------------------------------------------------------------------------




Time 0.53 secs.

********************************************************************************


********************************************************************************
MOTIF  2        width =   11   sites =   6   llr = 64   E-value = 4.2e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :2a2::7::::
pos.-specific     C  25:::::::7:
probability       G  :3:8:82:5:a
matrix            T  8:::a22a53:

         bits    2.2           *
                 2.0           *
                 1.7   * *  *  *
                 1.5   **** *  *
Relative         1.3 * **** *  *
Entropy          1.1 * **** ****
(15.4 bits)      0.9 * **** ****
                 0.7 ***********
                 0.4 ***********
                 0.2 ***********
                 0.0 -----------

Multilevel           TCAGTGATGCG
consensus             G      TT
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value                 Site
-------------            ------  ----- ---------            -----------
guaA                         +     37  4.46e-07 GGTCTCGAAG TCAGTGATTCG TGCTCGCACC
purM                         +      7  3.44e-06     AGCCAG TAAGTGATTCG GGTGATTGCG
purA                         +     22  4.93e-06 CAAAAATTGT TCAGTTATGCG TTAAAAAGTA
purC                         -     82  7.71e-06  CTTATTACT CCAGTGATGTG ACCATTTTAG
purR                         +     19  1.12e-05 TGTTTCTGTC TGAGTGGTTTG CGGGTAACGC
purL                         +     14  2.37e-05 GTTGCGCCAA TGAATGTTGCG CCCAATGAAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
guaA                              4.5e-07  36_[+2]_54
purM                              3.4e-06  6_[+2]_84
purA                              4.9e-06  21_[+2]_69
purC                              7.7e-06  81_[-2]_9
purR                              1.1e-05  18_[+2]_72
purL                              2.4e-05  13_[+2]_77
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=11 seqs=6
guaA                     (   37) TCAGTGATTCG  1
purM                     (    7) TAAGTGATTCG  1
purA                     (   22) TCAGTTATGCG  1
purC                     (   82) CCAGTGATGTG  1
purR                     (   19) TGAGTGGTTTG  1
purL                     (   14) TGAATGTTGCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 11 n= 819 bayes= 7.53244 E= 4.2e+003
  -923    -40   -923    158
   -74    118     59   -923
   184   -923   -923   -923
   -74   -923    192   -923
  -923   -923   -923    184
  -923   -923    192    -74
   125   -923    -40    -74
  -923   -923   -923    184
  -923   -923    118     84
  -923    159   -923     25
  -923   -923    218   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 11 nsites= 6 E= 4.2e+003
 0.000000  0.166667  0.000000  0.833333
 0.166667  0.500000  0.333333  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.000000  0.833333  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.833333  0.166667
 0.666667  0.000000  0.166667  0.166667
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.666667  0.000000  0.333333
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
T[CG]AGTGAT[GT][CT]G
--------------------------------------------------------------------------------




Time 0.89 secs.

********************************************************************************


********************************************************************************
MOTIF  3        width =   11   sites =   2   llr = 29   E-value = 1.3e+004
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :::5:::::::
pos.-specific     C  :a:5::5aa5a
probability       G  a:a:aa:::5:
matrix            T  ::::::5::::

         bits    2.2 *** ** ** *
                 2.0 *** ** ** *
                 1.7 *** ** ** *
                 1.5 *** ** ** *
Relative         1.3 *** ** ** *
Entropy          1.1 ***********
(20.6 bits)      0.9 ***********
                 0.7 ***********
                 0.4 ***********
                 0.2 ***********
                 0.0 -----------

Multilevel           GCGAGGCCCCC
consensus               C  T  G
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value                 Site
-------------            ------  ----- ---------            -----------
folD                         +     37  1.23e-07 GCGCCTGACA GCGCGGCCCGC TTCTGACAAA
purM                         -     90  6.29e-07          T GCGAGGTCCCC ACGGCGGTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
folD                              1.2e-07  36_[+3]_54
purM                              6.3e-07  89_[-3]_1
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=11 seqs=2
folD                     (   37) GCGCGGCCCGC  1
purM                     (   90) GCGAGGTCCCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 11 n= 819 bayes= 8.67419 E= 1.3e+004
  -765   -765    217   -765
  -765    217   -765   -765
  -765   -765    217   -765
    84    118   -765   -765
  -765   -765    217   -765
  -765   -765    217   -765
  -765    118   -765     84
  -765    217   -765   -765
  -765    217   -765   -765
  -765    118    118   -765
  -765    217   -765   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 11 nsites= 2 E= 1.3e+004
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.500000  0.500000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
GCG[AC]GG[CT]CC[CG]C
--------------------------------------------------------------------------------




Time 1.18 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
purL                             1.37e-05  13_[+2(2.37e-05)]_15_[+1(1.82e-06)]_52
purT                             2.75e-02  10_[+1(5.28e-06)]_81
purA                             4.59e-04  21_[+2(4.93e-06)]_11_[+1(3.31e-05)]_48
folD                             6.91e-06  26_[-1(1.69e-05)]_[+3(1.23e-07)]_54
guaA                             1.11e-04  36_[+2(4.46e-07)]_54
purH                             1.99e-01  101
purR                             2.20e-04  18_[+2(1.12e-05)]_39_[+1(5.68e-05)]_23
purM                             7.79e-07  6_[+2(3.44e-06)]_72_[-3(6.29e-07)]_1
purC                             8.17e-04  14_[+1(4.47e-05)]_57_[-2(7.71e-06)]_9
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: kodomo.fbb.msu.ru

********************************************************************************