********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.3.0 (Release date: Sat Sep 26 01:51:56 PDT 2009)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= res_pr5/meme.fasta
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
purT                     1.0000    100  purL                     1.0000    100
purA                     1.0000    100  guaB                     1.0000    100
purF                     1.0000    100  folD                     1.0000    100
purD                     1.0000    100  guaA                     1.0000    100
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme res_pr5/meme.fasta -mod zoops -nmotifs 3 -prior dirichlet -revcomp -nostatus -dna -oc res_pr5/

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=            8    maxw=           50    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=             800    N=               8
strands: + -
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.354 C 0.146 G 0.146 T 0.354
Background letter frequencies (from dataset with add-one prior applied):
A 0.353 C 0.147 G 0.147 T 0.353
********************************************************************************

********************************************************************************
MOTIF 1 width = 19 sites = 7 llr = 93 E-value = 4.3e+002
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  96:1:7994696a31:734
pos.-specific     C  ::3::::::1:3::113::
probability       G  :439:1:1::::::19:76
matrix            T  1:4:a11:6311:76::::

         bits    2.8
                 2.5
                 2.2                *
                 1.9    *           *
Relative         1.7    *           * *
Entropy          1.4    **       *  * *
(19.2 bits)      1.1  * **  *    *  ****
                 0.8 ** ** **  * *  ****
                 0.6 ********* **** ****
                 0.3 *******************
                 0.0 -------------------

Multilevel           AATGTAAATAAAATTGAGG
consensus             GC     AT C A  CAA
sequence               G
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value                    Site
-------------            ------  ----- ---------            -------------------
guaB                          -     46  4.95e-09 GAAAAATAAA AGGGTAAAAAATATTGCGG TCGCATTATA
purA                          +      9  1.39e-08   CAAATAAA AACGTAAATTACATTGAGA TTAATGTAAG
folD                          -     22  1.03e-07 AACGCCTGTA AATGTAAGTCAAATTGAGA TAAATTTAAG
purT                          +     10  1.39e-07  CGTGGTGTG AGTGTGAATAAAAATGAAG ATTTATTACT
purL                          +      3  1.65e-06         AA AGTGTAAATATCAACGCAA ACGTTTTCGT
purD                          -     57  1.99e-06 AAACGATTTT TAGGTATAAAAAATGCAGG TGGCTGATTG
purF                          +     72  2.82e-06 TTATAAATTC AACATTAAATAAATAGAGG ATAAACACGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
guaB                                5e-09  45_[-1]_36
purA                              1.4e-08  8_[+1]_73
folD                                1e-07  21_[-1]_60
purT                              1.4e-07  9_[+1]_72
purL                              1.6e-06  2_[+1]_79
purD                                2e-06  56_[-1]_25
purF                              2.8e-06  71_[+1]_10
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=19 seqs=7
guaB                     (   46) AGGGTAAAAAATATTGCGG  1
purA                     (    9) AACGTAAATTACATTGAGA  1
folD                     (   22) AATGTAAGTCAAATTGAGA  1
purT                     (   10) AGTGTGAATAAAAATGAAG  1
purL                     (    3) AGTGTAAATATCAACGCAA  1
purD                     (   57) TAGGTATAAAAAATGCAGG  1
purF                     (   72) AACATTAAATAAATAGAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 656 bayes= 6.37682 E= 4.3e+002
   128   -945   -945   -130
    69   -945    154   -945
  -945     96     96     28
  -130   -945    254   -945
  -945   -945   -945    150
   101   -945     -4   -130
   128   -945   -945   -130
   128   -945     -4   -945
    28   -945   -945     69
    69     -4   -945    -31
   128   -945   -945   -130
    69     96   -945   -130
   150   -945   -945   -945
   -31   -945   -945    101
  -130     -4     -4     69
  -945     -4    254   -945
   101     96   -945   -945
   -31   -945    228   -945
    28   -945    196   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 7 E= 4.3e+002
 0.857143  0.000000  0.000000  0.142857
 0.571429  0.000000  0.428571  0.000000
 0.000000  0.285714  0.285714  0.428571
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.714286  0.000000  0.142857  0.142857
 0.857143  0.000000  0.000000  0.142857
 0.857143  0.000000  0.142857  0.000000
 0.428571  0.000000  0.000000  0.571429
 0.571429  0.142857  0.000000  0.285714
 0.857143  0.000000  0.000000  0.142857
 0.571429  0.285714  0.000000  0.142857
 1.000000  0.000000  0.000000  0.000000
 0.285714  0.000000  0.000000  0.714286
 0.142857  0.142857  0.142857  0.571429
 0.000000  0.142857  0.857143  0.000000
 0.714286  0.285714  0.000000  0.000000
 0.285714  0.000000  0.714286  0.000000
 0.428571  0.000000  0.571429  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
A[AG][TCG]GTAAA[TA][AT]A[AC]A[TA]TG[AC][GA][GA]
--------------------------------------------------------------------------------

Time 0.39 secs.

********************************************************************************

********************************************************************************
MOTIF 2 width = 8 sites = 3 llr = 33 E-value = 3.2e+003
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::::a
pos.-specific     C  a:7a::::
probability       G  :a3::3::
matrix            T  ::::a7a:

         bits    2.8 ** *
                 2.5 ** *
                 2.2 ** *
                 1.9 ****
Relative         1.7 ****
Entropy          1.4 ***** **
(15.7 bits)      1.1 ********
                 0.8 ********
                 0.6 ********
                 0.3 ********
                 0.0 --------

Multilevel           CGCCTTTA
consensus              G  G
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value              Site
-------------            ------  ----- ---------            --------
folD                          -     41  3.02e-06 TCGCAAAAAA CGCCTGTA AATGTAAGTC
purA                          -     84  1.03e-05  AGTATTTTC CGCCTTTA AGCTTTGTAT
purF                          +     12  2.05e-05 TTCCACATTT CGGCTTTA TTGTTGAGTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
folD                                3e-06  40_[-2]_52
purA                                1e-05  83_[-2]_9
purF                              2.1e-05  11_[+2]_81
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=8 seqs=3
folD                     (   41) CGCCTGTA  1
purA                     (   84) CGCCTTTA  1
purF                     (   12) CGGCTTTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 744 bayes= 7.60577 E= 3.2e+003
  -823    276   -823   -823
  -823   -823    276   -823
  -823    218    118   -823
  -823    276   -823   -823
  -823   -823   -823    150
  -823   -823    118     91
  -823   -823   -823    150
   150   -823   -823   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 3 E= 3.2e+003
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.333333  0.666667
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
CG[CG]CT[TG]TA
--------------------------------------------------------------------------------

Time 0.63 secs.

********************************************************************************

********************************************************************************
MOTIF 3 width = 8 sites = 2 llr = 24 E-value = 3.8e+003
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  :5::::::
pos.-specific     C  55aaa:::
probability       G  5::::a::
matrix            T  ::::::aa

         bits    2.8   ****
                 2.5   ****
                 2.2   ****
                 1.9   ****
Relative         1.7 * ****
Entropy          1.4 * ******
(17.0 bits)      1.1 ********
                 0.8 ********
                 0.6 ********
                 0.3 ********
                 0.0 --------

Multilevel           CACCCGTT
consensus            GC
sequence
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value              Site
-------------            ------  ----- ---------            --------
purL                          +     56  2.51e-06 CTTTATAATA GCCCCGTT TTTCGTTTTA
purD                          -     14  8.55e-06 TAACCTGAAT CACCCGTT AAAATCAATG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
purL                              2.5e-06  55_[+3]_37
purD                              8.6e-06  13_[-3]_79
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=8 seqs=2
purL                     (   56) GCCCCGTT  1
purD                     (   14) CACCCGTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 744 bayes= 8.53528 E= 3.8e+003
  -765    176    176   -765
    50    176   -765   -765
  -765    276   -765   -765
  -765    276   -765   -765
  -765    276   -765   -765
  -765   -765    276   -765
  -765   -765   -765    150
  -765   -765   -765    150
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 2 E= 3.8e+003
 0.000000  0.500000  0.500000  0.000000
 0.500000  0.500000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
[CG][AC]CCCGTT
--------------------------------------------------------------------------------

Time 0.87 secs.

********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
purT                             9.84e-04  9_[+1(1.39e-07)]_72
purL                             1.21e-05  2_[+1(1.65e-06)]_34_[+3(2.51e-06)]_37
purA                             3.12e-07  8_[+1(1.39e-08)]_56_[-2(1.03e-05)]_9
guaB                             2.49e-05  45_[-1(4.95e-09)]_36
purF                             9.63e-05  11_[+2(2.05e-05)]_52_[+1(2.82e-06)]_10
folD                             9.38e-07  21_[-1(1.03e-07)]_[-2(3.02e-06)]_52
purD                             1.89e-05  13_[-3(8.55e-06)]_35_[-1(1.99e-06)]_25
guaA                             4.19e-02  100
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: kodomo.fbb.msu.ru

********************************************************************************