********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.3.0 (Release date: Sat Sep 26 01:51:56 PDT 2009)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE=ecoli.fasta
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
guaB                     1.0000    100  guaA                     1.0000    100
purB                     1.0000    100  purR                     1.0000    100
purL                     1.0000    100  purF                     1.0000    100
purN                     1.0000    100  purA                     1.0000    100
purT                     1.0000    100  purU                     1.0000    100
purK                     1.0000    100  purE                     1.0000    100
folD                     1.0000    100  purD                     1.0000    100
purM                     1.0000    100  purH                     1.0000    100
purC                     1.0000    100
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motiv_ecoli_2/meme.fasta -mod zoops -nmotifs 3 -prior dirichlet -revcomp -nostatus -dna -oc motiv_ecoli_2/

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=            8    maxw=           50    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       17    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=            1700    N=              17
strands: + -
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.265 C 0.235 G 0.235 T 0.265
Background letter frequencies (from dataset with add-one prior applied):
A 0.265 C 0.235 G 0.235 T 0.265
********************************************************************************


********************************************************************************
MOTIF  1    width =   17   sites =   8   llr = 122   E-value = 4.8e-005
********************************************************************************
--------------------------------------------------------------------------------
    Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  81::aa6:1:1:4::::
pos.-specific     C  :6:9::1a:::::9311
probability       G  :1a1::::95:33:4::
matrix            T  31::::3::59841499

         bits    2.1   *    *
                 1.9   * ** *
                 1.7   * ** *
                 1.5   **** ** *  * **
Relative         1.3   **** ** *  * **
Entropy          1.0 * **** ***** * **
(22.0 bits)      0.8 * **** ***** * **
                 0.6 * ********** * **
                 0.4 *****************
                 0.2 *****************
                 0.0 -----------------

Multilevel           ACGCAAACGGTTACGTT
consensus            T     T  T GT T
sequence                         G C

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value                   Site
-------------            ------ -----  ---------            -----------------
purT                         +     48  2.55e-10 ATAAAGACAC ACGCAAACGTTTTCGTT TATACTGCGC
purM                         +     22  4.31e-09 TAAAGCAGTC TCGCAAACGTTTGCTTT CCCTGTTAGA
purL                         +      9  6.85e-09   TTATTTCC ACGCAAACGGTTTCGTC AGCGCATCAG
purE                         +     14  9.89e-09 TTTCACAGCC ACGCAACCGTTTTCCTT GCTCTCTTTC
purR                         +     41  1.19e-08 AGGTGTGTAA AGGCAAACGTTTACCTT GCGATTTTGC
guaB                         +     32  1.35e-07 AAAGGGGTAG ATGCAATCGGTTACGCT CTGTATAATG
purA                         +     79  1.88e-07 ATCCATTTTT AAGCAAACGGTGATTTT GAAAA
purN                         -     63  1.86e-06 CACGCGTTGT TCGGAATCAGAGGCTTT GATGATACCG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
purT                              2.6e-10  47_[+1]_36
purM                              4.3e-09  21_[+1]_62
purL                              6.9e-09  8_[+1]_75
purE                              9.9e-09  13_[+1]_70
purR                              1.2e-08  40_[+1]_43
guaB                              1.3e-07  31_[+1]_52
purA                              1.9e-07  78_[+1]_5
purN                              1.9e-06  62_[-1]_21
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=17 seqs=8
purT                     (   48) ACGCAAACGTTTTCGTT  1
purM                     (   22) TCGCAAACGTTTGCTTT  1
purL                     (    9) ACGCAAACGGTTTCGTC  1
purE                     (   14) ACGCAACCGTTTTCCTT  1
purR                     (   41) AGGCAAACGTTTACCTT  1
guaB                     (   32) ATGCAATCGGTTACGCT  1
purA                     (   79) AAGCAAACGGTGATTTT  1
purN                     (   63) TCGGAATCAGAGGCTTT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 17 n= 1428 bayes= 8.21189 E= 4.8e-005
   150   -965   -965     -8
  -108    141    -91   -108
  -965   -965    209   -965
  -965    190    -91   -965
   191   -965   -965   -965
   191   -965   -965   -965
   124    -91   -965     -8
  -965    209   -965   -965
  -108   -965    190   -965
  -965   -965    109     92
  -108   -965   -965    172
  -965   -965      9    150
    50   -965      9     50
  -965    190   -965   -108
  -965      9     67     50
  -965    -91   -965    172
  -965    -91   -965    172
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 17 nsites= 8 E= 4.8e-005
 0.750000  0.000000  0.000000  0.250000
 0.125000  0.625000  0.125000  0.125000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.875000  0.125000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.625000  0.125000  0.000000  0.250000
 0.000000  1.000000  0.000000  0.000000
 0.125000  0.000000  0.875000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.125000  0.000000  0.000000  0.875000
 0.000000  0.000000  0.250000  0.750000
 0.375000  0.000000  0.250000  0.375000
 0.000000  0.875000  0.000000  0.125000
 0.000000  0.250000  0.375000  0.375000
 0.000000  0.125000  0.000000  0.875000
 0.000000  0.125000  0.000000  0.875000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 1 regular expression
--------------------------------------------------------------------------------
[AT]CGCAA[AT]CG[GT]T[TG][ATG]C[GTC]TT
--------------------------------------------------------------------------------




Time  1.54 secs.

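The log-odds and letter-probability matrices reported for Motif 1 appear to be related by score = 100 * log2(p / background), rounded to an integer. The sketch below checks that reading against the first two matrix rows. It is an illustration under stated assumptions, not MEME's own code: the function name `log_odds_row` is hypothetical, the background values are the rounded frequencies printed in this file (so a score can differ by one unit from the reported one, as with 191 for p = 1.0 on A), and zero probabilities are returned as `None` because the pseudocount behind entries such as -965 is not stated here.

```python
import math

# Background frequencies as printed in this file (rounded to three decimals).
BACKGROUND = {"A": 0.265, "C": 0.235, "G": 0.235, "T": 0.265}


def log_odds_row(probs, background=BACKGROUND):
    """Convert one letter-probability row (A, C, G, T) to scaled log-odds.

    Assumes score = round(100 * log2(p / background)). Zero probabilities
    are returned as None rather than guessing the pseudocount MEME applies.
    """
    scores = []
    for letter, p in zip("ACGT", probs):
        if p == 0:
            scores.append(None)
        else:
            scores.append(round(100 * math.log2(p / background[letter])))
    return scores


# First two rows of the Motif 1 letter-probability matrix.
row1 = log_odds_row([0.750000, 0.000000, 0.000000, 0.250000])
row2 = log_odds_row([0.125000, 0.625000, 0.125000, 0.125000])
print(row1)  # compare with the reported row:  150 -965 -965   -8
print(row2)  # compare with the reported row: -108  141  -91 -108
```

The non-zero entries reproduce the reported scores for these two rows, which supports the assumed scaling.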
********************************************************************************


********************************************************************************
MOTIF  2    width =   19   sites =  16   llr = 157   E-value = 1.2e+001
********************************************************************************
--------------------------------------------------------------------------------
    Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  5:1:1843735:13:3242
pos.-specific     C  :43:9:26133::4::13:
probability       G  3419:11:15114:3111:
matrix            T  3351:1311:295476738

         bits    2.1
                 1.9
                 1.7    **
                 1.5    **      *
Relative         1.3    **      *      *
Entropy          1.0    ***     *  *   *
(14.2 bits)      0.8    ***     *  *   *
                 0.6    *** *** ** *** *
                 0.4 ** *** *** ****** *
                 0.2 *******************
                 0.0 -------------------

Multilevel           ACTGCAACAGATTCTTTAT
consensus            GGC   TA AC GTGA C
sequence             TT       C   A   T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value                    Site
-------------            ------ -----  ---------            -------------------
purL                         +     26  1.37e-09 CGGTTTCGTC AGCGCATCAGATTCTTTAT AATGACGCCC
purB                         +     19  9.56e-08 GCGGCGGACG TCTGCAACTGATGTTTTCT CGTAATCGCC
purR                         +      7  2.12e-06     CGGCGT ACCGCAACACTTTTGTTGT GCGTAAGGTG
purE                         +     75  4.94e-06 AGCCGAGAGT TGTGCACCACAGGAGTTTT AAGACGC
guaB                         -      4  4.94e-06 TCTACCCCTT TTTGCAAAAAATGCTTGCT ATC
purC                         +     54  6.64e-06 GGCACACCAG ACAGCAAAAGATTTTAAAA CGTTAATTCA
purK                         +     23  6.64e-06 ATGATAAAGA ACTGCACCAGCGTCTGAAT GACTGGCGCA
purU                         -     27  7.31e-06 ACCATTATTG GCCGCAGCACTTTTTAAAT TTTTTACCTG
folD                         +     56  9.66e-06 TGACAAAATA GGCGCATCCCCTTCGATCT ACGTAACAGA
purF                         +      3  1.16e-05         TT TTATCATCAGATGTTTTTT TGATTATCTG
purA                         +     29  1.50e-05 GTCATTTTTG AGTGCAAAAAGTGCTGTAA CTCTGAAAAA
purM                         +     49  2.67e-05 CCCTGTTAGA ATTGCGCCGAATTTTATTT TTCTACCGCA
guaA                         -     63  3.12e-05 TTCCGAGGCA AGTGAAACAGATAATATAA ATCGCCCGAC
purH                         +     28  4.85e-05 TGCCCCGTTA GGGGCGTTAGCTGAGTTTT TCGCGAAAAA
purT                         -     21  8.33e-05 GTGTGTCTTT ATTGCTGATGTTGATTTCT CAACCGAAAA
purD                         -     13  1.36e-04 CTCGTCGGCG GCGGCAATCACTTCGTCAT CACGGATAGA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
purL                              1.4e-09  25_[+2]_56
purB                              9.6e-08  18_[+2]_63
purR                              2.1e-06  6_[+2]_75
purE                              4.9e-06  74_[+2]_7
guaB                              4.9e-06  3_[-2]_78
purC                              6.6e-06  53_[+2]_28
purK                              6.6e-06  22_[+2]_59
purU                              7.3e-06  26_[-2]_55
folD                              9.7e-06  55_[+2]_26
purF                              1.2e-05  2_[+2]_79
purA                              1.5e-05  28_[+2]_53
purM                              2.7e-05  48_[+2]_33
guaA                              3.1e-05  62_[-2]_19
purH                              4.9e-05  27_[+2]_54
purT                              8.3e-05  20_[-2]_61
purD                              0.00014  12_[-2]_69
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=19 seqs=16
purL                     (   26) AGCGCATCAGATTCTTTAT  1
purB                     (   19) TCTGCAACTGATGTTTTCT  1
purR                     (    7) ACCGCAACACTTTTGTTGT  1
purE                     (   75) TGTGCACCACAGGAGTTTT  1
guaB                     (    4) TTTGCAAAAAATGCTTGCT  1
purC                     (   54) ACAGCAAAAGATTTTAAAA  1
purK                     (   23) ACTGCACCAGCGTCTGAAT  1
purU                     (   27) GCCGCAGCACTTTTTAAAT  1
folD                     (   56) GGCGCATCCCCTTCGATCT  1
purF                     (    3) TTATCATCAGATGTTTTTT  1
purA                     (   29) AGTGCAAAAAGTGCTGTAA  1
purM                     (   49) ATTGCGCCGAATTTTATTT  1
guaA                     (   63) AGTGAAACAGATAATATAA  1
purH                     (   28) GGGGCGTTAGCTGAGTTTT  1
purT                     (   21) ATTGCTGATGTTGATTTCT  1
purD                     (   13) GCGGCAATCACTTCGTCAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 19 n= 1394 bayes= 6.42836 E= 1.2e+001
    92  -1064      9     -8
 -1064     67     67     -8
  -108      9    -91     92
 -1064  -1064    200   -208
  -208    200  -1064  -1064
   162  -1064    -91   -208
    72    -33    -91     -8
    -8    141  -1064   -108
   138    -91   -191   -108
    -8      9    109  -1064
    92      9   -191    -50
 -1064  -1064    -91    172
  -208  -1064     90     92
    -8     67  -1064     50
 -1064  -1064     41    138
    24  -1064    -91    109
   -50   -191   -191    138
    72      9   -191     -8
   -50  -1064  -1064    162
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 19 nsites= 16 E= 1.2e+001
 0.500000  0.000000  0.250000  0.250000
 0.000000  0.375000  0.375000  0.250000
 0.125000  0.250000  0.125000  0.500000
 0.000000  0.000000  0.937500  0.062500
 0.062500  0.937500  0.000000  0.000000
 0.812500  0.000000  0.125000  0.062500
 0.437500  0.187500  0.125000  0.250000
 0.250000  0.625000  0.000000  0.125000
 0.687500  0.125000  0.062500  0.125000
 0.250000  0.250000  0.500000  0.000000
 0.500000  0.250000  0.062500  0.187500
 0.000000  0.000000  0.125000  0.875000
 0.062500  0.000000  0.437500  0.500000
 0.250000  0.375000  0.000000  0.375000
 0.000000  0.000000  0.312500  0.687500
 0.312500  0.000000  0.125000  0.562500
 0.187500  0.062500  0.062500  0.687500
 0.437500  0.250000  0.062500  0.250000
 0.187500  0.000000  0.000000  0.812500
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 2 regular expression
--------------------------------------------------------------------------------
[AGT][CGT][TC]GCA[AT][CA]A[GAC][AC]T[TG][CTA][TG][TA]T[ACT]T
--------------------------------------------------------------------------------




Time  2.81 secs.

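The regular expression printed for Motif 2 can be tried directly against the reported sites. The sketch below does that with Python's standard `re` module, as an illustration only. The site and flank strings are copied from the Motif 2 site table, while the variable names are hypothetical. It also shows that the expression is a simplification of the motif: the lowest-ranked site (purD) does not satisfy it even though MEME lists it as a site.

```python
import re

# Regular expression exactly as printed in the "Motif 2 regular expression" block.
MOTIF2_REGEX = re.compile(
    r"[AGT][CGT][TC]GCA[AT][CA]A[GAC][AC]T[TG][CTA][TG][TA]T[ACT]T"
)

# Two sites from the "Motif 2 sites sorted by position p-value" table.
sites = {
    "purL": "AGCGCATCAGATTCTTTAT",  # best site, p-value 1.37e-09
    "purD": "GCGGCAATCACTTCGTCAT",  # weakest site, p-value 1.36e-04
}
matches = {name: bool(MOTIF2_REGEX.fullmatch(seq)) for name, seq in sites.items()}
print(matches)

# Scanning the purL site together with its printed 10-letter flanks.
window = "CGGTTTCGTC" + sites["purL"] + "AATGACGCCC"
hit = MOTIF2_REGEX.search(window)
print(hit.start(), hit.group())
```

A regex scan is therefore stricter than the probabilistic model in some columns; scoring with the position-specific scoring matrix is the more faithful way to search for weak sites.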
********************************************************************************


********************************************************************************
MOTIF  3    width =   10   sites =   3   llr = 39   E-value = 1.1e+004
********************************************************************************
--------------------------------------------------------------------------------
    Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::::a:a:::
pos.-specific     C  :aaa:::77:
probability       G  a::::a:33a
matrix            T  ::::::::::

         bits    2.1 **** *   *
                 1.9 *******  *
                 1.7 *******  *
                 1.5 *******  *
Relative         1.3 **********
Entropy          1.0 **********
(18.7 bits)      0.8 **********
                 0.6 **********
                 0.4 **********
                 0.2 **********
                 0.0 ----------

Multilevel           GCCCAGACCG
consensus                   GG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value               Site
-------------            ------ -----  ---------            ----------
purK                         +     54  6.53e-07 CTGGCGCAAA GCCCAGACCG ACGAAGTGCT
purF                         +     52  1.96e-06 CAAGTTTCTT GCCCAGAGCG TAAGTGCTCT
guaA                         -     17  1.96e-06 AGAATCAGGA GCCCAGACGG TAGTTCGGGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
purK                              6.5e-07  53_[+3]_37
purF                                2e-06  51_[+3]_39
guaA                                2e-06  16_[-3]_74
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=10 seqs=3
purK                     (   54) GCCCAGACCG  1
purF                     (   52) GCCCAGAGCG  1
guaA                     (   17) GCCCAGACGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 10 n= 1547 bayes= 9.4557 E= 1.1e+004
  -823   -823    209   -823
  -823    209   -823   -823
  -823    209   -823   -823
  -823    209   -823   -823
   191   -823   -823   -823
  -823   -823    209   -823
   191   -823   -823   -823
  -823    150     50   -823
  -823    150     50   -823
  -823   -823    209   -823
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 10 nsites= 3 E= 1.1e+004
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.666667  0.333333  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
    Motif 3 regular expression
--------------------------------------------------------------------------------
GCCCAGA[CG][CG]G
--------------------------------------------------------------------------------




Time  3.70 secs.

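The "(18.7 bits)" figure in the Motif 3 description can be approximated from the letter-probability matrix. The sketch below assumes that the figure is the relative entropy of the motif against the background, summed over columns: sum of p * log2(p / background). That formula is an assumption that happens to agree with the printed value, not a statement of how MEME computes it. The helper name `relative_entropy` is hypothetical, and the background uses the rounded frequencies shown in this file.

```python
import math

# Background frequencies (A, C, G, T) as printed in this file.
BACKGROUND = [0.265, 0.235, 0.235, 0.265]

# Motif 3 letter-probability matrix, one row per motif position (A, C, G, T).
MOTIF3_PROBS = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.666667, 0.333333, 0.0],
    [0.0, 0.666667, 0.333333, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]


def relative_entropy(matrix, background):
    """Sum p * log2(p / bg) over all positions and letters (zero terms skipped)."""
    total = 0.0
    for row in matrix:
        for p, bg in zip(row, background):
            if p > 0:
                total += p * math.log2(p / bg)
    return total


bits = relative_entropy(MOTIF3_PROBS, BACKGROUND)
print(f"{bits:.1f} bits")
```

A high relative entropy alone does not make a motif significant: Motif 3 is built from only 3 sites and carries an E-value of 1.1e+004.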
********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
    Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
guaB                             3.00e-07  3_[-2(4.94e-06)]_9_[+1(1.35e-07)]_52
guaA                             8.32e-05  16_[-3(1.96e-06)]_36_[-2(3.12e-05)]_19
purB                             3.23e-04  18_[+2(9.56e-08)]_63
purR                             7.67e-08  6_[+2(2.12e-06)]_15_[+1(1.19e-08)]_43
purL                             5.16e-11  8_[+1(6.85e-09)]_[+2(1.37e-09)]_56
purF                             4.71e-05  2_[+2(1.16e-05)]_30_[+3(1.96e-06)]_39
purN                             6.45e-04  62_[-1(1.86e-06)]_21
purA                             1.04e-05  28_[+2(1.50e-05)]_31_[+1(1.88e-07)]_5
purT                             1.44e-07  20_[-2(8.33e-05)]_8_[+1(2.55e-10)]_36
purU                             1.69e-02  26_[-2(7.31e-06)]_55
purK                             1.61e-05  22_[+2(6.64e-06)]_12_[+3(6.53e-07)]_37
purE                             1.64e-07  13_[+1(9.89e-09)]_44_[+2(4.94e-06)]_7
folD                             2.25e-02  55_[+2(9.66e-06)]_26
purD                             2.28e-01  100
purM                             6.16e-07  21_[+1(4.31e-09)]_10_[+2(2.67e-05)]_33
purH                             2.78e-02  27_[+2(4.85e-05)]_54
purC                             9.71e-03  53_[+2(6.64e-06)]_28
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: kodomo.fbb.msu.ru

********************************************************************************
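The combined block diagrams encode each sequence as spacer lengths and motif occurrences joined by underscores. The sketch below parses that notation as it appears in this file and recovers the 1-based start of every site. It is a minimal illustration, not a general MEME parser: `parse_diagram` and `TOKEN` are hypothetical names, and the motif widths are taken from the three MOTIF headers of this report.

```python
import re

# Motif widths as reported in the MOTIF 1, 2 and 3 headers of this file.
MOTIF_WIDTHS = {1: 17, 2: 19, 3: 10}

# One motif occurrence, e.g. "[-2(4.94e-06)]": strand, motif number, p-value.
TOKEN = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Parse a combined block diagram.

    Returns (sites, total_length), where each site is a tuple
    (motif number, strand, 1-based start, p-value).
    """
    position = 0  # number of letters consumed so far
    sites = []
    for token in diagram.split("_"):
        match = TOKEN.fullmatch(token)
        if match:
            strand = match.group(1)
            motif = int(match.group(2))
            pvalue = float(match.group(3))
            sites.append((motif, strand, position + 1, pvalue))
            position += widths[motif]
        else:
            position += int(token)  # plain number: spacer length
    return sites, position


# guaB row of the summary table.
sites, length = parse_diagram("3_[-2(4.94e-06)]_9_[+1(1.35e-07)]_52")
print(sites, length)

# purL row: the two motifs are adjacent, so no spacer separates them.
purl_sites, purl_length = parse_diagram("8_[+1(6.85e-09)]_[+2(1.37e-09)]_56")
print(purl_sites, purl_length)
```

The recovered starts (4 and 32 for guaB, 9 and 26 for purL) agree with the per-motif site tables, and the spacers plus motif widths add up to the sequence length of 100 given in the training set.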