********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.3.0 (Release date: Sat Sep 26 01:51:56 PDT 2009)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************

********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************

********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= results/meme.fasta
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
guaA                     1.0000    101  purA                     1.0000    101
purC                     1.0000    101  purE                     1.0000    101
purK                     1.0000    101  purL                     1.0000    101
purM                     1.0000    101  purR                     1.0000    101
purT                     1.0000    101
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme results/meme.fasta -mod zoops -nmotifs 3 -prior dirichlet -revcomp -nostatus -dna -oc results/

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=            8    maxw=           50    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        9    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=             909    N=               9
strands: + -
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.333 C 0.167 G 0.167 T 0.333
Background letter frequencies (from dataset with add-one prior applied):
A 0.333 C 0.167 G 0.167 T 0.333
********************************************************************************

********************************************************************************
MOTIF 1 width = 16 sites = 6 llr = 95 E-value = 1.5e-003
********************************************************************************
--------------------------------------------------------------------------------
Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  2a2::aaa::2::32:
pos.-specific     C  ::8:7:::82::3277
probability       G  :::a2:::28:::32:
matrix            T  8:::2:::::8a72:3

         bits    2.6    *
                 2.3    *
                 2.1    *
                 1.8   **    **
Relative         1.5  *** ***** *
Entropy          1.3  ********* *  **
(22.8 bits)      1.0 ************* **
                 0.8 ************* **
                 0.5 ************* **
                 0.3 ****************
                 0.0 ----------------

Multilevel           TACGCAAACGTTTACC
consensus                        CG T
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value                  Site
-------------            ------  ----- ---------           ----------------
purM                         +     21  5.49e-10 CAAAATTCCT TACGCAAACGTTTCCC TTTTTACATT
purT                         -     31  1.37e-09 ATTAATTAAA TACGCAAACGTTTGCT ATTTATTCCT
purE                         +     15  1.37e-09 AGCTTTTCCT TACGCAAACGTTTTCC TTAATAAAAA
purA                         -     59  4.02e-08 TCAGGGTGGT TACGTAAACGTTCAGC TAAGTCACGG
purC                         -     13  3.19e-07 TACCACAAGA AAAGCAAACGATTGCT ATTTTTTCTG
purK                         -     34  4.58e-07 GTTTGTTCTT TACGGAAAGCTTCAAC CGCAGCCATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
purM                              5.5e-10  20_[+1]_65
purT                              1.4e-09  30_[-1]_55
purE                              1.4e-09  14_[+1]_71
purA                                4e-08  58_[-1]_27
purC                              3.2e-07  12_[-1]_73
purK                              4.6e-07  33_[-1]_52
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=16 seqs=6
purM                     (   21) TACGCAAACGTTTCCC  1
purT                     (   31) TACGCAAACGTTTGCT  1
purE                     (   15) TACGCAAACGTTTTCC  1
purA                     (   59) TACGTAAACGTTCAGC  1
purC                     (   13) AAAGCAAACGATTGCT  1
purK                     (   34) TACGGAAAGCTTCAAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 774 bayes= 8.10553 E= 1.5e-003
  -100   -923   -923    132
   158   -923   -923   -923
  -100    232   -923   -923
  -923   -923    258   -923
  -923    200      0   -100
   158   -923   -923   -923
   158   -923   -923   -923
   158   -923   -923   -923
  -923    232      0   -923
  -923      0    232   -923
  -100   -923   -923    132
  -923   -923   -923    158
  -923    100   -923    100
     0      0    100   -100
  -100    200      0   -923
  -923    200   -923      0
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 6 E= 1.5e-003
 0.166667  0.000000  0.000000  0.833333
 1.000000  0.000000  0.000000  0.000000
 0.166667  0.833333  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.666667  0.166667  0.166667
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
 0.000000  0.166667  0.833333  0.000000
 0.166667  0.000000  0.000000  0.833333
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.333333  0.000000  0.666667
 0.333333  0.166667  0.333333  0.166667
 0.166667  0.666667  0.166667  0.000000
 0.000000  0.666667  0.000000  0.333333
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 1 regular expression
--------------------------------------------------------------------------------
TACGCAAACGTT[TC][AG]C[CT]
--------------------------------------------------------------------------------

Time  0.52 secs.

********************************************************************************

********************************************************************************
MOTIF 2 width = 8 sites = 4 llr = 40 E-value = 6.0e+003
********************************************************************************
--------------------------------------------------------------------------------
Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  55::::::
pos.-specific     C  :::::8:a
probability       G  55:a53a:
matrix            T  ::a:5:::

         bits    2.6    *  **
                 2.3    *  **
                 2.1    *  **
                 1.8    * ***
Relative         1.5   ** ***
Entropy          1.3   ** ***
(14.4 bits)      1.0 ********
                 0.8 ********
                 0.5 ********
                 0.3 ********
                 0.0 --------

Multilevel           AATGGCGC
consensus            GG  TG
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value              Site
-------------            ------  ----- ---------           --------
purT                         -     69  8.42e-06 AATAAAGAAA GATGGCGC TATCATACTA
purR                         -     42  8.42e-06 AACAAAAAAA GGTGTCGC TCTAAACTAG
purM                         +     57  2.40e-05 GTGCTGTTAG AATGGCGC GGATTTTTGA
guaA                         -     67  5.52e-05 TCATTTATTT AGTGTGGC TTTCATATAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
purT                              8.4e-06  68_[-2]_25
purR                              8.4e-06  41_[-2]_52
purM                              2.4e-05  56_[+2]_37
guaA                              5.5e-05  66_[-2]_27
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=8 seqs=4
purT                     (   69) GATGGCGC  1
purR                     (   42) GGTGTCGC  1
purM                     (   57) AATGGCGC  1
guaA                     (   67) AGTGTGGC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 846 bayes= 6.86419 E= 6.0e+003
    59   -865    158   -865
    59   -865    158   -865
  -865   -865   -865    158
  -865   -865    258   -865
  -865   -865    158     59
  -865    216     58   -865
  -865   -865    258   -865
  -865    258   -865   -865
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 4 E= 6.0e+003
 0.500000  0.000000  0.500000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  0.750000  0.250000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 2 regular expression
--------------------------------------------------------------------------------
[AG][AG]TG[GT][CG]GC
--------------------------------------------------------------------------------

Time  0.91 secs.
********************************************************************************

********************************************************************************
MOTIF 3 width = 8 sites = 2 llr = 23 E-value = 1.2e+004
********************************************************************************
--------------------------------------------------------------------------------
Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  ::a:5:::
pos.-specific     C  aa::::aa
probability       G  :::a55::
matrix            T  :::::5::

         bits    2.6 ** *  **
                 2.3 ** *  **
                 2.1 ** *  **
                 1.8 ** *  **
Relative         1.5 ****  **
Entropy          1.3 ****  **
(16.7 bits)      1.0 ********
                 0.8 ********
                 0.5 ********
                 0.3 ********
                 0.0 --------

Multilevel           CCAGAGCC
consensus                GT
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value              Site
-------------            ------  ----- ---------           --------
purR                         +      2  1.21e-06          A CCAGGGCC TATTTATCTT
purK                         +     79  1.08e-05 GTTAGAGAAC CCAGATCC ATCAGAGGAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
purR                              1.2e-06  1_[+3]_92
purK                              1.1e-05  78_[+3]_15
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=8 seqs=2
purR                     (    2) CCAGGGCC  1
purK                     (   79) CCAGATCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 846 bayes= 8.7211 E= 1.2e+004
  -765    258   -765   -765
  -765    258   -765   -765
   158   -765   -765   -765
  -765   -765    258   -765
    58   -765    158   -765
  -765   -765    158     58
  -765    258   -765   -765
  -765    258   -765   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 2 E= 1.2e+004
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.500000  0.000000  0.500000  0.000000
 0.000000  0.000000  0.500000  0.500000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Motif 3 regular expression
--------------------------------------------------------------------------------
CCAG[AG][GT]CC
--------------------------------------------------------------------------------

Time  1.26 secs.
********************************************************************************

********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
guaA                             8.38e-02  66_[-2(5.52e-05)]_27
purA                             7.68e-05  58_[-1(4.02e-08)]_27
purC                             9.87e-04  12_[-1(3.19e-07)]_73
purE                             2.81e-05  14_[+1(1.37e-09)]_71
purK                             1.34e-05  33_[-1(4.58e-07)]_29_[+3(1.08e-05)]_15
purL                             7.05e-01  101
purM                             6.76e-08  20_[+1(5.49e-10)]_20_[+2(2.40e-05)]_37
purR                             3.29e-05  1_[+3(1.21e-06)]_32_[-2(8.42e-06)]_52
purT                             7.64e-08  30_[-1(1.37e-09)]_22_[-2(8.42e-06)]_25
--------------------------------------------------------------------------------

********************************************************************************

********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: kodomo.fbb.msu.ru

********************************************************************************
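
The combined block diagrams above encode each sequence as spacer lengths separated by motif occurrences, for example 33_[-1(4.58e-07)]_29_[+3(1.08e-05)]_15 for purK. The Python sketch below shows one possible way to turn such a string back into 1-based start positions and a total sequence length. It is an illustration only: parse_diagram and MOTIF_WIDTHS are hypothetical names that are not part of MEME, and the motif widths (16, 8, 8) are copied by hand from the MOTIF header lines of this report.

```python
import re

# Motif widths copied by hand from the "MOTIF n width = ..." headers of this report.
MOTIF_WIDTHS = {1: 16, 2: 8, 3: 8}

# One motif occurrence: strand sign, motif number, p-value in parentheses.
SITE = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Return (sites, total_length) for one combined block diagram string.

    Each site is (motif, strand, start, p_value) with a 1-based start.
    Hypothetical helper for illustration, not a MEME function.
    """
    sites = []
    pos = 0  # sequence positions consumed so far
    for token in diagram.split("_"):
        match = SITE.fullmatch(token)
        if match:
            strand = match.group(1)
            motif = int(match.group(2))
            p_value = float(match.group(3))
            sites.append((motif, strand, pos + 1, p_value))
            pos += widths[motif]
        else:
            pos += int(token)  # spacer: a run of positions outside any motif
    return sites, pos


if __name__ == "__main__":
    sites, length = parse_diagram("33_[-1(4.58e-07)]_29_[+3(1.08e-05)]_15")
    print(sites, length)
```

For the purK diagram this recovers starts 34 and 79, which agree with the Start columns of the Motif 1 and Motif 3 site tables, and a total length of 101, which agrees with the TRAINING SET table.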