********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.3.0 (Release date: Sat Sep 26 01:51:56 PDT 2009)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs. MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motiv/meme.fasta
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
purT                     1.0000    100  purA                     1.0000    100
purL                     1.0000    100  folD                     1.0000    100
purD                     1.0000    100  guaA                     1.0000    100
purH                     1.0000    100  purM                     1.0000    100
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme motiv/meme.fasta -mod zoops -nmotifs 3 -prior dirichlet -revcomp -nostatus -dna -oc motiv/

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=            8    maxw=           50    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=             800    N=               8
strands: + -
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.268 C 0.232 G 0.232 T 0.268
Background letter frequencies (from dataset with add-one prior applied):
A 0.268 C 0.232 G 0.232 T 0.268
********************************************************************************


********************************************************************************
MOTIF  1        width =   11   sites =   8   llr = 84   E-value = 5.8e-002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  941189a::::
pos.-specific     C  16:9:1:9:13
probability       G  ::9:1::1a4:
matrix            T  ::::1::::58

         bits    2.1         *
                 1.9       * *
                 1.7       * *
                 1.5 * ** ****
Relative         1.3 * ** ****
Entropy          1.1 **** **** *
(15.1 bits)      0.8 ********* *
                 0.6 ***********
                 0.4 ***********
                 0.2 ***********
                 0.0 -----------

Multilevel           ACGCAAACGTT
consensus             A       GC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value               Site
-------------            ------  ----- ---------            -----------
purT                         +      50  2.49e-07 GAAAAGACAT ACGCAAACGTT TTCGTATATA
purL                         +      11  4.65e-07 TTTTATTTCT ACGCAAACGGT TTCGTCGGCG
purM                         -      28  7.52e-07 CTAACAGGGA AAGCAAACGTT TGCGAGCGTG
purA                         +      79  1.00e-06 ATCCATTTTT AAGCAAACGGT GATTTTGAAA
folD                         +      12  4.83e-06 TGCATCGGAA ACACAAACGTT AACTGACAGC
guaA                         -      44  3.85e-05 AAATCGCCCG ACGCGAAGGTC GGGCGAAGAA
purD                         +      33  5.00e-05 GCCGCCGCCG ACGAACACGGC ATTGCGATGA
purH                         -      30  6.35e-05 GTGAAAAACT CAGCTAACGCT CCTTATGGGG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
purT                              2.5e-07  49_[+1]_40
purL                              4.6e-07  10_[+1]_79
purM                              7.5e-07  27_[-1]_62
purA                                1e-06  78_[+1]_11
folD                              4.8e-06  11_[+1]_78
guaA                              3.8e-05  43_[-1]_46
purD                                5e-05  32_[+1]_57
purH                              6.4e-05  29_[-1]_60
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=11 seqs=8
purT                     (   50) ACGCAAACGTT  1
purL                     (   11) ACGCAAACGGT  1
purM                     (   28) AAGCAAACGTT  1
purA                     (   79) AAGCAAACGGT  1
folD                     (   12) ACACAAACGTT  1
guaA                     (   44) ACGCGAAGGTC  1
purD                     (   33) ACGAACACGGC  1
purH                     (   30) CAGCTAACGCT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 11 n= 720 bayes= 6.47573 E= 5.8e-002
   171    -89   -965   -965
    48    143   -965   -965
  -110   -965    191   -965
  -110    191   -965   -965
   148   -965    -89   -110
   171    -89   -965   -965
   190   -965   -965   -965
  -965    191    -89   -965
  -965   -965    211   -965
  -965    -89     69     90
  -965     11   -965    148
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 11 nsites= 8 E= 5.8e-002
 0.875000  0.125000  0.000000  0.000000
 0.375000  0.625000  0.000000  0.000000
 0.125000  0.000000  0.875000  0.000000
 0.125000  0.875000  0.000000  0.000000
 0.750000  0.000000  0.125000  0.125000
 0.875000  0.125000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.875000  0.125000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.125000  0.375000  0.500000
 0.000000  0.250000  0.000000  0.750000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
A[CA]GCAAACG[TG][TC]
--------------------------------------------------------------------------------

Time 0.34 secs.

********************************************************************************


********************************************************************************
MOTIF  2        width =    8   sites =   8   llr = 65   E-value = 8.6e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  ::3:::9a
pos.-specific     C  5:64:a1:
probability       G  4a15a:::
matrix            T  1::1::::

         bits    2.1  *  **
                 1.9  *  ** *
                 1.7  *  ** *
                 1.5  *  ****
Relative         1.3  *  ****
Entropy          1.1  *  ****
(11.7 bits)      0.8  ** ****
                 0.6 ********
                 0.4 ********
                 0.2 ********
                 0.0 --------

Multilevel           CGCGGCAA
consensus            G AC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value              Site
-------------            ------  ----- ---------            --------
purH                         +       2  3.36e-05          T CGCCGCAA AATGAGAATG
purD                         -      21  3.36e-05 GTTCGTCGGC GGCGGCAA TGACTTCATC
purT                         +      16  3.36e-05 TATATTGCAA GGCGGCAA GAAAAGCAAT
guaA                         -      81  5.77e-05 ACGCTTATTC CGAGGCAA GTGAAACAGA
purA                         -      16  1.09e-04 GCACTCAATC TGCGGCAA ATCCGACCAC
purM                         -      51  1.92e-04 AAGTTAAATT CGGCGCAA TTCTAACAGG
purL                         -      49  2.32e-04 GGGGGGAAAC GGCCGCCA TTATAAAGAA
folD                         -       1  2.62e-04 TTTGTGTTTC CGATGCAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
purH                              3.4e-05  1_[+2]_91
purD                              3.4e-05  20_[-2]_72
purT                              3.4e-05  15_[+2]_77
guaA                              5.8e-05  80_[-2]_12
purA                              0.00011  15_[-2]_77
purM                              0.00019  50_[-2]_42
purL                              0.00023  48_[-2]_44
folD                              0.00026  [-2]_92
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=8 seqs=8
purH                     (    2) CGCCGCAA  1
purD                     (   21) GGCGGCAA  1
purT                     (   16) GGCGGCAA  1
guaA                     (   81) CGAGGCAA  1
purA                     (   16) TGCGGCAA  1
purM                     (   51) CGGCGCAA  1
purL                     (   49) GGCCGCCA  1
folD                     (    1) CGATGCAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 744 bayes= 6.52356 E= 8.6e+001
  -965    111     69   -110
  -965   -965    211   -965
   -10    143    -89   -965
  -965     69    111   -110
  -965   -965    211   -965
  -965    211   -965   -965
   171    -89   -965   -965
   190   -965   -965   -965
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 8 E= 8.6e+001
 0.000000  0.500000  0.375000  0.125000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.625000  0.125000  0.000000
 0.000000  0.375000  0.500000  0.125000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.875000  0.125000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
[CG]G[CA][GC]GCAA
--------------------------------------------------------------------------------

Time 0.59 secs.

********************************************************************************


********************************************************************************
MOTIF  3        width =   15   sites =   5   llr = 68   E-value = 1.6e+003
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  2:::::::::2228:
pos.-specific     C  :2a::8:a484::::
probability       G  8::2a28:4::66::
matrix            T  :8:8::2:224222a

         bits    2.1   * *  *
                 1.9   * *  *      *
                 1.7   * *  *      *
                 1.5   * ** *      *
Relative         1.3 ******** *   **
Entropy          1.1 ******** *   **
(19.6 bits)      0.8 ******** *   **
                 0.6 ********** ****
                 0.4 ***************
                 0.2 ***************
                 0.0 ---------------

Multilevel           GTCTGCGCCCCGGAT
consensus            AC G GT GTTAAT
sequence                     T ATT

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name            Strand  Start   P-value                 Site
-------------            ------  ----- ---------            ---------------
purM                         -       3  2.81e-08 TGCGAGCGTG GTCTGCGCCCTGGTT AT
purD                         -      85  1.92e-07          C GTCTGCTCCCTGAAT TAATGGCGGA
guaA                         +      18  2.74e-07 CCGAACTACC GTCTGGGCTCCTGAT TTTCTTCGCC
purL                         +      25  5.24e-07 AAACGGTTTC GTCGGCGCGTCAGAT TCTTTATAAT
purT                         -      68  1.04e-06 CCCCTTATTT ACCTGCGCGCAGTAT ATACGAAAAC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
purM                              2.8e-08  2_[-3]_83
purD                              1.9e-07  84_[-3]_1
guaA                              2.7e-07  17_[+3]_68
purL                              5.2e-07  24_[+3]_61
purT                                1e-06  67_[-3]_18
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=15 seqs=5
purM                     (    3) GTCTGCGCCCTGGTT  1
purD                     (   85) GTCTGCTCCCTGAAT  1
guaA                     (   18) GTCTGGGCTCCTGAT  1
purL                     (   25) GTCGGCGCGTCAGAT  1
purT                     (   68) ACCTGCGCGCAGTAT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 15 n= 688 bayes= 7.34704 E= 1.6e+003
   -42   -897    178   -897
  -897    -21   -897    158
  -897    211   -897   -897
  -897   -897    -21    158
  -897   -897    211   -897
  -897    178    -21   -897
  -897   -897    178    -42
  -897    211   -897   -897
  -897     78     78    -42
  -897    178   -897    -42
   -42     78   -897     58
   -42   -897    137    -42
   -42   -897    137    -42
   158   -897   -897    -42
  -897   -897   -897    190
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 15 nsites= 5 E= 1.6e+003
 0.200000  0.000000  0.800000  0.000000
 0.000000  0.200000  0.000000  0.800000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.200000  0.800000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.000000  0.800000  0.200000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.400000  0.400000  0.200000
 0.000000  0.800000  0.000000  0.200000
 0.200000  0.400000  0.000000  0.400000
 0.200000  0.000000  0.600000  0.200000
 0.200000  0.000000  0.600000  0.200000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.000000  0.000000  1.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GA][TC]C[TG]G[CG][GT]C[CGT][CT][CTA][GAT][GAT][AT]T
--------------------------------------------------------------------------------

Time 0.77 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
purT                             1.53e-08  15_[+2(3.36e-05)]_26_[+1(2.49e-07)]_7_[-3(1.04e-06)]_18
purA                             2.78e-04  78_[+1(1.00e-06)]_11
purL                             8.33e-08  10_[+1(4.65e-07)]_3_[+3(5.24e-07)]_61
folD                             2.00e-03  11_[+1(4.83e-06)]_78
purD                             4.12e-07  20_[-2(3.36e-05)]_4_[+1(5.00e-05)]_41_[-3(1.92e-07)]_1
guaA                             7.31e-07  17_[+3(2.74e-07)]_11_[-1(3.85e-05)]_26_[-2(5.77e-05)]_12
purH                             2.21e-03  1_[+2(3.36e-05)]_20_[-1(6.35e-05)]_18_[+1(6.35e-05)]_31
purM                             7.47e-09  2_[-3(2.81e-08)]_10_[-1(7.52e-07)]_62
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: kodomo

********************************************************************************