********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 5.1.1 (Release date: Wed Jan 29 15:00:42 2020 -0800)

For further information on how to interpret please access http://meme-suite.org/.
To get a copy of the MEME software please access http://meme-suite.org.

********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
PRIMARY SEQUENCES= upstream_forkozak.fasta
CONTROL SEQUENCES= --none--
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
up_S                     1.0000     27  up_E                     1.0000     29
up_M                     1.0000     28  up_NS6                   1.0000     28
up_n                     1.0000     28  up_NS7a                  1.0000     29
up_NS7b                  1.0000     28  up_NS7c                  1.0000     27
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme upstream_forkozak.fasta -dna -oc . -nostatus -time 18000 -mod zoops -nmotifs 3 -minw 3 -maxw 50 -objfun classic -markov_order 0

model:  mod=         zoops    nmotifs=         3    evt=           inf
objective function:
            em=       E-value of product of p-values
            starts=   E-value of product of p-values
strands: +
width:  minw=            3    maxw=           29
nsites: minsites=        2    maxsites=        8    wnsites=       0.8
theta:  spmap=         uni    spfuzz=        0.5
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
trim:   wg=             11    ws=              1    endgaps=       yes
data:   n=             224    N=               8
sample: seed=            0    hsfrac=          0
        searchsize=    224    norand=         no    csites=       1000
Letter frequencies in dataset:
A 0.312 C 0.179 G 0.228 T 0.281
Background letter frequencies (from file dataset with add-one prior applied):
A 0.312 C 0.179 G 0.228 T 0.281
Background model order: 0
********************************************************************************


********************************************************************************
MOTIF ATGKBMGWSRNCC MEME-1	width =  13  sites =   7  llr = 61  E-value = 2.0e-001
********************************************************************************
--------------------------------------------------------------------------------
	Motif ATGKBMGWSRNCC MEME-1 Description
--------------------------------------------------------------------------------
Simplified        A  a:1::436:431:
pos.-specific     C  ::::34:141369
probability       G  ::944:7:63311
matrix            T  :a:631:3:111:

         bits    2.5
                 2.2
                 2.0
                 1.7 **          *
Relative         1.5 ***         *
Entropy          1.2 ***   * *   *
(12.7 bits)      1.0 ****  * *   *
                 0.7 ****  * *   *
                 0.5 *********  **
                 0.2 *********  **
                 0.0 -------------

Multilevel           ATGTGAGAGAACC
consensus               GCCATCGC
sequence                 T     G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif ATGKBMGWSRNCC MEME-1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                 Site
-------------             ----- ---------            -------------
up_NS7c                      12  2.28e-07 TTCTTTTGGA ATGTTCGAGACCC TTT
up_M                         13  3.82e-06 AGAAATAACT ATGTCTGACGCCC AAG
up_NS6                       13  1.27e-05 AGTATTTATA ATGTGCAACTGCC TTT
up_NS7a                      14  2.64e-05 CCAGGCCGTT ATGGCAGAGAAAC GAA
up_NS7b                      13  8.34e-05 AATAAATTTC ATGGGCACCAGTC AAT
up_n                         11  1.41e-04 CACCAAACCA ATATGAGTGCTCC GGTTG
up_E                         14  2.74e-04 AATTAAAGGA ATGGTAGTGGAGG ATT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif ATGKBMGWSRNCC MEME-1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
up_NS7c                           2.3e-07  11_[+1]_3
up_M                              3.8e-06  12_[+1]_3
up_NS6                            1.3e-05  12_[+1]_3
up_NS7a                           2.6e-05  13_[+1]_3
up_NS7b                           8.3e-05  12_[+1]_3
up_n                              0.00014  10_[+1]_5
up_E                              0.00027  13_[+1]_3
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif ATGKBMGWSRNCC MEME-1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF ATGKBMGWSRNCC width=13 seqs=7
up_NS7c                  (   12) ATGTTCGAGACCC  1
up_M                     (   13) ATGTCTGACGCCC  1
up_NS6                   (   13) ATGTGCAACTGCC  1
up_NS7a                  (   14) ATGGCAGAGAAAC  1
up_NS7b                  (   13) ATGGGCACCAGTC  1
up_n                     (   11) ATATGAGTGCTCC  1
up_E                     (   14) ATGGTAGTGGAGG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif ATGKBMGWSRNCC MEME-1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 13 n= 128 bayes= 5.38082 E= 2.0e-001
   168   -945   -945   -945
  -945   -945   -945    183
  -113   -945    191   -945
  -945   -945     91    102
  -945     68     91      2
    46    126   -945    -98
   -13   -945    165   -945
    87    -32   -945      2
  -945    126    133   -945
    46    -32     33    -98
   -13     68     33    -98
  -113    168    -67    -98
  -945    226    -67   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif ATGKBMGWSRNCC MEME-1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 13 nsites= 7 E= 2.0e-001
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.142857  0.000000  0.857143  0.000000
 0.000000  0.000000  0.428571  0.571429
 0.000000  0.285714  0.428571  0.285714
 0.428571  0.428571  0.000000  0.142857
 0.285714  0.000000  0.714286  0.000000
 0.571429  0.142857  0.000000  0.285714
 0.000000  0.428571  0.571429  0.000000
 0.428571  0.142857  0.285714  0.142857
 0.285714  0.285714  0.285714  0.142857
 0.142857  0.571429  0.142857  0.142857
 0.000000  0.857143  0.142857  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif ATGKBMGWSRNCC MEME-1 regular expression
--------------------------------------------------------------------------------
ATG[TG][GCT][AC][GA][AT][GC][AG][ACG]CC
--------------------------------------------------------------------------------




Time  0.05 secs.

********************************************************************************


********************************************************************************
MOTIF GGCCG MEME-2	width =   5  sites =   2  llr = 16  E-value = 2.5e+001
********************************************************************************
--------------------------------------------------------------------------------
	Motif GGCCG MEME-2 Description
--------------------------------------------------------------------------------
Simplified        A  :::::
pos.-specific     C  ::aa:
probability       G  aa::a
matrix            T  :::::

         bits    2.5   **
                 2.2 *****
                 2.0 *****
                 1.7 *****
Relative         1.5 *****
Entropy          1.2 *****
(11.4 bits)      1.0 *****
                 0.7 *****
                 0.5 *****
                 0.2 *****
                 0.0 -----

Multilevel           GGCCG
consensus
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GGCCG MEME-2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value             Site
-------------             ----- ---------            -----
up_NS7a                       7  3.80e-04     TTTCCA GGCCG TTATGGCAGA
up_S                          2  3.80e-04          A GGCCG CCAGAATGCA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GGCCG MEME-2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
up_NS7a                           0.00038  6_[+2]_18
up_S                              0.00038  1_[+2]_21
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GGCCG MEME-2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF GGCCG width=5 seqs=2
up_NS7a                  (    7) GGCCG  1
up_S                     (    2) GGCCG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GGCCG MEME-2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 5 n= 192 bayes= 6.56986 E= 2.5e+001
  -765   -765    213   -765
  -765   -765    213   -765
  -765    248   -765   -765
  -765    248   -765   -765
  -765   -765    213   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GGCCG MEME-2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 5 nsites= 2 E= 2.5e+001
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif GGCCG MEME-2 regular expression
--------------------------------------------------------------------------------
GGCCG
--------------------------------------------------------------------------------




Time  0.06 secs.

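Note: the "Relative Entropy" figure in each Description block appears consistent with summing p * log2(p / background) over every cell of the letter-probability matrix. The following is a minimal sketch under that assumption, using the rounded background frequencies printed in the command line summary; the names BACKGROUND, relative_entropy and GGCCG_PPM are illustrative and not part of the MEME output.

```python
import math

# Background letter frequencies as printed in this report (rounded to 3 digits).
BACKGROUND = {"A": 0.312, "C": 0.179, "G": 0.228, "T": 0.281}


def relative_entropy(ppm, background=BACKGROUND):
    """Sum p * log2(p / background) over all positions and letters, in bits.

    `ppm` is a list of rows, each row giving probabilities in A, C, G, T order.
    Cells with probability 0 contribute nothing to the sum.
    """
    total = 0.0
    for row in ppm:
        for letter, p in zip("ACGT", row):
            if p > 0:
                total += p * math.log2(p / background[letter])
    return total


# Letter-probability matrix of MEME-2 (GGCCG), copied from the report.
GGCCG_PPM = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

bits = relative_entropy(GGCCG_PPM)
print(f"{bits:.2f} bits")  # compare with the 11.4 bits reported for MEME-2
```

Because the background frequencies are rounded in the report, small differences from the printed value are expected when the same calculation is applied to the other motifs.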
********************************************************************************


********************************************************************************
MOTIF CCA MEME-3	width =   3  sites =   2  llr = 9  E-value = 2.4e+002
********************************************************************************
--------------------------------------------------------------------------------
	Motif CCA MEME-3 Description
--------------------------------------------------------------------------------
Simplified        A  ::a
pos.-specific     C  aa:
probability       G  :::
matrix            T  :::

         bits    2.5 **
                 2.2 **
                 2.0 **
                 1.7 ***
Relative         1.5 ***
Entropy          1.2 ***
(6.6 bits)       1.0 ***
                 0.7 ***
                 0.5 ***
                 0.2 ***
                 0.0 ---

Multilevel           CCA
consensus
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CCA MEME-3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            ---
up_NS7a                       4  1.00e-02        TTT CCA GGCCGTTATG
up_S                          7  1.00e-02     AGGCCG CCA GAATGCAGAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CCA MEME-3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
up_NS7a                              0.01  3_[+3]_23
up_S                                 0.01  6_[+3]_18
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CCA MEME-3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF CCA width=3 seqs=2
up_NS7a                  (    4) CCA  1
up_S                     (    7) CCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CCA MEME-3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 3 n= 208 bayes= 6.6865 E= 2.4e+002
  -765    248   -765   -765
  -765    248   -765   -765
   167   -765   -765   -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CCA MEME-3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 3 nsites= 2 E= 2.4e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif CCA MEME-3 regular expression
--------------------------------------------------------------------------------
CCA
--------------------------------------------------------------------------------




Time  0.08 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
up_S                             3.94e-02  27
up_E                             8.34e-02  29
up_M                             4.80e-04  12_[+1(3.82e-06)]_3
up_NS6                           2.79e-03  12_[+1(1.27e-05)]_3
up_n                             9.35e-03  28
up_NS7a                          1.11e-04  13_[+1(2.64e-05)]_3
up_NS7b                          1.02e-02  12_[+1(8.34e-05)]_3
up_NS7c                          8.40e-05  11_[+1(2.28e-07)]_3
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because requested number of motifs (3) found.
********************************************************************************

CPU: ip-172-31-1-5

********************************************************************************
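Note: each motif diagram in this report alternates spacer lengths with bracketed motif occurrences, so the spacers plus the motif widths should add up to the sequence length listed in the training set. The following is a minimal sketch of that consistency check, not part of the MEME output; the names MOTIF_WIDTHS and diagram_length are illustrative, and the parsing rule (tokens separated by underscores) is an assumption based on the diagrams shown here.

```python
import re

# Motif widths as reported in this file: MEME-1 = 13, MEME-2 = 5, MEME-3 = 3.
MOTIF_WIDTHS = {1: 13, 2: 5, 3: 3}

# One bracketed occurrence, e.g. "[+1]" or "[+1(3.82e-06)]".
OCCURRENCE = re.compile(r"\[([+-])(\d+)(?:\(([^)]*)\))?\]")


def diagram_length(diagram, widths=MOTIF_WIDTHS):
    """Return the sequence length implied by a motif diagram string."""
    total = 0
    for token in diagram.split("_"):
        match = OCCURRENCE.fullmatch(token)
        if match:
            # A motif occurrence covers as many letters as the motif is wide.
            total += widths[int(match.group(2))]
        else:
            # Anything else is a spacer given as a plain integer.
            total += int(token)
    return total


# Diagrams and sequence lengths copied from this report.
checks = {
    "12_[+1(3.82e-06)]_3": 28,  # up_M
    "11_[+1(2.28e-07)]_3": 27,  # up_NS7c
    "6_[+2]_18": 29,            # up_NS7a, MEME-2 block diagram
    "27": 27,                   # up_S, no site below the p-value cutoff
}
for diagram, expected in checks.items():
    print(diagram, diagram_length(diagram), expected)
```

A mismatch between the implied and listed length would suggest that a diagram or a sequence length was damaged when the report was copied.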