********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 5.1.1 (Release date: Wed Jan 29 15:00:42 2020 -0800)

For further information on how to interpret please access http://meme-suite.org/.
To get a copy of the MEME software please access http://meme-suite.org.

********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
PRIMARY SEQUENCES= NC_022103.1_US100.fasta
CONTROL SEQUENCES= --none--
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
NC_022103.1_US79_PP1ab&P 1.0000     79  NC_022103.1_US100_S_g2   1.0000    100
NC_022103.1_US100_NS3_g3 1.0000    100  NC_022103.1_US100_E_g4   1.0000    100
NC_022103.1_US100_M_g5   1.0000    100  NC_022103.1_US100_N_g6   1.0000    100
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a problem
with the MEME software.

command: meme NC_022103.1_US100.fasta -dna -oc . -nostatus -time 18000 -mod zoops -nmotifs 3 -minw 6 -maxw 50 -objfun classic -markov_order 0

model:  mod=         zoops    nmotifs=         3    evt=           inf
objective function:
            em=       E-value of product of p-values
            starts=   E-value of product of p-values
strands: +
width:  minw=            6    maxw=           50
nsites: minsites=        2    maxsites=        6    wnsites=       0.8
theta:  spmap=         uni    spfuzz=        0.5
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
trim:   wg=             11    ws=              1    endgaps=       yes
data:   n=             579    N=               6
sample: seed=            0    hsfrac=          0
        searchsize=    579    norand=         no    csites=       1000
Letter frequencies in dataset:
A 0.271 C 0.168 G 0.223 T 0.339
Background letter frequencies (from file dataset with add-one prior applied):
A 0.271 C 0.168 G 0.223 T 0.339

Background model order: 0
********************************************************************************


********************************************************************************
MOTIF AACYMDACRAA MEME-1  width =  11  sites =   5  llr = 53  E-value = 1.2e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif AACYMDACRAA MEME-1 Description
--------------------------------------------------------------------------------
Simplified        A  88::64a:4a8
pos.-specific     C  ::844::a::2
probability       G  2:2::4::6::
matrix            T  :2:6:2:::::

         bits    2.6        *
                 2.3        *
                 2.1        *
                 1.8   *   ** *
Relative         1.5   *   ** *
Entropy          1.3 * * * ** **
(15.4 bits)      1.0 ***** *****
                 0.8 ***** *****
                 0.5 ***********
                 0.3 ***********
                 0.0 -----------

Multilevel           AACTAAACGAA
consensus            GTGCCG  A C
sequence                  T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif AACYMDACRAA MEME-1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value            Site
-------------             ----- ---------            -----------
NC_022103.1_US100_E_g4       83  6.40e-07 ACAGATGTTC AACTCGACGAA CTTGAAT
NC_022103.1_US100_N_g6       90  2.33e-06 ATTTAGTATA AACTAAACAAA
NC_022103.1_US100_NS3_g3     64  2.86e-06 CCTAGATTAC AACCATACGAA GCTATTGAAA
NC_022103.1_US79_PP1ab&P     61  2.11e-05 GTCACACTTG AAGCCGACAAC TGCTCAGT
NC_022103.1_US100_M_g5       89  2.18e-05 TATTATTGAT GTCTAAACGAA A
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif AACYMDACRAA MEME-1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
NC_022103.1_US100_E_g4            6.4e-07  82_[+1]_7
NC_022103.1_US100_N_g6            2.3e-06  89_[+1]
NC_022103.1_US100_NS3_g3          2.9e-06  63_[+1]_26
NC_022103.1_US79_PP1ab&P          2.1e-05  60_[+1]_8
NC_022103.1_US100_M_g5            2.2e-05  88_[+1]_1
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif AACYMDACRAA MEME-1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF AACYMDACRAA width=11 seqs=5
NC_022103.1_US100_E_g4   (   83) AACTCGACGAA  1
NC_022103.1_US100_N_g6   (   90) AACTAAACAAA  1
NC_022103.1_US100_NS3_g3 (   64) AACCATACGAA  1
NC_022103.1_US79_PP1ab&P (   61) AAGCCGACAAC  1
NC_022103.1_US100_M_g5   (   89) GTCTAAACGAA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif AACYMDACRAA MEME-1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 11 n= 519 bayes= 6.93748 E= 1.2e+001
   156  -897   -16  -897
   156  -897  -897   -76
  -897   225   -16  -897
  -897   125  -897    82
   114   125  -897  -897
    56  -897    84   -76
   188  -897  -897  -897
  -897   258  -897  -897
    56  -897   143  -897
   188  -897  -897  -897
   156    26  -897  -897
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif AACYMDACRAA MEME-1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 11 nsites= 5 E= 1.2e+001
 0.800000  0.000000  0.200000  0.000000
 0.800000  0.000000  0.000000  0.200000
 0.000000  0.800000  0.200000  0.000000
 0.000000  0.400000  0.000000  0.600000
 0.600000  0.400000  0.000000  0.000000
 0.400000  0.000000  0.400000  0.200000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.400000  0.000000  0.600000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.800000  0.200000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif AACYMDACRAA MEME-1 regular expression
--------------------------------------------------------------------------------
[AG][AT][CG][TC][AC][AGT]AC[GA]A[AC]
--------------------------------------------------------------------------------

Time 0.14 secs.

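Note (illustrative, not part of the MEME output): the log-odds entries for
MEME-1 appear to equal 100 * log2(p / f), rounded, where p is the matching
entry of the letter-probability matrix and f is the background frequency from
the COMMAND LINE SUMMARY. The Python sketch below checks that reading on the
first matrix row. The function name and the None return for zero probabilities
are assumptions made for this example; MEME itself prints a large negative
placeholder (-897 here), and its exact values can differ by one unit (for
example 258 where this formula gives 257), presumably because it works with
unrounded frequencies and pseudocounts.

```python
import math

# Background frequencies as printed in the COMMAND LINE SUMMARY section.
BACKGROUND = {"A": 0.271, "C": 0.168, "G": 0.223, "T": 0.339}


def log_odds_score(prob, letter):
    """Return round(100 * log2(prob / background)), or None when prob is 0.

    Hypothetical helper for illustration; not MEME's own code.
    """
    if prob == 0.0:
        return None  # MEME prints a large negative placeholder instead
    return round(100 * math.log2(prob / BACKGROUND[letter]))


# First row of the MEME-1 letter-probability matrix (order A, C, G, T).
first_row = dict(zip("ACGT", [0.8, 0.0, 0.2, 0.0]))
scores = {letter: log_odds_score(p, letter) for letter, p in first_row.items()}
print(scores)
```

The A and G values can be compared with the first row of the MEME-1 log-odds
matrix (156 and -16).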
********************************************************************************


********************************************************************************
MOTIF TGYTDSKABTGKAAATTBDHDTKDTAVAGDTBC MEME-2  width =  33  sites =   4  llr = 101  E-value = 2.5e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif TGYTDSKABTGKAAATTBDHDTKDTAVAGDTBC MEME-2 Description
--------------------------------------------------------------------------------
Simplified        A  ::::5::8::::a88:::355::3:a5a:3:::
pos.-specific     C  ::53:53:3::::33::5:3::::::3::::58
probability       G  :a::353:5:85:::3333:3:53::3:a5:3:
matrix            T  a:583:533a35:::883533a55a::::3a33

         bits    2.6
                 2.3
                 2.1  *                          *
                 1.8  *          *            * **
Relative         1.5 **       *  *        *  ** ** * *
Entropy          1.3 **   *   ** ***      *  ** ** * *
(36.4 bits)      1.0 **** * * ** *****    *  ** ** * *
                 0.8 **** * * *********   ** ** ** ***
                 0.5 **** ************* * ** *********
                 0.3 *********************************
                 0.0 ---------------------------------

Multilevel           TGCTACTAGTGGAAATTCTAATGTTAAAGGTCC
consensus              TCGGCTC TT CCGGGACG TA  C  A GT
sequence                 T G T        TGTT  G  G  T T

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TGYTDSKABTGKAAATTBDHDTKDTAVAGDTBC MEME-2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                       Site
-------------             ----- ---------            ---------------------------------
NC_022103.1_US100_N_g6       27  8.37e-15 ATGGCGACTA TGCTACTACTGGACATTCAAATGATAAAGTTCC AGATAGTGAA
NC_022103.1_US100_S_g2       26  2.17e-12 TTGTTAGCCT TGTTAGGAGTGGAAAGTTGCTTGTTAGAGGTTC AGGCCCATTA
NC_022103.1_US100_E_g4       48  7.64e-12 GGCAGCAAGT TGTCGGCATTGTAAATTCTAATTTTACAGATGT TCAACTCGAC
NC_022103.1_US100_NS3_g3     23  1.30e-11 GTGGCTGTTG TGCTTCTTGTTTAACTGGTTGTTGTAAAGGTCC TAGATTACAA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TGYTDSKABTGKAAATTBDHDTKDTAVAGDTBC MEME-2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
NC_022103.1_US100_N_g6            8.4e-15  26_[+2]_41
NC_022103.1_US100_S_g2            2.2e-12  25_[+2]_42
NC_022103.1_US100_E_g4            7.6e-12  47_[+2]_20
NC_022103.1_US100_NS3_g3          1.3e-11  22_[+2]_45
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TGYTDSKABTGKAAATTBDHDTKDTAVAGDTBC MEME-2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF TGYTDSKABTGKAAATTBDHDTKDTAVAGDTBC width=33 seqs=4
NC_022103.1_US100_N_g6   (   27) TGCTACTACTGGACATTCAAATGATAAAGTTCC  1
NC_022103.1_US100_S_g2   (   26) TGTTAGGAGTGGAAAGTTGCTTGTTAGAGGTTC  1
NC_022103.1_US100_E_g4   (   48) TGTCGGCATTGTAAATTCTAATTTTACAGATGT  1
NC_022103.1_US100_NS3_g3 (   23) TGCTTCTTGTTTAACTGGTTGTTGTAAAGGTCC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TGYTDSKABTGKAAATTBDHDTKDTAVAGDTBC MEME-2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 33 n= 387 bayes= 6.5812 E= 2.5e+001
  -865  -865  -865   156
  -865  -865   216  -865
  -865   158  -865    56
  -865    58  -865   115
    88  -865    17   -44
  -865   158   116  -865
  -865    58    17    56
   147  -865  -865   -44
  -865    58   116   -44
  -865  -865  -865   156
  -865  -865   175   -44
  -865  -865   116    56
   188  -865  -865  -865
   147    58  -865  -865
   147    58  -865  -865
  -865  -865    17   115
  -865  -865    17   115
  -865   158    17   -44
   -12  -865    17    56
    88    58  -865   -44
    88  -865    17   -44
  -865  -865  -865   156
  -865  -865   116    56
   -12  -865    17    56
  -865  -865  -865   156
   188  -865  -865  -865
    88    58    17  -865
   188  -865  -865  -865
  -865  -865   216  -865
   -12  -865   116   -44
  -865  -865  -865   156
  -865   158    17   -44
  -865   216  -865   -44
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TGYTDSKABTGKAAATTBDHDTKDTAVAGDTBC MEME-2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 33 nsites= 4 E= 2.5e+001
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.500000  0.000000  0.500000
 0.000000  0.250000  0.000000  0.750000
 0.500000  0.000000  0.250000  0.250000
 0.000000  0.500000  0.500000  0.000000
 0.000000  0.250000  0.250000  0.500000
 0.750000  0.000000  0.000000  0.250000
 0.000000  0.250000  0.500000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.750000  0.250000
 0.000000  0.000000  0.500000  0.500000
 1.000000  0.000000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.750000  0.250000  0.000000  0.000000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.000000  0.250000  0.750000
 0.000000  0.500000  0.250000  0.250000
 0.250000  0.000000  0.250000  0.500000
 0.500000  0.250000  0.000000  0.250000
 0.500000  0.000000  0.250000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  0.500000  0.500000
 0.250000  0.000000  0.250000  0.500000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.500000  0.250000  0.250000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 0.250000  0.000000  0.500000  0.250000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.250000  0.250000
 0.000000  0.750000  0.000000  0.250000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TGYTDSKABTGKAAATTBDHDTKDTAVAGDTBC MEME-2 regular expression
--------------------------------------------------------------------------------
TG[CT][TC][AGT][CG][TCG][AT][GCT]T[GT][GT]A[AC][AC][TG][TG][CGT][TAG][ACT][AGT]T[GT][TAG]TA[ACG]AG[GAT]T[CGT][CT]
--------------------------------------------------------------------------------

Time 0.26 secs.

********************************************************************************


********************************************************************************
MOTIF CCCTYMCTA MEME-3  width =   9  sites =   2  llr = 24  E-value = 2.3e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif CCCTYMCTA MEME-3 Description
--------------------------------------------------------------------------------
Simplified        A  :::::5::a
pos.-specific     C  aaa:55a::
probability       G  :::::::::
matrix            T  :::a5::a:

         bits    2.6 ***   *
                 2.3 ***   *
                 2.1 ***   *
                 1.8 ***   * *
Relative         1.5 ****  ***
Entropy          1.3 **** ****
(17.6 bits)      1.0 *********
                 0.8 *********
                 0.5 *********
                 0.3 *********
                 0.0 ---------

Multilevel           CCCTCACTA
consensus                TC
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CCCTYMCTA MEME-3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            ---------
NC_022103.1_US79_PP1ab&P      7  1.82e-06     GTGGAT CCCTCACTA GTTCCGTCTG
NC_022103.1_US100_M_g5       72  3.22e-06 CAAATAGAAC CCCTTCCTA TTATTGATGT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CCCTYMCTA MEME-3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
NC_022103.1_US79_PP1ab&P          1.8e-06  6_[+3]_64
NC_022103.1_US100_M_g5            3.2e-06  71_[+3]_20
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CCCTYMCTA MEME-3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF CCCTYMCTA width=9 seqs=2
NC_022103.1_US79_PP1ab&P (    7) CCCTCACTA  1
NC_022103.1_US100_M_g5   (   72) CCCTTCCTA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CCCTYMCTA MEME-3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 9 n= 531 bayes= 8.04712 E= 2.3e+002
  -765   257  -765  -765
  -765   257  -765  -765
  -765   257  -765  -765
  -765  -765  -765   156
  -765   157  -765    56
    88   157  -765  -765
  -765   257  -765  -765
  -765  -765  -765   156
   188  -765  -765  -765
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CCCTYMCTA MEME-3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 9 nsites= 2 E= 2.3e+002
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.500000  0.000000  0.500000
 0.500000  0.500000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif CCCTYMCTA MEME-3 regular expression
--------------------------------------------------------------------------------
CCCT[CT][AC]CTA
--------------------------------------------------------------------------------

Time 0.33 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
NC_022103.1_US79_PP1ab&P         4.13e-06  6_[+3(1.82e-06)]_45_[+1(2.11e-05)]_\
    8
NC_022103.1_US100_S_g2           1.81e-08  25_[+2(2.17e-12)]_42
NC_022103.1_US100_NS3_g3         5.48e-11  22_[+2(1.30e-11)]_8_[+1(2.86e-06)]_\
    26
NC_022103.1_US100_E_g4           6.55e-12  47_[+2(7.64e-12)]_2_[+1(6.40e-07)]_\
    7
NC_022103.1_US100_M_g5           1.97e-05  71_[+3(3.22e-06)]_8_[+1(2.18e-05)]_\
    1
NC_022103.1_US100_N_g6           6.55e-14  26_[+2(8.37e-15)]_30_[+1(2.33e-06)]
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because requested number of motifs (3) found.
********************************************************************************

CPU: ip-172-31-3-161

********************************************************************************
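
Note (illustrative, not part of the MEME output): a combined block diagram can
be sanity-checked by adding its spacer lengths to the widths of the motifs it
contains; the total should equal the sequence length listed in the TRAINING SET
section. The Python sketch below does this for two of the diagrams in the
summary. The function name and the MOTIF_WIDTHS table are assumptions for the
example (widths copied from the MOTIF header lines), and the "_\" line
continuations are assumed to have been joined into one string beforehand.

```python
import re

# Motif widths copied from the MOTIF header lines (MEME-1, MEME-2, MEME-3).
MOTIF_WIDTHS = {1: 11, 2: 33, 3: 9}

# Matches one motif token such as "[+3(1.82e-06)]" and captures the motif index.
MOTIF_TOKEN = re.compile(r"\[[+-](\d+)\([^)]*\)\]")


def diagram_length(diagram):
    """Sum spacer lengths and motif widths in a combined block diagram.

    Hypothetical helper for illustration; expects a diagram on a single line.
    """
    total = 0
    for token in diagram.split("_"):
        match = MOTIF_TOKEN.fullmatch(token)
        if match:
            total += MOTIF_WIDTHS[int(match.group(1))]
        else:
            total += int(token)  # plain number: spacer length in bases
    return total


pp1ab_length = diagram_length("6_[+3(1.82e-06)]_45_[+1(2.11e-05)]_8")
e_g4_length = diagram_length("47_[+2(7.64e-12)]_2_[+1(6.40e-07)]_7")
print(pp1ab_length, e_g4_length)
```

The two totals can be compared with the lengths 79 and 100 listed for
NC_022103.1_US79_PP1ab&P and NC_022103.1_US100_E_g4 in the TRAINING SET table.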