********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 5.3.3 (Release date: Sun Feb 7 15:39:52 2021 -0800)

For further information on how to interpret these results please access https://meme-suite.org/meme.
To get a copy of the MEME Suite software please access https://meme-suite.org.

********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
PRIMARY SEQUENCES= upstreams.fasta
CONTROL SEQUENCES= --none--
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
ORF1ab_upstream|[:264]-f 1.0000    265  S_upstream|[21461:21561] 1.0000    101
ORF3a_upstream|[25291:25 1.0000    101  E_upstream|[26143:26243] 1.0000    101
M_upstream|[26421:26521] 1.0000    101  ORF6_upstream|[27100:272 1.0000    101
ORF7a_upstream|[27292:27 1.0000    101  ORF7b_upstream|[27654:27 1.0000    101
ORF8_upstream|[27792:278 1.0000    101  N_upstream|[28172:28272] 1.0000    101
ORF10_upstream|[29456:29 1.0000    101
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme upstreams.fasta -dna -oc . -nostatus -time 14400 -mod zoops -nmotifs 3 -minw 6 -maxw 10 -objfun classic -bfile BM_SARS-CoV-2.markov

model:  mod=         zoops    nmotifs=         3    evt=           inf
objective function:           em=       E-value of product of p-values
                              starts=   E-value of product of p-values
strands: +
width:  minw=            6    maxw=           10
nsites: minsites=        2    maxsites=       11    wnsites=       0.8
theta:  spmap=         uni    spfuzz=        0.5
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
trim:   wg=             11    ws=              1    endgaps=       yes
data:   n=            1275    N=              11
sample: seed=            0    hsfrac=          0
        searchsize=   1275    norand=         no    csites=       1000
Letter frequencies in dataset:
A 0.305 C 0.197 G 0.181 T 0.317
Background letter frequencies (from file BM_SARS-CoV-2.markov):
A 0.299 C 0.184 G 0.196 T 0.321

Background model order: 0
********************************************************************************


********************************************************************************
MOTIF TAAACGAAC MEME-1  width =   9  sites =   9  llr = 94  E-value = 1.0e-008
********************************************************************************
--------------------------------------------------------------------------------
        Motif TAAACGAAC MEME-1 Description
--------------------------------------------------------------------------------
Simplified        A  :aa9:19a:
pos.-specific     C  ::::9:1:9
probability       G  ::::19::1
matrix            T  a::1:::::

         bits    2.4
                 2.2
                 2.0     *   *
                 1.7 *** ** **
Relative         1.5 *** ** **
Entropy          1.2 *********
(15.0 bits)      1.0 *********
                 0.7 *********
                 0.5 *********
                 0.2 *********
                 0.0 ---------

Multilevel           TAAACGAAC
consensus
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TAAACGAAC MEME-1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value               Site
-------------             ----- ---------            ---------
N_upstream|[28172:28272]     85  5.09e-06 AGATTTCATC TAAACGAAC AAACTAAA
ORF8_upstream|[27792:278     93  5.09e-06 TTGTCACGCC TAAACGAAC
ORF7a_upstream|[27292:27     93  5.09e-06 GGAGATTGAT TAAACGAAC
M_upstream|[26421:26521]     49  5.09e-06 TCTTCTGGTC TAAACGAAC TAAATATTAT
ORF3a_upstream|[25291:25     91  5.09e-06 ACATTACACA TAAACGAAC TT
S_upstream|[21461:21561]     92  5.09e-06 TGTTAACAAC TAAACGAAC A
ORF1ab_upstream|[:264]-f     67  5.09e-06 ATCTGTTCTC TAAACGAAC TTTAAAATCT
E_upstream|[26143:26243]     25  5.88e-05 GTTAATCCAG TAATGGAAC CAATTTATGA
ORF6_upstream|[27100:272     39  1.15e-04 AACTATAAAT TAAACACAG ACCATTCCAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TAAACGAAC MEME-1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
N_upstream|[28172:28272]          5.1e-06  84_[+1]_8
ORF8_upstream|[27792:278          5.1e-06  92_[+1]
ORF7a_upstream|[27292:27          5.1e-06  92_[+1]
M_upstream|[26421:26521]          5.1e-06  48_[+1]_44
ORF3a_upstream|[25291:25          5.1e-06  90_[+1]_2
S_upstream|[21461:21561]          5.1e-06  91_[+1]_1
ORF1ab_upstream|[:264]-f          5.1e-06  66_[+1]_190
E_upstream|[26143:26243]          5.9e-05  24_[+1]_68
ORF6_upstream|[27100:272          0.00012  38_[+1]_54
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TAAACGAAC MEME-1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF TAAACGAAC width=9 seqs=9
N_upstream|[28172:28272] (   85) TAAACGAAC  1
ORF8_upstream|[27792:278 (   93) TAAACGAAC  1
ORF7a_upstream|[27292:27 (   93) TAAACGAAC  1
M_upstream|[26421:26521] (   49) TAAACGAAC  1
ORF3a_upstream|[25291:25 (   91) TAAACGAAC  1
S_upstream|[21461:21561] (   92) TAAACGAAC  1
ORF1ab_upstream|[:264]-f (   67) TAAACGAAC  1
E_upstream|[26143:26243] (   25) TAATGGAAC  1
ORF6_upstream|[27100:272 (   39) TAAACACAG  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TAAACGAAC MEME-1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 9 n= 1187 bayes= 6.79417 E= 1.0e-008
  -982   -982   -982    164
   174   -982   -982   -982
   174   -982   -982   -982
   157   -982   -982   -153
  -982    227    -82   -982
  -143   -982    218   -982
   157    -72   -982   -982
   174   -982   -982   -982
  -982    227    -82   -982
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TAAACGAAC MEME-1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 9 nsites= 9 E= 1.0e-008
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.888889  0.000000  0.000000  0.111111
 0.000000  0.888889  0.111111  0.000000
 0.111111  0.000000  0.888889  0.000000
 0.888889  0.111111  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.888889  0.111111  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif TAAACGAAC MEME-1 regular expression
--------------------------------------------------------------------------------
TAAACGAAC
--------------------------------------------------------------------------------




Time  0.29 secs.

********************************************************************************


********************************************************************************
MOTIF GCTGCA MEME-2  width =   6  sites =   6  llr = 49  E-value = 2.7e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif GCTGCA MEME-2 Description
--------------------------------------------------------------------------------
Simplified        A  2::::a
pos.-specific     C  :a::a:
probability       G  8:2a::
matrix            T  ::8:::

         bits    2.4  * **
                 2.2  * **
                 2.0  * **
                 1.7 ** ***
Relative         1.5 ** ***
Entropy          1.2 ******
(11.7 bits)      1.0 ******
                 0.7 ******
                 0.5 ******
                 0.2 ******
                 0.0 ------

Multilevel           GCTGCA
consensus
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCTGCA MEME-2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value             Site
-------------             ----- ---------            ------
ORF10_upstream|[29456:29      6  1.25e-04      TTCCT GCTGCA GATTTGGATG
ORF6_upstream|[27100:272      2  1.25e-04          T GCTGCA TACAGTCGCT
ORF3a_upstream|[25291:25     29  1.25e-04 TGTGGATCCT GCTGCA AATTTGATGA
ORF1ab_upstream|[:264]-f    102  1.25e-04 CTGTCACTCG GCTGCA TGCTTAGTGC
ORF7b_upstream|[27654:27     52  2.01e-04 TCTTATTGTT GCGGCA ATAGTGTTTA
ORF8_upstream|[27792:278     62  3.91e-04 TCTCACTTGA ACTGCA AGATCATAAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCTGCA MEME-2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ORF10_upstream|[29456:29          0.00012  5_[+2]_90
ORF6_upstream|[27100:272          0.00012  1_[+2]_94
ORF3a_upstream|[25291:25          0.00012  28_[+2]_67
ORF1ab_upstream|[:264]-f          0.00012  101_[+2]_158
ORF7b_upstream|[27654:27           0.0002  51_[+2]_44
ORF8_upstream|[27792:278          0.00039  61_[+2]_34
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCTGCA MEME-2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF GCTGCA width=6 seqs=6
ORF10_upstream|[29456:29 (    6) GCTGCA  1
ORF6_upstream|[27100:272 (    2) GCTGCA  1
ORF3a_upstream|[25291:25 (   29) GCTGCA  1
ORF1ab_upstream|[:264]-f (  102) GCTGCA  1
ORF7b_upstream|[27654:27 (   52) GCGGCA  1
ORF8_upstream|[27792:278 (   62) ACTGCA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCTGCA MEME-2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 6 n= 1220 bayes= 8.10995 E= 2.7e+001
   -84   -923    209   -923
  -923    244   -923   -923
  -923   -923    -23    138
  -923   -923    235   -923
  -923    244   -923   -923
   174   -923   -923   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCTGCA MEME-2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 6 nsites= 6 E= 2.7e+001
 0.166667  0.000000  0.833333  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.000000  0.000000  0.166667  0.833333
 0.000000  0.000000  1.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif GCTGCA MEME-2 regular expression
--------------------------------------------------------------------------------
GCTGCA
--------------------------------------------------------------------------------




Time  0.57 secs.

********************************************************************************


********************************************************************************
MOTIF ATGAAGAC MEME-3  width =   8  sites =   6  llr = 52  E-value = 3.2e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif ATGAAGAC MEME-3 Description
--------------------------------------------------------------------------------
Simplified        A  a::78:a:
pos.-specific     C  :2:22::8
probability       G  :2a::a:2
matrix            T  :7:2::::

         bits    2.4   *  *
                 2.2   *  *
                 2.0   *  *
                 1.7 * *  ***
Relative         1.5 * *  ***
Entropy          1.2 * * ****
(12.4 bits)      1.0 * * ****
                 0.7 *** ****
                 0.5 ********
                 0.2 ********
                 0.0 --------

Multilevel           ATGAAGAC
consensus
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif ATGAAGAC MEME-3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value              Site
-------------             ----- ---------            --------
N_upstream|[28172:28272]     35  1.81e-05 TGTTCGTTCT ATGAAGAC TTTTTAGAGT
ORF3a_upstream|[25291:25     41  1.81e-05 TGCAAATTTG ATGAAGAC GACTCTGAGC
ORF10_upstream|[29456:29     82  5.08e-05 GCCTAAACTC ATGCAGAC CACACAAGGC
ORF7a_upstream|[27292:27     67  1.01e-04 TCTCAATTAG ATGAAGAG CAACCAATGG
E_upstream|[26143:26243]     51  1.27e-04 TGATGAACCG ACGACGAC TACTAGCGTG
S_upstream|[21461:21561]     28  1.57e-04 TTCTTAGTAA AGGTAGAC TTATAATTAG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif ATGAAGAC MEME-3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
N_upstream|[28172:28272]          1.8e-05  34_[+3]_59
ORF3a_upstream|[25291:25          1.8e-05  40_[+3]_53
ORF10_upstream|[29456:29          5.1e-05  81_[+3]_12
ORF7a_upstream|[27292:27           0.0001  66_[+3]_27
E_upstream|[26143:26243]          0.00013  50_[+3]_43
S_upstream|[21461:21561]          0.00016  27_[+3]_66
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif ATGAAGAC MEME-3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF ATGAAGAC width=8 seqs=6
N_upstream|[28172:28272] (   35) ATGAAGAC  1
ORF3a_upstream|[25291:25 (   41) ATGAAGAC  1
ORF10_upstream|[29456:29 (   82) ATGCAGAC  1
ORF7a_upstream|[27292:27 (   67) ATGAAGAG  1
E_upstream|[26143:26243] (   51) ACGACGAC  1
S_upstream|[21461:21561] (   28) AGGTAGAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif ATGAAGAC MEME-3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 8 n= 1198 bayes= 8.0836 E= 3.2e+002
   174   -923   -923   -923
  -923    -14    -23    105
  -923   -923    235   -923
   115    -14   -923    -94
   148    -14   -923   -923
  -923   -923    235   -923
   174   -923   -923   -923
  -923    218    -23   -923
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif ATGAAGAC MEME-3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 8 nsites= 6 E= 3.2e+002
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.166667  0.166667  0.666667
 0.000000  0.000000  1.000000  0.000000
 0.666667  0.166667  0.000000  0.166667
 0.833333  0.166667  0.000000  0.000000
 0.000000  0.000000  1.000000  0.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  0.833333  0.166667  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif ATGAAGAC MEME-3 regular expression
--------------------------------------------------------------------------------
ATGAAGAC
--------------------------------------------------------------------------------




Time  0.84 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
ORF1ab_upstream|[:264]-f         2.44e-03  66_[+1(5.09e-06)]_190
S_upstream|[21461:21561]         5.69e-04  91_[+1(5.09e-06)]_1
ORF3a_upstream|[25291:25         1.82e-06  40_[+3(1.81e-05)]_42_[+1(5.09e-06)]_\
    2
E_upstream|[26143:26243]         3.12e-03  24_[+1(5.88e-05)]_68
M_upstream|[26421:26521]         1.58e-02  48_[+1(5.09e-06)]_44
ORF6_upstream|[27100:272         1.85e-03  101
ORF7a_upstream|[27292:27         3.76e-04  92_[+1(5.09e-06)]
ORF7b_upstream|[27654:27         1.80e-02  101
ORF8_upstream|[27792:278         1.22e-03  92_[+1(5.09e-06)]
N_upstream|[28172:28272]         9.08e-05  34_[+3(1.81e-05)]_42_[+1(5.09e-06)]_\
    8
ORF10_upstream|[29456:29         1.55e-03  81_[+3(5.08e-05)]_12
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because requested number of motifs (3) found.
********************************************************************************

CPU: noble-meme.grid.gs.washington.edu

********************************************************************************