MAST - Motif Alignment and Search Tool
MAST version 4.3.0 (Release date: Sat Sep 26 01:51:56 PDT 2009)
For further information on how to interpret these results or to get
a copy of the MAST software please access http://meme.nbcr.net.
REFERENCE
If you use this program in your research, please cite:
Timothy L. Bailey and Michael Gribskov,
"Combining evidence using p-values: application to sequence homology
searches", Bioinformatics, 14(48-54), 1998.
DATABASE AND MOTIFS
DATABASE ./seed.fasta (peptide)
Last updated on Tue May 1 16:58:21 2012
Database contains 9 sequences, 3824 residues
MOTIFS ./memeout.txt (peptide)
MOTIF WIDTH BEST POSSIBLE MATCH
----- ----- -------------------
1 27 HQSMPLHDLLKPMMKKSNNMHAEMLFK
2 29 HDKSPGWPWNDMYMCFNAPPAAANIDNNC
3 41 YLRFKGDPTLKRQDFYNMAAELKHSGVKQINGNLYIDTTWF
PAIRWISE MOTIF CORRELATIONS:
MOTIF 1 2
----- ----- -----
2 0.10
3 0.12 0.10
No overly similar pairs (correlation > 0.60) found.
Random model letter frequencies (from non-redundant database):
A 0.073 C 0.018 D 0.052 E 0.062 F 0.040 G 0.069 H 0.022 I 0.056 K 0.058
L 0.092 M 0.023 N 0.046 P 0.051 Q 0.041 R 0.052 S 0.074 T 0.059 V 0.064
W 0.013 Y 0.033
SECTION I: HIGH-SCORING SEQUENCES
- Each of the following 9 sequences has E-value less than 10.
- The E-value of a sequence is the expected number of sequences
in a random database of the same size that would match the motifs as
well as the sequence does and is equal to the combined p-value of the
sequence times the number of sequences in the database.
- The combined p-value of a sequence measures the strength of the
match of the sequence to all the motifs and is calculated by
- finding the score of the single best match of each motif
to the sequence (best matches may overlap),
- calculating the sequence p-value of each score,
- forming the product of the p-values,
- taking the p-value of the product.
- The sequence p-value of a score is defined as the
probability of a random sequence of the same length containing
some match with as good or better a score.
- The score for the match of a position in a sequence to a motif
is computed by summing the appropriate entry from each column of
the position-dependent scoring matrix that represents the motif.
- Sequences shorter than one or more of the motifs are skipped.
- The table is sorted by increasing E-value.
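The combined p-value and E-value defined above can be illustrated with a short Python sketch. This is not MAST's own code, and the function names are hypothetical. It rests on two assumptions: the sequence p-value is approximated from the best position p-value p as 1 - (1 - p)^n, with n = L - w + 1 possible start positions treated as independent, and the p-value of a product x of k independent uniform p-values is taken as x * sum over i < k of (-ln x)^i / i!. The inputs are the rounded position p-values printed for DACB_ECOLI/19-469 in Section III, so the result should only approximately match the reported combined p-value of 3.95e-87.

```python
import math


def sequence_pvalue(position_p, seq_len, motif_width):
    """Approximate p-value of the best match anywhere in the sequence.

    Assumption: the n = L - w + 1 start positions are independent.
    """
    n = seq_len - motif_width + 1
    # 1 - (1 - p)^n, written with log1p/expm1 so tiny p-values survive.
    return -math.expm1(n * math.log1p(-position_p))


def pvalue_of_product(x, k):
    """P(product of k independent uniform(0,1) values <= x)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    neg_log = -math.log(x)
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= neg_log / i  # term is now (-ln x)^i / i!
        total += term
    return min(1.0, x * total)


def combined_pvalue(position_ps, widths, seq_len):
    """Multiply the per-motif sequence p-values, then take the p-value of the product."""
    product = 1.0
    for p, w in zip(position_ps, widths):
        product *= sequence_pvalue(p, seq_len, w)
    return pvalue_of_product(product, len(position_ps))


def e_value(combined_p, n_sequences):
    """E-value = combined p-value times the number of sequences in the database."""
    return combined_p * n_sequences


# DACB_ECOLI/19-469: motifs 1, 2, 3 with widths 27, 29, 41; length 451.
approx = combined_pvalue([3.0e-27, 3.3e-35, 2.5e-38], [27, 29, 41], 451)
print(f"approximate combined p-value: {approx:.2e}")
print(f"E-value from reported p-value: {e_value(3.95e-87, 9):.1e}")  # 3.6e-86
```

The last line reproduces the E-value in the table: 3.95e-87 times 9 sequences rounds to 3.6e-86.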
Sequence Name | Description | E-value | Length
---|---|---|---
DACB_ECOLI/19-469 | | 3.6e-86 | 451
DACB_HAEIN/26-472 | | 1.1e-84 | 447
DACC_BACSU/36-491 | | 1.2e-83 | 456
DAC_ACTSP/54-508 | | 1.2e-60 | 455
Q9Z541_STRCO/78-446 | | 1e-14 | 369
O85665_NEIGO/13-461 | | 3.8e-08 | 449
O06380_MYCTU/72-454 | | 1.6e-05 | 383
O69539_MYCLE/72-454 | | 2.1e-05 | 383
Q55728_SYNY3/46-476 | | 0.00052 | 431
SECTION II: MOTIF DIAGRAMS
- The ordering and spacing of all non-overlapping motif occurrences
are shown for each high-scoring sequence listed in Section I.
- A motif occurrence is defined as a position in the sequence whose
match to the motif has POSITION p-value less than 0.0001.
- The POSITION p-value of a match is the probability of
a single random subsequence of the length of the motif
scoring at least as well as the observed match.
- For each sequence, all motif occurrences are shown unless there
are overlaps. In that case, a motif occurrence is shown only if its
p-value is less than the product of the p-values of the other
(lower-numbered) motif occurrences that it overlaps.
- The table also shows the E-value of each sequence.
- Spacers and motif occurrences are indicated by
- -d-: a spacer of `d' residues between motif occurrences (or a sequence end),
- [n]: occurrence of motif `n' with p-value less than 0.0001.
- Sequences longer than 1000 are not shown to scale and are indicated by thicker lines.
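To make the spacer and occurrence notation concrete, here is a minimal Python sketch that turns a diagram string into motif coordinates. It is an illustration only: `parse_diagram` and `MOTIF_WIDTHS` are hypothetical names, the widths are copied from the MOTIFS table above, and the parser handles only the two token types seen in this report (a spacer length and `[n]`). The example string is the DIAGRAM line reported for DACB_ECOLI/19-469 in Section III.

```python
import re

# Motif widths as listed in the MOTIFS table of this report.
MOTIF_WIDTHS = {1: 27, 2: 29, 3: 41}


def parse_diagram(diagram, widths):
    """Return ([(motif, start, end), ...], total_length).

    Coordinates are 1-based and inclusive. Only spacer tokens (digits)
    and occurrence tokens of the form [n] are supported.
    """
    position = 0
    occurrences = []
    for token in diagram.split("-"):
        match = re.fullmatch(r"\[(\d+)\]", token)
        if match:
            motif = int(match.group(1))
            width = widths[motif]
            occurrences.append((motif, position + 1, position + width))
            position += width
        elif token.isdigit():
            position += int(token)  # spacer of that many residues
        else:
            raise ValueError(f"unsupported diagram token: {token!r}")
    return occurrences, position


occurrences, total = parse_diagram("83-[3]-2-[2]-116-[1]-153", MOTIF_WIDTHS)
print(occurrences)  # [(3, 84, 124), (2, 127, 155), (1, 272, 298)]
print(total)        # 451
```

Spacers and motif widths together account for every residue, so the total doubles as a consistency check against the sequence length (451 for DACB_ECOLI/19-469).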
Name | Expect | Motifs
---|---|---
DACB_ECOLI/19-469 | 3.6e-86 | 83-[3]-2-[2]-116-[1]-153
DACB_HAEIN/26-472 | 1.1e-84 | 83-[3]-2-[2]-113-[1]-152
DACC_BACSU/36-491 | 1.2e-83 | 85-[3]-2-[2]-119-[1]-153
DAC_ACTSP/54-508 | 1.2e-60 | 84-[3]-2-[2]-121-[1]-151
Q9Z541_STRCO/78-446 | 1e-14 | 75-[3]-73-[1]-153
O85665_NEIGO/13-461 | 3.8e-08 | 85-[3]-144-[1]-152
O06380_MYCTU/72-454 | 1.6e-05 | 131-[2]-47-[1]-149
O69539_MYCLE/72-454 | 2.1e-05 | 131-[2]-47-[1]-149
Q55728_SYNY3/46-476 | 0.00052 | 78-[3]-1-[2]-114-[1]-141
SCALE: residue positions 1 to 425, tick marks every 25 residues.
SECTION III: ANNOTATED SEQUENCES
- The positions and p-values of the non-overlapping motif occurrences
are shown above the actual sequence for each of the high-scoring
sequences from Section I.
- A motif occurrence is defined as a position in the sequence whose
match to the motif has POSITION p-value less than 0.0001 as
defined in Section II.
- For each sequence, the first line specifies the name of the sequence.
- The second (and possibly more) lines give a description of the
sequence.
- Following the description line(s) is a line giving the length,
combined p-value, and E-value of the sequence as defined in Section I.
- The next line reproduces the motif diagram from Section II.
- The entire sequence is printed on the following lines.
- Motif occurrences are indicated directly above their positions in the
sequence on lines showing
- the motif number of the occurrence,
- the position p-value of the occurrence,
- the best possible match to the motif, and
- columns whose match to the motif has a positive score (indicated
by a plus sign).
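The plus-sign annotation follows directly from the scoring rule in Section I: each motif column contributes one matrix entry, and columns with a positive entry are marked. The Python sketch below shows the idea under loud assumptions. The report does not include the scoring matrices, so `TOY_PSSM` is an invented three-column matrix over a three-letter alphabet, and `annotate` is a hypothetical helper rather than a MAST function.

```python
# Hypothetical position-dependent scoring matrix: one dict per motif column,
# mapping residue -> score. Values are invented for illustration.
TOY_PSSM = [
    {"A": 2, "C": -1, "D": -3},
    {"A": -2, "C": 4, "D": 0},
    {"A": -1, "C": -2, "D": 3},
]


def annotate(pssm, segment):
    """Score a segment against a PSSM and mark positive-scoring columns.

    Returns (total_score, marks), where marks has '+' under each column
    whose entry is positive and a space elsewhere.
    """
    if len(segment) != len(pssm):
        raise ValueError("segment length must equal motif width")
    scores = [column[residue] for column, residue in zip(pssm, segment)]
    marks = "".join("+" if score > 0 else " " for score in scores)
    return sum(scores), marks


print(annotate(TOY_PSSM, "ACD"))  # (9, '+++')
print(annotate(TOY_PSSM, "AAD"))  # (3, '+ +')
```

A fully conserved match yields an unbroken run of plus signs, as in the top-scoring sequences below, while weaker matches show gaps where a column scores zero or less.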
DACB_ECOLI/19-469
LENGTH = 451 COMBINED P-VALUE = 3.95e-87 E-VALUE = 3.6e-86
DIAGRAM: 83-[3]-2-[2]-116-[1]-153
[3] [2]
2.5e-38 3.3e-35
YLRFKGDPTLKRQDFYNMAAELKHSGVKQINGNLYIDTTWF HDKSPGWPWNDMYMCFNAPPAAAN
+++++++++++++++++++++++++++++++++++++++++ ++++++++++++++++++++++++
76 NGVLKGDLVARFGADPTLKRQDIRNMVATLKKSGVNQIDGNVLIDTSIFASHDKAPGWPWNDMTQCFSAPPAAAI
IDNNC
+++++
151 VDRNCFSVSLYSAPKPGDMAFIRVASYYPVTMFSQVRTLPRGSAEAQYCELDVVPGDLNRFTLTGCLPQRSEPLP
[1]
3.0e-27
HQSMPLHDLLKPMMKKSNNMHAEMLFK
+++++++++++++++++++++++++++
226 LAFAVQDGASYAGAILKDELKQAGITWSGTLLRQTQVNEPGTVVASKQSAPLHDLLKIMLKKSDNMIADTVFRMI
DACB_HAEIN/26-472
LENGTH = 447 COMBINED P-VALUE = 1.21e-85 E-VALUE = 1.1e-84
DIAGRAM: 83-[3]-2-[2]-113-[1]-152
[3] [2]
3.3e-37 8.4e-34
YLRFKGDPTLKRQDFYNMAAELKHSGVKQINGNLYIDTTWF HDKSPGWPWNDMYMCFNAPPAAAN
+++++++++++++++++++++++++++++++++++++++++ ++++++++++++++++++++++++
76 NGNLDGNLIVRFTGDPDLTRGQLYSLLAELKKQGIKKINGDLVLDTSVFSSHDRGLGWIWNDLTMCFNSPPAAAN
IDNNC
+++++
151 IDNNCFYAELDANKNPGEIVKINVPAQFPIQVFGQVYVADSNEAPYCQLDVVVHDNNRYQVKGCLARQYKPFGLS
[1]
2.8e-28
HQSMPLHDLLKPMMKKSNNMHAEMLFK
+++++++++++++++++++++++++++
226 FAVQNTDAYAAEIIQRQLRQLGIEFNGKVLLPQKPQQGQLLAKHLSKPLPDLLKKMMKKSDNQIADSLFRAVAFN
DACC_BACSU/36-491
LENGTH = 456 COMBINED P-VALUE = 1.36e-84 E-VALUE = 1.2e-83
DIAGRAM: 85-[3]-2-[2]-119-[1]-153
[3] [2]
5.0e-41 8.5e-29
YLRFKGDPTLKRQDFYNMAAELKHSGVKQINGNLYIDTTWF HDKSPGWPWNDMYMCFNAPPAA
+++++++++++++++++++++++++++++++++++++++++ ++++++++++++++++++++++
76 LKGKKLNGNLYLKGKGDPTLLPSDFDKMAEILKHSGVKVIKGNLIGDDTWHDDMRLSPDMPWSDEYTYYGAPISA
ANIDNNC
+++++++
151 LTASPNEDYDAGTVIVEVTPNQKEGEEPAVSVSPKTDYITIKNDAKTTAAGSEKDLTIEREHGTNTITIEGSVPV
[1]
2.0e-28
HQSMPLHDLLKPMMKKSNNMHAEM
++++++++++++++++++++++++
226 DANKTKEWISVWEPAGYALDLFKQSLKKQGITVKGDIKTGEAPSSSDVLLSHRSMPLSKLFVPFMKLSNNGHAEV
LFK
+++
301 LVKEMGKVKKGEGSWEKGLEVLNSTLPEFGVDSKSLVLRDGSGISHIDAVSSDQLSQLLYDIQDQSWFSAYLNSL
DAC_ACTSP/54-508
LENGTH = 455 COMBINED P-VALUE = 1.37e-61 E-VALUE = 1.2e-60
DIAGRAM: 84-[3]-2-[2]-121-[1]-151
[3] [2]
1.6e-37 4.0e-09
YLRFKGDPTLKRQDFYNMAAELKHSGVKQINGNLYIDTTWF HDKSPGWPWNDMYMCFNAPPAAA
+++++++++++++++++++++++++++++++++++++++++ ++ ++ ++ +++ ++++
76 GRRGEVQDLYLVGRGDPTLSAEDLDAMAAEVAASGVRTVRGDLYADDTWFDSERLVDDWWPEDEPYAYSAQISAL
NIDNNC
++ +
151 TVAHGERFDTGVTEVSVTPAAEGEPADVDLGAAEGYAELDNRAVTGAAGSANTLVIDRPVGTNTIAVTGSLPADA
[1]
2.5e-28
HQSMPLHDLLKPMMKKSNNMHAE
+++++++++++++++++++++++
226 APVTALRTVDEPAALAGHLFEEALESNGVTVKGDVGLGGVPADWQDAEVLADHTSAELSEILVPFMKFSNNGHAE
MLFK
++++
301 MLVKSIGQETAGAGTWDAGLVGVEEALSGLGVDTAGLVLNDGSGLSRGNLVTADTVVDLLGQAGSAPWAQTWSAS
Q9Z541_STRCO/78-446
LENGTH = 369 COMBINED P-VALUE = 1.13e-15 E-VALUE = 1e-14
DIAGRAM: 75-[3]-73-[1]-153
[3]
4.2e-06
YLRFKGDPTLKRQDFYNMAAELKHSGVKQINGNLYIDTTWF
++++++ +++ ++ ++ + + + ++
76 TLVGGGDRTLTGEDVAELARTAADGLRAAGRTDVQVRVDDSLFADPSLAEGWNEGYYPTEVAPVRSLVVDGAAVQ
[1]
3.7e-18
HQSMPLHDLLKPMMKKSNNMHAEMLFK
+ +++++++ +++++ +++ +++++ +
151 DTSIDAGKVFAKKLAAQGITVTGEVGRQTAKQSDVPVAQHKSAPLSDIVKKMLKTSDNNIAETLLRMTAVELGKP
O85665_NEIGO/13-461
LENGTH = 449 COMBINED P-VALUE = 4.21e-09 E-VALUE = 3.8e-08
DIAGRAM: 85-[3]-144-[1]-152
[3]
1.0e-06
YLRFKGDPTLKRQDFYNMAAELKHSGVKQINGNLYIDTTWF
+ + +++ + + + + +++ + + + ++ +
76 VNDGTLDGNLYWAGSGDPVFNQENLLAVQRQLRDKGIRNITGRLMLDHSLWGEVGSPDHFEADSGSPFMTPPNPT
[1]
5.1e-10
HQSMPLHDLLKPMMKKSNNMHAEMLFK
+++ +++ + + +++ ++ +++
226 VGVRMFALDELIRQTFTNRWLLGGGRISDGIGIADTPEGAHTLAVAHSKPMKEILTDMNKRSDNLIARSVFLKLG
O06380_MYCTU/72-454
LENGTH = 383 COMBINED P-VALUE = 1.81e-06 E-VALUE = 1.6e-05
DIAGRAM: 131-[2]-47-[1]-149
[2]
4.9e-06
HDKSPGWPWNDMYMCFNAP
++++ + ++
76 GPVVLVGAGDPTLSAAPPGQDTWYHGAARIGDLVEQIRRSGVTPTAVQVDASAFSGPTMAPGWDPADIDNGDIAP
[1]
4.0e-08
PAAANIDNNC HQSMPLHDLLKPMMKKSN
+ ++ ++ +++++ + ++ ++
151 IEAAMIDAGRIQPTTVNSRRSRTPALDAGRELAKALGLDPAAVTIASAPAGARQLAVVQSAPLIQRLSQMMNASD
NMHAEMLFK
+ ++ +
226 NVMAECIGREVAVAINRPQSFSGAVDAVTSRLNTAHIDTAGAALVDSSGLSLDNRLTARTLDATMQAAAGPDQPA
O69539_MYCLE/72-454
LENGTH = 383 COMBINED P-VALUE = 2.39e-06 E-VALUE = 2.1e-05
DIAGRAM: 131-[2]-47-[1]-149
[2]
3.2e-05
HDKSPGWPWNDMYMCFNAP
+ ++ + + ++
76 GPVVLVGAGDPTLSAASPDQSTWYRGAPRISDLVEQVRRSGVTPTAVQVDTSLFTGPTMAQGWDPADVDNGYTAP
[1]
4.0e-08
PAAANIDNNC HQSMPLHDLLKPMMKKSN
+ + ++ +++++ + ++ ++
151 IESAMIDAGRIQPTTVKSRRSRTPALDAGRELAKALGVAPDAVTIVKASSGARQLAVVQSAPLVQRLSEMMDNSD
NMHAEMLFK
+ ++ +
226 NVLAECIGREVAAAINRPLSFAGAVDAVTNRLGTAHIDTTGAALVDSSGLSVNNRLTAKTLGGAVQAAAGPDQPV
Q55728_SYNY3/46-476
LENGTH = 431 COMBINED P-VALUE = 5.81e-05 E-VALUE = 0.00052
DIAGRAM: 78-[3]-1-[2]-114-[1]-141
[3] [2]
2.4e-05 8.9e-05
YLRFKGDPTLKRQDFYNMAAELKHSGVKQINGNLYIDTTWF HDKSPGWPWNDMYMCFNAPPAAANIDNNC
+ +++ ++ + + ++ ++ + + ++ + + + ++ + + ++ +
76 KKVQVISSGDPSFDVDDLTAIAKGLKDRGVTAIEELELVDTIAPQDYQRPSWEWDDLHYGYAPPVNGAILTGNQV
[1]
3.6e-06
HQSMPLHDLLKPMMKKSNNMHAEMLFK
++ ++ + + +++ ++ +
226 QQYFLASLQQKLAEQGISVSTSIVSANTKAIAPAPLLALTSPPLWTLIKTVNQDSNNLYAEALLNAIQPPSQATD
Debugging Information
CPU: kodomo.fbb.msu.ru
Time 0.016001 secs.
mast ./memeout.txt -d ./seed.fasta -ev 10.000000 -mt 0.000100