The name of the sequence database file.
The number of sequences in the database.
The number of residues in the sequence database.
The date of the last modification to the sequence database.
The name of the motif. If the motif has been removed or removal is
recommended to avoid highly similar motifs then it will be displayed
in red text.
The width of the motif. No gaps are allowed in motifs supplied to MAST
as it only works for motifs of a fixed width.
The sequence that would achieve the best possible match score and its
reverse complement for nucleotide motifs.
MAST computes the pairwise correlations between each pair of motifs.
The correlation between two motifs is the maximum sum of Pearson's
correlation coefficients for aligned columns divided by the width of
the shorter motif. The maximum is found by trying all alignments of the
two motifs. Motifs with correlations below 0.60 have little effect on
the accuracy of the combined scores. Pairs of motifs with higher
correlations should be removed from the query. Correlations above the
supplied threshold are shown in red text.
This diagram shows the normal spacing of the motifs specified to MAST.
The name of the sequence. This may be linked to a search of a sequence database for the sequence name.
The E-value of the sequence. DNA only: if the strands were scored
separately, there will be two E-values for the sequence, separated
by a "/". The score for the provided sequence comes first and the
score for the reverse complement second.
The block diagram shows the best non-overlapping tiling of motif matches on the sequence.
- The length of the line shows the length of a sequence relative to all the other sequences.
- A block is shown where the positional p-value
of a motif is less (more significant) than the significance threshold which is 0.0001 by default.
- If a significant motif match (as specified above) overlaps other significant motif matches then
it is only displayed as a block if its positional p-value
is less (more significant) than the product of the positional
p-values of the significant matches that it overlaps.
- The position of a block shows where a motif has matched the sequence.
- The width of a block shows the width of the motif relative to the length of the sequence.
- The colour and border of a block identifies the matching motif as in the legend.
- The height of a block gives an indication of the significance of the match as
taller blocks are more significant. The height is calculated to be proportional
to the negative logarithm of the positional p-value,
truncated at the height for a p-value of 1e-10.
- Hovering the mouse cursor over the block causes the display of the motif name
and other details in the hovering text.
- DNA only; blocks displayed above the line are a match on the given DNA, whereas blocks
displayed below the line are matches to the reverse-complement of the given DNA.
- DNA only; when strands are scored separately then blocks may overlap on opposing strands.
The description appearing after the identifier in the fasta file used to specify the sequence.
The combined p-value of the
sequence. DNA only: if the strands were scored separately, there will be
two p-values for the sequence, separated by a "/". The score for
the provided sequence comes first and the score for the
reverse complement second.
This indicates the offset used for translation of the DNA.
The annotated sequence shows a portion of the sequence with the
matching motif sequences displayed above. The displayed portion of the
sequence can be modified by sliding the two buttons below the sequence
block diagram so that the portion you want to see is between the two
needles attached to the buttons. By default the two buttons move
together but you can drag one individually by holding shift before you
start the drag. If the strands were scored separately, they cannot
both be displayed at once because of overlaps, so a radio button offers
the choice of strand to display.
MAST
Motif Alignment & Search Tool
For further information on how to interpret these results or to get a
copy of the MEME software please access
http://meme.nbcr.net.
If you use MAST in your research, please cite the following paper:
Timothy L. Bailey and Michael Gribskov,
"Combining evidence using p-values: application to sequence homology searches",
Bioinformatics, 14(1):48-54, 1998.
Sequence Databases
The following sequence database was supplied to MAST.
Database | Sequence Count | Residue Count | Last Modified
SD_all_1.fasta | 4198 | 146930 | Wed May 27 02:28:20 2015
Total | 4198 | 146930 |
Motifs
The following motifs were supplied to MAST from "meme.txt" last modified on Wed May 27 02:28:20 2015.
Motif | Width | Best possible match (+) | Best possible match (-) | Similarity to motif 1
1 | 6 | AGCCAA | TTGGCT | -
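The (-) column of the motifs table is the reverse complement of the (+) column. A minimal sketch of that relationship, assuming upper-case DNA over the letters A, C, G, T and N; the helper name `reverse_complement` is hypothetical and is not part of MAST:

```python
# Hypothetical helper, not part of MAST: reverse-complement a DNA string.
# Assumes upper-case input restricted to the letters A, C, G, T and N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the result."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("AGCCAA"))  # prints TTGGCT
```

Applied to the best possible match of motif 1, this reproduces the (-) entry shown in the table.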
Top Scoring Sequences
Each of the following 0 sequences has an E-value less than 10.
The motif matches shown have a position p-value less than 0.0001.
Click on the arrow (↧) next to the E-value to view more information about a sequence.
Sequence | E-value | Block Diagram
MAST version
4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)
Reference
Timothy L. Bailey and Michael Gribskov,
"Combining evidence using p-values: application to sequence homology searches",
Bioinformatics, 14(1):48-54, 1998.
Command line summary
Background letter frequencies (from non-redundant database):
A: 0.274 C: 0.225 G: 0.225 T: 0.274
Result calculation took 0.029 seconds
Explanation of MAST Results
The MAST results consist of:
- The inputs to MAST including:
  - The sequence databases showing the sequence and residue counts.
  - The motifs showing the name, width, best scoring match and similarity to other motifs.
  - The nominal order and spacing diagram.
- The search results showing top scoring sequences with a tiling of all of the motif matches shown for each of the sequences.
- The program details including:
  - The version of MAST and the date it was released.
  - The reference to cite if you use MAST in your research.
  - The command line summary detailing the parameters with which you ran MAST.
- This explanation of how to interpret MAST results.
MAST received the following inputs.
Sequence Databases
This table summarises the sequence databases specified to MAST.
- Database: The name of the database file.
- Sequence Count: The number of sequences in the database.
- Residue Count: The number of residues in the database.
Motifs
Summary of the motifs specified to MAST.
- Name: The name of the motif. If the motif has been removed or removal is recommended to avoid highly similar motifs
then it will be displayed in red text.
- Width: The width of the motif. No gaps are allowed in motifs supplied to MAST as it only works for motifs of a fixed width.
- Best possible match: The sequence that would achieve the best possible match score and its reverse complement for nucleotide motifs.
- Similarity: MAST computes the pairwise correlations between each pair of motifs. The correlation between two motifs is the
maximum sum of Pearson's correlation coefficients for aligned columns divided by the width of the shorter motif.
The maximum is found by trying all alignments of the two motifs. Motifs with correlations below 0.60 have little
effect on the accuracy of the combined scores. Pairs of motifs with higher correlations should be removed from
the query. Correlations above the supplied threshold are shown in red text.
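The similarity measure described above can be sketched as follows. This is an illustration under stated assumptions, not MAST's own code: motifs are represented as lists of per-column letter frequencies, an "alignment" is taken to be any offset at which at least one column of each motif overlaps, and a column with zero variance is given a correlation of 0. The function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length columns (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a constant column carries no correlation
    return sxy / sqrt(sxx * syy)


def motif_correlation(motif_a, motif_b):
    """Maximum over offsets of the summed column correlations,
    divided by the width of the shorter motif."""
    width_a, width_b = len(motif_a), len(motif_b)
    shorter = min(width_a, width_b)
    best = float("-inf")
    # offset is the column of motif_a that column 0 of motif_b is aligned to
    for offset in range(-(width_b - 1), width_a):
        total = 0.0
        for j in range(width_b):
            i = j + offset
            if 0 <= i < width_a:
                total += pearson(motif_a[i], motif_b[j])
        best = max(best, total / shorter)
    return best


# Hypothetical two-column motif over A, C, G, T (frequencies are made up).
example = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1]]
print(motif_correlation(example, example))
```

A motif compared with itself scores 1 at offset 0, which is well above the 0.60 level mentioned above.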
Nominal Order and Spacing
This diagram shows the normal spacing of the motifs specified to MAST.
Search Results
MAST provides the following motif search results.
Top Scoring Sequences
This table summarises the top scoring sequences with a Sequence E-value
better than the threshold (default 10). The sequences are sorted by the Sequence
E-value from most to least significant.
- Sequence: The name of the sequence. This may be linked to a search of a sequence database for the sequence name.
- E-value: The E-value of the sequence. DNA only: if the strands were scored separately, there will be two E-values
for the sequence, separated by a "/". The score for the provided sequence comes first and the score for the
reverse complement second.
- ↧: Click on this to show additional information about the sequence, such as a
description, combined p-value and the annotated sequence.
- Block Diagram: The block diagram shows the best non-overlapping tiling of motif matches on the sequence.
  - The length of the line shows the length of a sequence relative to all the other sequences.
  - A block is shown where the positional p-value
  of a motif is less (more significant) than the significance threshold, which is 0.0001 by default.
  - If a significant motif match (as specified above) overlaps other significant motif matches then
  it is only displayed as a block if its positional p-value
  is less (more significant) than the product of the positional
  p-values of the significant matches that it overlaps.
  - The position of a block shows where a motif has matched the sequence.
  - The width of a block shows the width of the motif relative to the length of the sequence.
  - The colour and border of a block identify the matching motif as in the legend.
  - The height of a block gives an indication of the significance of the match, as
  taller blocks are more significant. The height is calculated to be proportional
  to the negative logarithm of the positional p-value,
  truncated at the height for a p-value of 1e-10.
  - Hovering the mouse cursor over the block displays the motif name
  and other details in the hovering text.
  - DNA only: blocks displayed above the line are matches on the given DNA, whereas blocks
  displayed below the line are matches to the reverse complement of the given DNA.
  - DNA only: when strands are scored separately, blocks may overlap on opposing strands.
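The display rules for blocks can be sketched as two small functions. This is an illustration only: the function names, the base-10 logarithm and the `max_height` scale are assumptions, since the text above states only that the height is proportional to the negative logarithm of the p-value and truncated at 1e-10.

```python
from math import log10, prod

P_CAP = 1e-10  # p-values at or below this are drawn at full height


def block_height(p_value, max_height=1.0):
    """Height proportional to -log(p), truncated at the height for p = 1e-10."""
    capped = max(p_value, P_CAP)
    return max_height * log10(capped) / log10(P_CAP)


def shown_as_block(p_value, overlapping_p_values, threshold=1e-4):
    """Apply the significance threshold and the overlap rule described above."""
    if p_value >= threshold:
        return False  # not significant, so never drawn
    significant = [q for q in overlapping_p_values if q < threshold]
    if not significant:
        return True  # nothing significant overlaps this match
    return p_value < prod(significant)


print(block_height(1e-5))                  # half of the full height
print(shown_as_block(1e-9, [1e-5]))        # beats the single overlapping match
print(shown_as_block(1e-9, [1e-5, 1e-5]))  # does not beat the product 1e-10
```

The choice of logarithm base does not change the relative heights, because the cap is expressed in the same base.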
Additional Sequence Information
Clicking on the ↧ link expands a box below the sequence with additional information and adds two draggable buttons
below the block diagram.
- Description: The description appearing after the identifier in the FASTA file used to specify the sequence.
- Combined p-value: The combined p-value of the sequence. DNA only: if the strands were scored
separately, there will be two p-values for the sequence, separated by a "/". The score for the provided sequence
comes first and the score for the reverse complement second.
- Annotated Sequence: The annotated sequence shows a portion of the sequence with the matching motif sequences displayed above.
The displayed portion of the sequence can be modified by sliding the two buttons below the sequence block diagram
so that the portion you want to see is between the two needles attached to the buttons. By default the two buttons
move together but you can drag one individually by holding shift before you start the drag. If the strands were
scored separately, they cannot both be displayed at once because of overlaps, so a radio button offers the choice
of strand to display.
Scoring
MAST scores sequences using the following measures.
Position score calculation
The score for the match of a position in a sequence to a motif is computed by summing the appropriate entry
from each column of the position-dependent scoring matrix that represents the motif. Sequences shorter than
one or more of the motifs are skipped.
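A minimal sketch of this calculation, assuming the scoring matrix is held as one letter-to-score mapping per motif column. The function name and the example scores are hypothetical, and letters missing from the matrix (such as N) are not handled.

```python
def position_scores(sequence, matrix):
    """Score every start position of `sequence` against a scoring matrix.

    `matrix` is a list with one dict per motif column, mapping a letter
    to its score in that column. A sequence shorter than the motif is
    skipped, which is represented here by an empty result.
    """
    width = len(matrix)
    if len(sequence) < width:
        return []
    return [
        sum(matrix[j][sequence[start + j]] for j in range(width))
        for start in range(len(sequence) - width + 1)
    ]


# Hypothetical integer scores for a two-column motif that prefers "AC".
example_matrix = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 3, "G": -1, "T": -1},
]
print(position_scores("ACGA", example_matrix))  # prints [5, -2, -2]
```

Each entry of the result is the score of the motif placed at that start position, so the best match is the position with the largest entry.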
Position p-value
The position p-value of a match is the probability of a single random subsequence of the length of the motif
scoring at least as well as the observed match.
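One way to obtain such a probability is to build the exact distribution of scores column by column. The sketch below assumes integer scores, letters drawn independently from a background distribution, and the same matrix representation as a list of per-column dicts; it is an illustration of the definition, not a description of how MAST computes the value internally, and the function names are hypothetical.

```python
from collections import defaultdict


def score_distribution(matrix, background):
    """Exact distribution of the total score of a random subsequence.

    `matrix` is a list of per-column dicts (letter -> integer score) and
    `background` maps each letter to its probability.
    """
    dist = {0: 1.0}
    for column in matrix:
        extended = defaultdict(float)
        for score, prob in dist.items():
            for letter, freq in background.items():
                extended[score + column[letter]] += prob * freq
        dist = dict(extended)
    return dist


def position_p_value(observed_score, matrix, background):
    """Probability that a random subsequence scores at least `observed_score`."""
    dist = score_distribution(matrix, background)
    return sum(prob for score, prob in dist.items() if score >= observed_score)


# Hypothetical matrix and a uniform background, chosen only for illustration.
example_matrix = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 3, "G": -1, "T": -1},
]
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(position_p_value(5, example_matrix, uniform))
```

In this example only the subsequence "AC" reaches a score of 5, so the result is the probability of drawing A followed by C under the background.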
Sequence p-value
The sequence p-value of a score is defined as the probability of a random sequence of the same length containing
some match with as good or better a score.
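The text above does not give a formula for this probability. A common way to approximate it, shown here as an assumption rather than as MAST's documented method, is to treat the n = L - w + 1 possible match positions in a sequence of length L as independent trials, giving 1 - (1 - p)^n for a position p-value p and motif width w. The function name is hypothetical.

```python
from math import expm1, log1p


def sequence_p_value(position_p, seq_len, motif_width):
    """Approximate chance that some position scores at least as well.

    Assumes the seq_len - motif_width + 1 positions are independent,
    which is a simplification because neighbouring positions overlap.
    """
    positions = seq_len - motif_width + 1
    if positions <= 0:
        raise ValueError("sequence is shorter than the motif")
    if position_p >= 1.0:
        return 1.0
    # 1 - (1 - p)^n, computed stably for very small p
    return -expm1(positions * log1p(-position_p))


print(sequence_p_value(1e-4, seq_len=35, motif_width=6))
```

Using `log1p` and `expm1` avoids the loss of precision that a direct `1 - (1 - p) ** n` suffers when p is very small.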
Combined p-value
The combined p-value of a sequence measures the strength of the match of the sequence to all the motifs and is calculated by
- finding the score of the single best match of each motif to the sequence (best matches may overlap),
- calculating the sequence p-value of each score,
- forming the product of the p-values,
- taking the p-value of the product.
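The last two steps can be sketched with the closed form for the product of n independent, uniformly distributed p-values, as used in the paper cited for MAST: if the product is p, then the probability of a product at least as small is p multiplied by the sum of (-ln p)^i / i! for i from 0 to n - 1. The function name is hypothetical, and independence of the per-motif p-values is an assumption of the formula.

```python
from math import log, prod


def p_value_of_product(p_values):
    """P-value of the product of n independent uniform p-values."""
    n = len(p_values)
    if n == 0:
        raise ValueError("at least one p-value is required")
    product = prod(p_values)
    if product <= 0.0:
        return 0.0
    neg_log = -log(product)
    # Accumulate sum_{i=0}^{n-1} (-ln p)^i / i! term by term.
    term, total = 1.0, 0.0
    for i in range(n):
        total += term
        term *= neg_log / (i + 1)
    return min(1.0, product * total)


# Sequence p-values of the best match of each motif (made-up numbers).
print(p_value_of_product([0.01, 0.002, 0.3]))
```

With a single motif the sum has one term, so the combined p-value equals that motif's sequence p-value.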
Sequence E-value
The E-value of a sequence is the expected number of sequences in a random database of the same size that would match
the motifs as well as the sequence does and is equal to the combined p-value of the sequence times the number of
sequences in the database.
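As a worked illustration of this definition, the database reported above contains 4198 sequences, so a sequence needs a combined p-value below roughly 0.0024 to reach an E-value under the default threshold of 10. The function names below are hypothetical.

```python
def sequence_e_value(combined_p, num_sequences):
    """Expected number of random sequences matching at least this well."""
    return combined_p * num_sequences


def is_reported(e_value, threshold=10.0):
    """True when the E-value is better (smaller) than the threshold."""
    return e_value < threshold


e_value = sequence_e_value(0.001, 4198)  # made-up combined p-value
print(e_value, is_reported(e_value))
```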