
PSI-BLAST

Protein O05886.4 is a ribosome hibernation promotion factor (HPF). I searched the UniProtKB/Swiss-Prot (swissprot) database using PSI-BLAST (Position-Specific Iterated BLAST). The results are shown in the following table. The number of hits became stable at the second iteration, and the third iteration shows a great difference between the e-value of the worst good hit and the e-value of the best bad hit. So it is likely that the hits form a family of homologous proteins.

Iteration | Hits above threshold (0.005) | Worst good hit | Worst good hit e-value | Best bad hit | Best bad hit e-value
1 | 20 | P17161.1 | 0.003 | P17160.1 | 0.005
2 | 28 | P9WMA8.1 | 0.003 | B4L535.1 | 0.073
3 | 28 | P9WMA8.1 | 3e-19 | P33621.1 | 0.014
4 | 28 | P9WMA8.1 | 2e-21 | P06727.3 | 0.005
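The separation between accepted and rejected hits can be expressed as a number. The sketch below is only an illustration: it takes the e-values from the table and computes, for each iteration, how many orders of magnitude lie between the worst good hit and the best bad hit. The helper name `evalue_gap` is an assumption made for this example and is not part of BLAST.

```python
import math

# (iteration, worst good hit e-value, best bad hit e-value), copied from the table
rows = [
    (1, 0.003, 0.005),
    (2, 0.003, 0.073),
    (3, 3e-19, 0.014),
    (4, 2e-21, 0.005),
]


def evalue_gap(worst_good, best_bad):
    """Orders of magnitude between the last accepted and the first rejected hit."""
    return math.log10(best_bad / worst_good)


# Map each iteration number to its gap
gaps = {iteration: evalue_gap(good, bad) for iteration, good, bad in rows}

for iteration, gap in gaps.items():
    print(f"iteration {iteration}: gap of {gap:.1f} orders of magnitude")
```

A gap below one order of magnitude (iteration 1) means the threshold cuts through a continuous range of e-values, while a gap of more than 16 orders (iteration 3) means the good hits are clearly separated from the rest.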

Prosite

In pr2 I studied the enolases of 7 chosen bacteria (Proteobacteria). Here I used Prosite to find the pattern of proteobacterial enolases. The scan was run with the sequence of ENO_PSEAE. Prosite found only one pattern: ILIKFNQIGSLTET. This pattern is called the Enolase signature.

Pattern refinement

The alignment of the enolases of the chosen bacteria in FASTA format is available here.

The original pattern in generalized form (marked by the top red line in the picture):
[LIVTMS]-[LIVP]-[LIV]-[KQ]-x-[ND]-Q-[INV]-[GA]-[ST]-[LIVM]-[STL]-[DERKAQG]-[STA]
There are 100% conserved positions at the start of the alignment: 'N' and 'S'. I added them to the pattern at positions -2 and -1, respectively. Then I compared the pattern with the alignment and narrowed the set at each position, leaving only those amino acids that occur in the alignment. The wildcard 'x' was replaced with [FVI] (these amino acids are hydrophobic). This gave the following pattern:
N-S-[IM]-L-[IV]-K-[FVI]-N-Q-I-G-[ST]-L-T-E-T
The following table compares the original and composed patterns position by position.

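Position | -2 | -1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14
Original |  |  | [LIVTMS] | [LIVP] | [LIV] | [KQ] | x | [ND] | Q | [INV] | [GA] | [ST] | [LIVM] | [STL] | [DERKAQG] | [STA]
Composed | N | S | [IM] | L | [IV] | K | [FVI] | N | Q | I | G | [ST] | L | T | E | T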
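Both patterns can be tried out by translating them into regular expressions. The sketch below is a minimal illustration and not the way Prosite itself scans sequences: the helper `prosite_to_regex` is written for this example and handles only the syntax used here (single letters, `x` and bracketed sets). The test fragment is also an assumption: it is the signature reported for ENO_PSEAE preceded by the conserved 'N' and 'S', not a sequence copied from UniProt.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern (letters, x, [..] sets) into a regex."""
    parts = []
    for element in pattern.split("-"):
        # 'x' means any amino acid; letters and [..] sets are valid regex as they are
        parts.append("." if element == "x" else element)
    return "".join(parts)


ORIGINAL = ("[LIVTMS]-[LIVP]-[LIV]-[KQ]-x-[ND]-Q-[INV]-[GA]-"
            "[ST]-[LIVM]-[STL]-[DERKAQG]-[STA]")
COMPOSED = "N-S-[IM]-L-[IV]-K-[FVI]-N-Q-I-G-[ST]-L-T-E-T"

# Hypothetical fragment: conserved N and S followed by the ENO_PSEAE signature
fragment = "NSILIKFNQIGSLTET"

original_hit = re.search(prosite_to_regex(ORIGINAL), fragment)
composed_hit = re.search(prosite_to_regex(COMPOSED), fragment)

print("original:", original_hit.group(), "at offset", original_hit.start())
print("composed:", composed_hit.group(), "at offset", composed_hit.start())
```

The original pattern matches the 14 residues starting at offset 2, while the composed pattern also covers the two conserved residues in front of them.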

Using the composed pattern, I searched for proteobacterial enolases with Prosite, as the task required, and saved the list of hits (list 1, 392 findings). Then I found the actual enolases (ENO) of Proteobacteria with the UniProt advanced search and saved that list too (list 2, 396 findings). I compared the two lists in Excel (VLOOKUP function) and calculated these values:
TP = 225
FP = 392 - 225 = 167
FN = 396 - 225 = 171
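The same comparison can be done with set operations instead of VLOOKUP. This is a minimal sketch, not the procedure used above: the function name `compare_hits` and the toy identifiers are made up for illustration, and only the counts 392, 396 and 225 come from the text.

```python
def compare_hits(found, reference):
    """Return (TP, FP, FN) for a list of pattern hits against a reference list."""
    found, reference = set(found), set(reference)
    tp = len(found & reference)   # found by the pattern and really enolases
    fp = len(found - reference)   # found by the pattern but not in the reference
    fn = len(reference - found)   # real enolases missed by the pattern
    return tp, fp, fn


# Toy example with hypothetical identifiers
found = ["ENO_A", "ENO_B", "XYZ_C"]
reference = ["ENO_A", "ENO_B", "ENO_D", "ENO_E"]
toy_counts = compare_hits(found, reference)
print(toy_counts)

# Arithmetic check with the counts reported above
list1_size, list2_size, tp = 392, 396, 225
fp = list1_size - tp
fn = list2_size - tp
print(tp, fp, fn)
```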



© Darya Potanina, 2017