BLAST

Task 1

Used programs: BLAST


Table 1. Hit table results

#	Name	Subject ids	% Identity	% Positives	Align len	Mismatches	Gap opens	Q. start	Q. end	S. start	S. end	E-value	Bit score	% Coverage	Homology
1	Group 2 truncated hemoglobin GlbO	gi|17366693|sp|Q9CC59.1|TRHBO_MYCLE	53.543	66.14	127	56	1	11	134	2	128	1.77e-42	139	99.2	+
2	Group 2 truncated hemoglobin GlbO	gi|61224558|sp|P0A596.1|TRHBO_MYCBO;gi|615880539|sp|P9WN22.1|TRHBO_MYCTO;gi|615880555|sp|P9WN23.1|TRHBO_MYCTU	55.645	68.55	124	52	1	12	132	3	126	2.36e-42	139	96.9	+
3	Group 2 truncated hemoglobin	gi|81341886|sp|O31607.1|TRHBO_BACSU	46.154	56.41	117	61	2	15	131	9	123	9.09e-26	97.1	88.6	+
4	Two-on-two hemoglobin-3	gi|75322687|sp|Q67XG0.1|GLB3_ARATH	27.778	50.79	126	87	3	8	130	21	145	1.92e-10	58.5	72	+
5	Group 1 truncated hemoglobin	gi|1707915|sp|P52335.1|TRHBN_NOSSN	26.050	45.38	119	84	2	13	131	3	117	9.75e-09	52.8	100.8	+
6	Group 1 truncated hemoglobin	gi|232163|sp|Q00812.1|TRHBN_NOSCO	27.273	47.11	121	80	4	13	131	3	117	4.66e-08	51.2	102.5	+
7	Group 1 truncated hemoglobin	gi|121274|sp|P17724.1|TRHBN_TETPY	26.923	48.08	104	66	2	12	112	5	101	4.32e-07	48.5	85.95	+
8	Group 1 truncated hemoglobin	gi|17366375|sp|P73925.1|TRHBN_SYNY3	30.000	43.33	60	42	0	13	72	3	62	0.037	35.4	48.4	-
9	Methylmalonyl-CoA mutase	gi|34395931|sp|P27253.2|SCPA_ECOLI	42.105	52.63	57	26	3	21	74	19	71	2.1	31.2	7.98	-


 
[Multiple alignment image: query BAC73082.1/1-134 aligned with Q9CC59.1/1-127, P0A596.1/1-124, O31607.1/1-115, Q67XG0.1/1-125, P52335.1/1-115, Q00812.1/1-115, P17724.1/1-97, P52334.1/1-55 and P27253.2/1-53; annotation tracks: Conservation, Quality, Consensus]

Figure 1. Multiple alignment



Figures 2 and 3. Screenshots of the alignment with marked blocks of homology.


Aligned length - the length of the alignment.
Identity - the number of identities required at a position for it to yield a consensus.
Similarity - the cut-off for the % of positive-scoring matches below which there is no consensus.
Coverage - (Aligned length / Sequence length) * 100%
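
The coverage formula can be illustrated with a minimal Python sketch. The function name `coverage_percent` is hypothetical, and the sequence length of 128 used in the example is an assumed value chosen for illustration, not a figure given in the table.

```python
def coverage_percent(aligned_length: int, sequence_length: int) -> float:
    """Coverage = (aligned length / sequence length) * 100%."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    return aligned_length / sequence_length * 100.0


# Aligned length 127 is taken from hit 1 of Table 1;
# the sequence length 128 is an assumption made for this example.
example = round(coverage_percent(127, 128), 1)
print(example)
```

Because the aligned length counts gap columns as well as residues, this ratio can exceed 100%, which would be consistent with the values above 100 in the % Coverage column.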

Homology


I suggest that the first three subjects are the most likely homologs: they have small E-values, noticeably higher bit scores than the other hits, similar functions, and take part in most of the "homology blocks". The eighth and ninth hits were counted as non-homologous because of their short regions of homology (more gaps), low bit scores and high E-values. The seventh hit is much better conserved (fewer gaps) than the eighth and has a far smaller E-value, although its percent identity is lower than that of the eighth hit. The fifth and sixth hits were counted as homologous to my protein because they take part in a large share of the homology blocks and have quite small E-values and higher bit scores compared with hits 8 and 9.
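
As a rough cross-check of the E-value part of this reasoning, here is a minimal Python sketch. The E-values are copied from Table 1, but the cut-off of 1e-3 and the names `E_VALUE_CUTOFF` and `is_putative_homolog` are assumptions introduced for illustration. The decision above also relied on bit scores and homology blocks, which this sketch ignores.

```python
# Hit number -> E-value, copied from Table 1.
E_VALUES = {
    1: 1.77e-42,
    2: 2.36e-42,
    3: 9.09e-26,
    4: 1.92e-10,
    5: 9.75e-09,
    6: 4.66e-08,
    7: 4.32e-07,
    8: 0.037,
    9: 2.1,
}

# Hypothetical threshold; not a value stated in this report.
E_VALUE_CUTOFF = 1e-3


def is_putative_homolog(e_value: float, cutoff: float = E_VALUE_CUTOFF) -> bool:
    """Flag a hit as a putative homolog when its E-value is at or below the cutoff."""
    return e_value <= cutoff


calls = {hit: is_putative_homolog(e) for hit, e in E_VALUES.items()}
for hit, flag in calls.items():
    print(hit, "+" if flag else "-")
```

With this assumed cut-off, the flags agree with the Homology column of Table 1: hits 1 to 7 are marked "+", and hits 8 and 9 are marked "-".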

Task 2

Used programs: BLAST

I decided to use two proteins picked at random from the "Browse" section. After a long, unsuccessful search through the families, I finally found an interesting one.


Table 4. Chosen proteins
ID	Length	Protein name	Organism
P4HA1_MOUSE (Q60715)	534 aa	Prolyl 4-hydroxylase subunit alpha-1	Mus musculus (Mouse)
A0A0V0VLS6_9BILA (A0A0V0VLS6)	969 aa	Prolyl 4-hydroxylase subunit alpha-1	Trichinella sp. T9

Figure 2. Map of alignment.

Map key

[A-Z] is the label of a line segment (A to F on the X axis; Z to T on the Y axis)

The same number on both axes (e.g. 1-1) means that the segment is present on both axes

An apostrophe (') after a letter means that the segment is duplicated

Green circles

Red circles


A0A0V0VLS6 sequence (vertical axis): Z1 Y2 X3 W V4 U2 T
Q60715 sequence (horizontal axis): A1 B C2 D E3 F4
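
The two label lines can be read mechanically. Here is a minimal Python sketch, assuming the key works as described above: segments that carry the same number on both axes correspond to each other, and unnumbered segments have no counterpart. The helper names `parse_labels` and `shared_segments` are hypothetical.

```python
import re
from collections import defaultdict

# Label lines copied from the map key.
HORIZONTAL = "A1 B C2 D E3 F4"      # Q60715
VERTICAL = "Z1 Y2 X3 W V4 U2 T"     # A0A0V0VLS6

# One capital letter, an optional apostrophe, an optional number.
LABEL = re.compile(r"([A-Z])('?)(\d*)")


def parse_labels(line):
    """Split a label line into (letter, number or None) pairs."""
    parsed = []
    for token in line.split():
        match = LABEL.fullmatch(token)
        if match is None:
            raise ValueError(f"unexpected label: {token!r}")
        letter, _duplicated, number = match.groups()
        parsed.append((letter, int(number) if number else None))
    return parsed


def shared_segments(horizontal, vertical):
    """Map each shared number to (horizontal letters, vertical letters)."""
    pairs = defaultdict(lambda: ([], []))
    for letter, number in parse_labels(horizontal):
        if number is not None:
            pairs[number][0].append(letter)
    for letter, number in parse_labels(vertical):
        if number is not None:
            pairs[number][1].append(letter)
    return dict(pairs)


segments = shared_segments(HORIZONTAL, VERTICAL)
for number in sorted(segments):
    print(number, segments[number])
```

Under this reading, number 2 appears once on the horizontal axis (C) and twice on the vertical axis (Y and U), which would mean that one region of Q60715 matches two regions of A0A0V0VLS6.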

© Sophia Veselova, 2017.