Pfam. Domain Architectures.

Task 1

For this task I chose 6 proteins from my previous work (pr10). I ran the multiple alignment in 4 different services and compared the results of two of them (Muscle and Mafft).


Table 1. Main information about chosen proteins

Entry name | Length | Protein name | Organism | Domain of life
DNAK_THEVO | 613 | Chaperone protein DnaK | Thermoplasma volcanium (strain ATCC 51530 / DSM 4299 / JCM 9571 / NBRC 15438 / GSS1) | Archaea
DNAK_LACAC | 614 | Chaperone protein DnaK | Lactobacillus acidophilus (strain ATCC 700396 / NCK56 / N2 / NCFM) | Bacteria
DNAK_SALTY | 638 | Chaperone protein DnaK | Salmonella typhimurium (strain LT2 / SGSC1412 / ATCC 700720) | Bacteria
BIP1_ORYSJ | 665 | Heat shock 70 kDa protein BIP1 | Oryza sativa subsp. japonica (Rice) | Eukaryota
DNAK_HALWD | 641 | Chaperone protein DnaK | Haloquadratum walsbyi (strain DSM 16790 / HBSQ001) | Archaea
HSP74_DROME | 641 | Major heat shock 70 kDa protein Bbb | Drosophila melanogaster (Fruit fly) | Eukaryota


Table 2. Description of the programs used

Name | Description [1]
TCoffee | "Consistency-based MSA tool that attempts to mitigate the pitfalls of progressive alignment methods. Suitable for small alignments."
Muscle | "Accurate MSA tool, especially good with proteins. Suitable for medium alignments."
Mafft | "MSA tool that uses Fast Fourier Transforms. Suitable for medium-large alignments."
PRANK (WebPRANK) | "The new phylogeny-aware multiple sequence alignment program which makes use of evolutionary information to help place insertions and deletions."


Alignments


 
[Alignment view, columns 1-710. Sequences: DNAK_THEVO/1-613, DNAK_LACAC/1-614, DNAK_SALTY/1-638, HSP7D_DROME/1-651, BIP1_ORYSJ/1-665, DNAK_HALWD/1-641. Annotation rows: Conservation, Quality, Consensus.]
TCoffee alignment (Download)
 
[Alignment view, columns 1-710. Sequences: DNAK_THEVO/1-613, DNAK_LACAC/1-614, DNAK_SALTY/1-638, HSP7D_DROME/1-651, BIP1_ORYSJ/1-665, DNAK_HALWD/1-641. Annotation rows: Conservation, Quality, Consensus.]
Muscle alignment (Download)
 
[Alignment view, columns 1-720. Sequences: DNAK_THEVO/1-613, DNAK_LACAC/1-614, DNAK_SALTY/1-638, HSP7D_DROME/1-651, BIP1_ORYSJ/1-665, DNAK_HALWD/1-641. Annotation rows: Conservation, Quality, Consensus.]
Mafft alignment (Download)
PRANK alignment
webPRANK alignment
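
The Consensus row shown with each alignment can be approximated by a per-column majority vote. The sketch below is a simplified illustration, not the algorithm of the viewer used here: it assumes gaps are ignored and that ties are marked with "+". The function name `consensus` and the toy sequences are hypothetical.

```python
from collections import Counter


def consensus(aligned, gap="-"):
    """Majority-vote consensus of equal-length gapped sequences.

    Assumptions (not taken from any specific viewer): gaps are ignored
    when counting, a tie for the most common residue is written as '+',
    and an all-gap column stays a gap.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    out = []
    for column in zip(*aligned):
        counts = Counter(r for r in column if r != gap)
        if not counts:
            out.append(gap)
            continue
        ranked = counts.most_common()
        top = ranked[0][1]
        tied = [residue for residue, n in ranked if n == top]
        out.append(tied[0] if len(tied) == 1 else "+")
    return "".join(out)


# Toy alignment (hypothetical fragments, not the real DnaK sequences).
toy = ["MSKIIG-", "MSKVIGI", "MGKIIG-", "-SKAVGI"]
print(consensus(toy))
```

Comparing such a consensus string between two alignment programs is a quick way to spot the regions where they place gaps differently.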

Comparison of the Muscle and Mafft alignments (observed differences)

  1. Lysine (K) #3 (sequence #3)
    Muscle: aligned in a column with 3 more K (lysine #3, #3, #5), P (proline #5) and T (threonine #34)
    Mafft: aligned in a column with 4 more K (lysine) and R (arginine #3)
  2. Glutamic acid (E) #73 (sequence #2)
    Muscle: aligned in a column with N (asparagine #98), 2 G (glycine #99, #128), E (glutamic acid #77)
    Mafft: aligned in a column with 3 R (arginine #75, #76, #105), E (glutamic acid #77)
  3. Tyrosine (Y) #81 (sequence #1)
    Muscle: aligned in a column with 3 more Y (tyrosine #76, #131, #80), A (alanine #101), K (lysine #102)
    Mafft: aligned in a column with 4 more Y (tyrosine #76, #91, #121, #80), F (phenylalanine #92)
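
Differences like the ones listed above can be found programmatically by looking up which residues share a column with a given residue in each alignment. The following is a minimal sketch under stated assumptions: each alignment is held as a dict of gapped strings, residues are numbered from 1 within their own sequence, and the helper names (`column_of`, `column_partners`) and the toy data are hypothetical.

```python
def column_of(aligned_seq, residue_number, gap="-"):
    """Return the 0-based alignment column holding the residue whose
    1-based number (gaps not counted) is residue_number."""
    seen = 0
    for col, ch in enumerate(aligned_seq):
        if ch != gap:
            seen += 1
            if seen == residue_number:
                return col
    raise ValueError("sequence has fewer residues than requested")


def column_partners(alignment, name, residue_number, gap="-"):
    """For one residue of sequence `name`, list what every other sequence
    has in the same column: (letter, own residue number), or None for a gap."""
    col = column_of(alignment[name], residue_number, gap)
    partners = {}
    for other, seq in alignment.items():
        if other == name:
            continue
        ch = seq[col]
        if ch == gap:
            partners[other] = None
        else:
            # Residue number = count of non-gap characters up to this column.
            partners[other] = (ch, len(seq[: col + 1].replace(gap, "")))
    return partners


# Two toy alignments of the same three hypothetical sequences.
aln_a = {"s1": "MSK-II", "s2": "MGK-II", "s3": "MSKAII"}
aln_b = {"s1": "-MSKII", "s2": "MGK-II", "s3": "MSKAII"}

print(column_partners(aln_a, "s1", 3))
print(column_partners(aln_b, "s1", 3))
```

Running the same lookup on the Muscle and Mafft outputs for one residue and comparing the two dictionaries shows directly where the programs disagree.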

Task 2


Figure 1. Protein domain architecture
Figure 2. Pfam data

My protein consists of a single domain, Bac_globin, so there was no choice to make: below I describe 3 proteins whose architectures include this domain.



Table 3. Description of 3 different architectures with the Bac_globin domain

Protein | Number of sequences with the following architecture | Description
1 Like my protein, this one consists of Bac_globin domains only; however, it contains 5 Bac_globin domains, with deletions in the first, second and fifth. Bacterial-like globin (Bac_globin) is a family of heme-binding proteins found mainly in bacteria. According to the Pfam article, they can also be found in some protozoa and plants.
Molecular function: oxygen binding
Putative uncharacterized protein (Perkinsus marinus)
1 Bac_globin: Deletion on the right
Alpha Amylase: According to the Pfam article, enzymes containing this domain belong to family 13 (CAZY GH_13) of the glycosyl hydrolases, a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate part.
Molecular function: catalytic activity
Group 2 hemoglobin glbO (Mycobacterium xenopi)
1 Bac_globin:
Cytochrome P460: There is no Pfam article about this domain yet, so my description of its function is based on an NCBI article.[2]
Molecular function: catalysis of the oxidation of hydroxylamine to nitrite
Hemoglobin-like protein HbN (Nitrospira moscoviensis)

References

  1. Multiple Sequence Alignment
  2. David J. Bergmann, James A. Zahn, Alan B. Hooper, and Alan A. DiSpirito. "Cytochrome P460 Genes from the Methanotroph Methylococcus capsulatus Bath." J Bacteriol. 1998 Dec; 180(24): 6440–6445.


© Sophia Veselova, 2017.