Ilya Dyugay's study site

Exploring the structure of the RefSeq database with the SRS search system

Using SRS, I retrieved the list of chromosomes of the yeast Saccharomyces cerevisiae.

SRS entry	Accession	Description	Length (bp)
REFSEQ_DNA:NC_001133	NC_001133	Saccharomyces cerevisiae S288c chromosome I, complete sequence. 	230218	
REFSEQ_DNA:NC_001134	NC_001134	Saccharomyces cerevisiae S288c chromosome II, complete sequence. 	813184	
REFSEQ_DNA:NC_001135	NC_001135	Saccharomyces cerevisiae S288c chromosome III, complete sequence. 	316620	
REFSEQ_DNA:NC_001136	NC_001136	Saccharomyces cerevisiae S288c chromosome IV, complete sequence. 	1531933	
REFSEQ_DNA:NC_001137	NC_001137	Saccharomyces cerevisiae S288c chromosome V, complete sequence. 	576874	
REFSEQ_DNA:NC_001138	NC_001138	Saccharomyces cerevisiae S288c chromosome VI, complete sequence. 	270161	
REFSEQ_DNA:NC_001139	NC_001139	Saccharomyces cerevisiae S288c chromosome VII, complete sequence. 	1090940	
REFSEQ_DNA:NC_001140	NC_001140	Saccharomyces cerevisiae S288c chromosome VIII, complete sequence. 	562643	
REFSEQ_DNA:NC_001141	NC_001141	Saccharomyces cerevisiae S288c chromosome IX, complete sequence. 	439888	
REFSEQ_DNA:NC_001142	NC_001142	Saccharomyces cerevisiae S288c chromosome X, complete sequence. 	745751	
REFSEQ_DNA:NC_001143	NC_001143	Saccharomyces cerevisiae S288c chromosome XI, complete sequence. 	666816	
REFSEQ_DNA:NC_001144	NC_001144	Saccharomyces cerevisiae S288c chromosome XII, complete sequence. 	1078177	
REFSEQ_DNA:NC_001145	NC_001145	Saccharomyces cerevisiae S288c chromosome XIII, complete sequence. 	924431	
REFSEQ_DNA:NC_001146	NC_001146	Saccharomyces cerevisiae S288c chromosome XIV, complete sequence. 	784333	
REFSEQ_DNA:NC_001147	NC_001147	Saccharomyces cerevisiae S288c chromosome XV, complete sequence. 	1091291	
REFSEQ_DNA:NC_001148	NC_001148	Saccharomyces cerevisiae S288c chromosome XVI, complete sequence. 	948066	
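
As a rough illustration, the listing above can be loaded into Python to look up a single chromosome or to sum the lengths. This is only a sketch: the names `LISTING` and `parse_listing` are made up for this example, and the rows are assumed to be tab-separated with four fields, as in the SRS output shown.

```python
import re

# Rows copied from the SRS listing above: entry, accession, description, length.
LISTING = """\
REFSEQ_DNA:NC_001133\tNC_001133\tSaccharomyces cerevisiae S288c chromosome I, complete sequence.\t230218
REFSEQ_DNA:NC_001134\tNC_001134\tSaccharomyces cerevisiae S288c chromosome II, complete sequence.\t813184
REFSEQ_DNA:NC_001135\tNC_001135\tSaccharomyces cerevisiae S288c chromosome III, complete sequence.\t316620
REFSEQ_DNA:NC_001136\tNC_001136\tSaccharomyces cerevisiae S288c chromosome IV, complete sequence.\t1531933
REFSEQ_DNA:NC_001137\tNC_001137\tSaccharomyces cerevisiae S288c chromosome V, complete sequence.\t576874
REFSEQ_DNA:NC_001138\tNC_001138\tSaccharomyces cerevisiae S288c chromosome VI, complete sequence.\t270161
REFSEQ_DNA:NC_001139\tNC_001139\tSaccharomyces cerevisiae S288c chromosome VII, complete sequence.\t1090940
REFSEQ_DNA:NC_001140\tNC_001140\tSaccharomyces cerevisiae S288c chromosome VIII, complete sequence.\t562643
REFSEQ_DNA:NC_001141\tNC_001141\tSaccharomyces cerevisiae S288c chromosome IX, complete sequence.\t439888
REFSEQ_DNA:NC_001142\tNC_001142\tSaccharomyces cerevisiae S288c chromosome X, complete sequence.\t745751
REFSEQ_DNA:NC_001143\tNC_001143\tSaccharomyces cerevisiae S288c chromosome XI, complete sequence.\t666816
REFSEQ_DNA:NC_001144\tNC_001144\tSaccharomyces cerevisiae S288c chromosome XII, complete sequence.\t1078177
REFSEQ_DNA:NC_001145\tNC_001145\tSaccharomyces cerevisiae S288c chromosome XIII, complete sequence.\t924431
REFSEQ_DNA:NC_001146\tNC_001146\tSaccharomyces cerevisiae S288c chromosome XIV, complete sequence.\t784333
REFSEQ_DNA:NC_001147\tNC_001147\tSaccharomyces cerevisiae S288c chromosome XV, complete sequence.\t1091291
REFSEQ_DNA:NC_001148\tNC_001148\tSaccharomyces cerevisiae S288c chromosome XVI, complete sequence.\t948066
"""


def parse_listing(text):
    """Return {roman numeral: (accession, length in bp)} for each row."""
    chromosomes = {}
    for line in text.strip().splitlines():
        _entry, accession, description, length = [f.strip() for f in line.split("\t")]
        # The chromosome number is the Roman numeral in the description.
        roman = re.search(r"chromosome ([IVX]+),", description).group(1)
        chromosomes[roman] = (accession, int(length))
    return chromosomes


chromosomes = parse_listing(LISTING)
accession, length = chromosomes["VIII"]
total = sum(n for _, n in chromosomes.values())
print(f"chromosome VIII: {accession}, {length} bp")
print(f"sum over the {len(chromosomes)} listed chromosomes: {total} bp")
```

Keying the result by Roman numeral keeps the lookup close to how the chromosomes are named in the descriptions.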
	

I was assigned chromosome VIII. It is 562643 nucleotides long and contains 297 genes and 11 tRNAs. Below are four example genes from this chromosome:

– a gene on the forward strand without introns;

     gene            6401..7546
                     /gene="COS8"
                     /locus_tag="YHL048W"
                     /db_xref="GeneID:856337"
     mRNA            6401..7546
                     /gene="COS8"
                     /locus_tag="YHL048W"
                     /product="Cos8p"
                     /transcript_id="NM_001179128.1"
                     /db_xref="GI:296145324"
                     /db_xref="GeneID:856337"
     CDS             6401..7546
                     /gene="COS8"
                     /locus_tag="YHL048W"
                     /note="Nuclear membrane protein, member of the DUP380
                     subfamily of conserved, often subtelomerically-encoded
                     proteins; regulation suggests a potential role in the
                     unfolded protein response"
                     /codon_start=1
                     /product="Cos8p"
                     /protein_id="NP_011815.1"
                     /db_xref="GI:6321739"
                     /db_xref="SGD:S000001040"
                     /db_xref="GeneID:856337"
                     /translation="MKENEVKDEKSVDVLSFKQLEFQKTVLPQDVFRNELTWFCYEIY
                     KSLAFRIWMLLWLPLSVWWKLSSNWIHPLIVSLLVLFLGPFFVLVICGLSRKRSLSKQ
                     LIQFCKEITEDTPSSDPHDWEVVAANLNSYFYENKTWNTKYFFFNAMSCQKAFKTTLL
                     EPFSLKKDESAKVKSFKDSVPYIEEALQVYAAGFDKEWKLFNTEKEESPFDLEDIQLP
                     KEAYRFKLTWILKRIFNLRCLPLFLYYFLIVYTSGNADLISRFLFPVVMFFIMTRDFQ
                     NMRMIVLSVKMEHKMQFLSTIINEQESGANGWDEIAKKMNRYLFEKKVWNNEEFFYDG
                     LDCEWFFRRFFYRLLSLKKPMWFASLNVELWPYIKEAQSARNEKPLK"
– a gene on the reverse strand without introns;
     gene            complement(3726..4541)
                     /locus_tag="YHL049C"
                     /db_xref="GeneID:856336"
     mRNA            complement(3726..4541)
                     /locus_tag="YHL049C"
                     /product="hypothetical protein"
                     /transcript_id="NM_001179129.1"
                     /db_xref="GI:296145322"
                     /db_xref="GeneID:856336"
     CDS             complement(3726..4541)
                     /locus_tag="YHL049C"
                     /codon_start=1
                     /product="hypothetical protein"
                     /protein_id="NP_011814.1"
                     /db_xref="GI:6321738"
                     /db_xref="SGD:S000001041"
                     /db_xref="GeneID:856336"
                     /translation="MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVR
                     SFYEDEKSGLIKVVKFRTGAMDRKRSFEKIVVSVMVGKNVQKFLTFVEDEPDFQGGPI
                     PSKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIA
                     SARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPHMFLLLHVDELSIFSAYQ
                     ASLPGEKKVDTERLKRDLCPRKPTEIKYFSQICNDMMNKKDRLGDVLHVCCPS"
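– a gene on the forward strand with at least one intron (in this record the join skips a single nucleotide and the CDS carries the /ribosomal_slippage qualifier, so the split location reflects a translational frameshift rather than a spliced intron);
     gene            85909..91318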
                     /locus_tag="YHL009W-B"
                     /db_xref="GeneID:856380"
     mRNA            join(<85909..86994,86996..>91318)
                     /locus_tag="YHL009W-B"
                     /product="gag-pol fusion protein"
                     /transcript_id="NM_001184404.1"
                     /db_xref="GI:296145363"
                     /db_xref="GeneID:856380"
     CDS             join(85909..86994,86996..91318)
                     /locus_tag="YHL009W-B"
                     /EC_number="2.7.7.49"
                     /EC_number="2.7.7.7"
                     /EC_number="3.4.23.-"
                     /EC_number="3.1.26.4"
                     /ribosomal_slippage
                     /note="Retrotransposon TYA Gag and TYB Pol genes;
                     transcribed/translated as one unit; polyprotein is
                     processed to make a nucleocapsid-like protein (Gag),
                     reverse transcriptase (RT), protease (PR), and integrase
                     (IN); similar to retroviral genes"
                     /codon_start=1
                     /product="gag-pol fusion protein"
                     /protein_id="NP_058133.1"
                     /db_xref="GI:7839180"
                     /db_xref="SGD:S000007372"
                     /db_xref="GeneID:856380"
                     /translation="MATPVRDETRNVIDDNISARIQSKVKTNDTVRQTPSSLRKVSIK
                     DEQVKQYQRNLNRFKTILNGLKAEEEKLSETDDIQMLAEKLLKLGETIDKVENRIVDL
                     VEKIQLLETNENNNILHEHIDATGTYYLFDTLTSTNKRFYPKDCVFDYRTNNVENIPI
                     LLNNFKKFIKKYQFDDVFENDIIEIDPRENEILCKIIKEGLGESLDIMNTNTTDIFRI
                     IDGLKNKYRSLHGRDVRIRAWEKVLVDTTCRNSALLMNKLQKLVLMEKWIFSKCCQDC
                     PNLKDYLQEAIMGTLHESLRNSVKQRLYNIPHNVGINHEEFLINTVIETVIDLSPIAD
                     DQIENSCMYCKSVFHCSINCKKKPNRELGLTRPISQKPIIYKVHRDNNNLSPVQNEQK
                     SWNKTQKKSNKVYNSKKLVIIDTGSGVNITNDKTLLHNYEDSNRSTRFFGIGKNSSVS
                     VKGYGYIKIKNGHNNTDNKCLLTYYVPEEESTIISCYDLAKKTKMVLSRKYTRLGNKI
                     IKIKTKIVNGVIHVKMNELIERPSDDSKINAIKPTSSPGFKLNKRSITLEDAHKRMGH
                     TGIQQIENSIKHNHYEESLDLIKEPNEFWCQTCKISKATKRNHYTGSMNNHSTDHEPG
                     SSWCMDIFGPVSSSNADTKRYMLIMVDNNTRYCMTSTHFNKNAETILAQIRKNIQYVE
                     TQFDRKVREINSDRGTEFTNDQIEEYFISKGIHHILTSTQDHAANGRAERYIRTIVTD
                     ATTLLRQSNLRVKFWEYAVTSATNIRNCLEHKSTGKLPLKAISRQPVTVRLMSFLPFG
                     EKGIIWNHNHKKLKPSGLPSIILCKDPNSYGYKFFIPSKNKIVTSDNYTIPNYTMDGR
                     VRNTQNIYKSHQFSSHNDNEEDQIETVTNLCEALENYEDDNKPITRLEDLFTEEELSQ
                     IDSNAKYPSPSNNLEGDLDYVFSDVEESGDYDVESELSTTNTSISTDKNKILSNKDFN
                     SELASTEISISEIDKKGLINTSHIDEDKYDEKVHRIPSIIQEKLVGSKNTIKINDENR
                     ISDRIRSKNIGSILNTGLSRCVDITDESITNKDESMHNAKPELIQEQFNKTNHETSFP
                     KEGSIGTNVKFRNTDNEISLKTGDTSLPIKTLESINNHHSNDYSTNKVEKFEKENHHP
                     PPIEDIVDMSDQTDMESNCQDGNNLKELKVTDKNVPTDNGTNVSPRLEQNIEASGSPV
                     QTVNKSAFLNKEFSSLNMKRKRKRHDKNNSLTSYELERDKKRSKRNRVKLIPDNMETV
                     SAQKIRAIYYNEAISKNPDLKEKHEYKQAYHKELQNLKDMKVFDVDVKYSRSEIPDNL
                     IVPTNTIFTKKRNGIYKARIVCRGDTQSPDTYSVITTESLNHNHIKIFLMIANNRNMF
                     MKTLDINHAFLYAKLEEEIYIPHPHDRRCVVKLNKALYGLKQSPKEWNDHLRQYLNGI
                     GLKDNSYTPGLYQTEDKNLMIAVYVDDCVIAASNEQRLDEFINKLKSNFELKITGTLI
                     DDVLDTDILGMDLVYNKRLGTIDLTLKSFINRMDKKYNEELKKIRKSSIPHMSTYKID
                     PKKDVLQMSEEEFRQGVLKLQQLLGELNYVRHKCRYDINFAVKKVARLVNYPHERVFY
                     MIYKIIQYLVRYKDIGIHYDRDCNKDKKVIAITDASVGSEYDAQSRIGVILWYGMNIF
                     NVYSNKSTNRCVSSTEAELHAIYEGYADSETLKVTLKELGEGDNNDIVMITDSKPAIQ
                     GLNRSYQQPKEKFTWIKTEIIKEKIKEKSIKLLKITGKGNIADLLTKPVSASDFKRFI
                     QVLKNKITSQDILASTDY"
– a gene on the reverse strand with at least one intron.
     gene            complement(445..3311)
                     /locus_tag="YHL050C"
                     /db_xref="GeneID:856335"
     mRNA            complement(join(445..1897,2671..3311))
                     /locus_tag="YHL050C"
                     /product="hypothetical protein"
                     /transcript_id="NM_001179130.1"
                     /db_xref="GI:296145321"
                     /db_xref="GeneID:856335"
     CDS             complement(join(445..1897,2671..3311))
                     /locus_tag="YHL050C"
                     /note="hypothetical protein, potential Cdc28p substrate"
                     /codon_start=1
                     /product="hypothetical protein"
                     /protein_id="NP_011813.1"
                     /db_xref="GI:6321737"
                     /db_xref="SGD:S000001042"
                     /db_xref="GeneID:856335"
                     /translation="MADTPSVAVQAPPGYGKTELFHLPLIALASKGDVEYVSFLFVPY
                     TVLLANCMIRLGRCGCLNVAPVRNFIEEGCDGVTDLYVGIYDDLASTNFTDRIAAWEN
                     IVECTFRTNNVKLGYLIVDELHNFETEVYRQSQFGGITNLDFDAFEKAIFLSGTAPEA
                     VADAALQRIGLTGLAKKSMDINELKRSEDLSRGLSSYPTRMFNLIKEKSKVPLGTNAT
                     TTASTNVRTSATTTASINVRTSATTTASINVRTSATTTESTNSNTNATTTESTNSSTN
                     ATTTASTNSSTNATTTESTNASAKEDANKDGNAEDNRFHPVTDINKEPYKRKGSQMVL
                     LERKKLKAQFPNTSENMNVLQFLGFRSDEIKHLFLYGIDIYFCPEGVFTQYGLCKGCQ
                     KMFELCVCWAGQKVSYRRMAWEALAVERMLRNDEEYKEYLEDIEPYHGDPVGYLKFFS
                     VKRREIYSQIQRNYAWYLAITRRRETISVLDSTRGKQGSQVFRMSGRQIKELYYKVWS
                     NLRESKTEVLQYFLNWDEKKCQEEWEAKDDTVFVEALEKVGVFQRLRSMTSAGLQGPQ
                     YVKLQFSRHHRQLRSRYELSLGMHLRDQLALGVTPSKVPHWTAFLSMLIGLFYNKTFR
                     QKLEYLLEQISEVWLLPHWLDLANVEVLAADNTRVPLYMLMVAVHKELDSDDVPDGRF
                     DIILLCRDSSREVGE"
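
The four records differ mainly in their location strings, so a small parser makes the strand and the number of segments explicit. The sketch below is my own illustration, not part of SRS or RefSeq tooling; `parse_location` and `gaps` are hypothetical helper names, and the parser only handles the `complement(...)`, `join(...)` and `start..end` forms that appear above.

```python
import re


def parse_location(location):
    """Parse a location such as 'complement(join(445..1897,2671..3311))'.

    Returns (strand, segments): strand is +1 or -1, segments is a list of
    1-based inclusive (start, end) pairs in the order written.
    Partial-end markers '<' and '>' are ignored.
    """
    strand = -1 if location.startswith("complement(") else 1
    pairs = re.findall(r"<?(\d+)\.\.>?(\d+)", location)
    segments = [(int(start), int(end)) for start, end in pairs]
    return strand, segments


def gaps(segments):
    """Number of nucleotides skipped between consecutive join() segments."""
    return [nxt[0] - prev[1] - 1 for prev, nxt in zip(segments, segments[1:])]


# CDS locations copied from the four records above.
EXAMPLES = {
    "YHL048W": "6401..7546",
    "YHL049C": "complement(3726..4541)",
    "YHL009W-B": "join(85909..86994,86996..91318)",
    "YHL050C": "complement(join(445..1897,2671..3311))",
}

for tag, location in EXAMPLES.items():
    strand, segments = parse_location(location)
    sign = "+" if strand == 1 else "-"
    print(tag, sign, f"segments={len(segments)}", f"gaps={gaps(segments)}")
```

A gap of several hundred nucleotides, as in YHL050C, is what one would expect of an intron, whereas the one-nucleotide gap in YHL009W-B matches its /ribosomal_slippage qualifier.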


Obtaining the sequence that encodes the assigned protein

Using my protein's identifier, I found in EMBL the sequence encoding this protein in the genome of the archaeon Pyrococcus furiosus.

The following record corresponds to my protein:

FT   gene            complement(1008047..1008421)
FT                   /locus_tag="PFC_05620"
FT   CDS             complement(1008047..1008421)
FT                   /codon_start=1
FT                   /transl_table=11
FT                   /locus_tag="PFC_05620"
FT                   /product="superoxide reductase"
FT                   /note="COG2033 Desulfoferrodoxin"
FT                   /db_xref="EnsemblGenomes-Gn:PFC_05620"
FT                   /db_xref="EnsemblGenomes-Tr:AFN04065"
FT                   /db_xref="GOA:I6UZH2"
FT                   /db_xref="InterPro:IPR002742"
FT                   /db_xref="UniProtKB/TrEMBL:I6UZH2"
FT                   /protein_id="AFN04065.1"
FT                   /translation="MISETIRSGDWKGEKHVPVIEYEREGELVKVKVQVGKEIPHPNTT
FT                   EHHIRYIELYFLPEGENFVYQVGRVEFTAHGESVNGPNTSDVYTEPIAYFVLKTKKKGK
FT                   LYALSYCNIHGLWENEVTLE"

The record shows that the coding sequence lies on the reverse strand, from nucleotide 1008047 to nucleotide 1008421. I ran the seqret command with the -sask option, which prompts for the begin, end and reverse settings, and obtained the gene sequence.

>AFN04065|AFN04065.1 Pyrococcus furiosus COM1 superoxide reductase 
atgattagtgaaaccataagaagtggggactggaaaggagaaaagcacgtccccgttata
gagtatgaaagagaaggggagcttgttaaagttaaggtgcaggttggtaaagaaatcccg
catccaaacaccactgagcaccacatcagatacatagagctttatttcttaccagaaggt
gagaactttgtttaccaggttggaagagttgagtttacagctcacggagagtctgtaaac
ggcccaaacacgagtgatgtgtacacagaacccatagcttactttgtgctcaagactaag
aagaagggcaagctctatgctcttagctactgtaacatccacggcctttgggaaaacgaa
gtcactttagagtga
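
As a sanity check, the region extraction and the translation can be reproduced in a few lines of Python. This is a minimal sketch and not how seqret itself works: `extract_region`, `reverse_complement` and `translate` are hypothetical helpers, and `toy` is a made-up stand-in for the genome in which the gene is placed on the reverse strand between two artificial flanks. The codon table is the standard genetic code; translation table 11 assigns the same amino acids and differs only in the permitted start codons.

```python
# Standard genetic code in NCBI order (bases T, C, A, G).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_region(seq, begin, end, reverse=False):
    """Return the 1-based inclusive region, reverse-complemented if asked."""
    region = seq[begin - 1:end]
    return reverse_complement(region) if reverse else region


def translate(cds):
    """Translate complete codons of a coding sequence; '*' marks a stop."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


# Gene sequence and /translation copied from the records above.
GENE = (
    "atgattagtgaaaccataagaagtggggactggaaaggagaaaagcacgtccccgttata"
    "gagtatgaaagagaaggggagcttgttaaagttaaggtgcaggttggtaaagaaatcccg"
    "catccaaacaccactgagcaccacatcagatacatagagctttatttcttaccagaaggt"
    "gagaactttgtttaccaggttggaagagttgagtttacagctcacggagagtctgtaaac"
    "ggcccaaacacgagtgatgtgtacacagaacccatagcttactttgtgctcaagactaag"
    "aagaagggcaagctctatgctcttagctactgtaacatccacggcctttgggaaaacgaa"
    "gtcactttagagtga"
)
PROTEIN = (
    "MISETIRSGDWKGEKHVPVIEYEREGELVKVKVQVGKEIPHPNTT"
    "EHHIRYIELYFLPEGENFVYQVGRVEFTAHGESVNGPNTSDVYTEPIAYFVLKTKKKGK"
    "LYALSYCNIHGLWENEVTLE"
)

# Toy genome: 10 filler bases, the gene on the reverse strand, 10 filler bases.
toy = "a" * 10 + reverse_complement(GENE) + "c" * 10
recovered = extract_region(toy, 11, 10 + len(GENE), reverse=True)

print("region recovered:", recovered == GENE)
print("length matches coordinates:", len(GENE) == 1008421 - 1008047 + 1)
print("translation matches record:", translate(GENE) == PROTEIN + "*")
```

The length check uses the inclusive coordinates from the CDS location, and the final comparison appends `*` because the nucleotide sequence includes the stop codon while the /translation does not.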

Last updated: 16.02.2015
Copyright © Ilya Dyugay, 2014.