Using SRS, I obtained a list of the chromosomes of the yeast Saccharomyces cerevisiae.
REFSEQ_DNA:NC_001133 NC_001133 Saccharomyces cerevisiae S288c chromosome I, complete sequence. 230218
REFSEQ_DNA:NC_001134 NC_001134 Saccharomyces cerevisiae S288c chromosome II, complete sequence. 813184
REFSEQ_DNA:NC_001135 NC_001135 Saccharomyces cerevisiae S288c chromosome III, complete sequence. 316620
REFSEQ_DNA:NC_001136 NC_001136 Saccharomyces cerevisiae S288c chromosome IV, complete sequence. 1531933
REFSEQ_DNA:NC_001137 NC_001137 Saccharomyces cerevisiae S288c chromosome V, complete sequence. 576874
REFSEQ_DNA:NC_001138 NC_001138 Saccharomyces cerevisiae S288c chromosome VI, complete sequence. 270161
REFSEQ_DNA:NC_001139 NC_001139 Saccharomyces cerevisiae S288c chromosome VII, complete sequence. 1090940
REFSEQ_DNA:NC_001140 NC_001140 Saccharomyces cerevisiae S288c chromosome VIII, complete sequence. 562643
REFSEQ_DNA:NC_001141 NC_001141 Saccharomyces cerevisiae S288c chromosome IX, complete sequence. 439888
REFSEQ_DNA:NC_001142 NC_001142 Saccharomyces cerevisiae S288c chromosome X, complete sequence. 745751
REFSEQ_DNA:NC_001143 NC_001143 Saccharomyces cerevisiae S288c chromosome XI, complete sequence. 666816
REFSEQ_DNA:NC_001144 NC_001144 Saccharomyces cerevisiae S288c chromosome XII, complete sequence. 1078177
REFSEQ_DNA:NC_001145 NC_001145 Saccharomyces cerevisiae S288c chromosome XIII, complete sequence. 924431
REFSEQ_DNA:NC_001146 NC_001146 Saccharomyces cerevisiae S288c chromosome XIV, complete sequence. 784333
REFSEQ_DNA:NC_001147 NC_001147 Saccharomyces cerevisiae S288c chromosome XV, complete sequence. 1091291
REFSEQ_DNA:NC_001148 NC_001148 Saccharomyces cerevisiae S288c chromosome XVI, complete sequence. 948066
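The chromosome listing above can also be read programmatically, for example to look up the length of one chromosome or to add up the lengths of all sixteen. The following is only a minimal sketch: the names LISTING, ROW and parse_listing are hypothetical, and it assumes one record per line in the form "entry accession description length".

```python
import re

# The sixteen records from the SRS listing, one per line (assumed layout).
LISTING = """\
REFSEQ_DNA:NC_001133 NC_001133 Saccharomyces cerevisiae S288c chromosome I, complete sequence. 230218
REFSEQ_DNA:NC_001134 NC_001134 Saccharomyces cerevisiae S288c chromosome II, complete sequence. 813184
REFSEQ_DNA:NC_001135 NC_001135 Saccharomyces cerevisiae S288c chromosome III, complete sequence. 316620
REFSEQ_DNA:NC_001136 NC_001136 Saccharomyces cerevisiae S288c chromosome IV, complete sequence. 1531933
REFSEQ_DNA:NC_001137 NC_001137 Saccharomyces cerevisiae S288c chromosome V, complete sequence. 576874
REFSEQ_DNA:NC_001138 NC_001138 Saccharomyces cerevisiae S288c chromosome VI, complete sequence. 270161
REFSEQ_DNA:NC_001139 NC_001139 Saccharomyces cerevisiae S288c chromosome VII, complete sequence. 1090940
REFSEQ_DNA:NC_001140 NC_001140 Saccharomyces cerevisiae S288c chromosome VIII, complete sequence. 562643
REFSEQ_DNA:NC_001141 NC_001141 Saccharomyces cerevisiae S288c chromosome IX, complete sequence. 439888
REFSEQ_DNA:NC_001142 NC_001142 Saccharomyces cerevisiae S288c chromosome X, complete sequence. 745751
REFSEQ_DNA:NC_001143 NC_001143 Saccharomyces cerevisiae S288c chromosome XI, complete sequence. 666816
REFSEQ_DNA:NC_001144 NC_001144 Saccharomyces cerevisiae S288c chromosome XII, complete sequence. 1078177
REFSEQ_DNA:NC_001145 NC_001145 Saccharomyces cerevisiae S288c chromosome XIII, complete sequence. 924431
REFSEQ_DNA:NC_001146 NC_001146 Saccharomyces cerevisiae S288c chromosome XIV, complete sequence. 784333
REFSEQ_DNA:NC_001147 NC_001147 Saccharomyces cerevisiae S288c chromosome XV, complete sequence. 1091291
REFSEQ_DNA:NC_001148 NC_001148 Saccharomyces cerevisiae S288c chromosome XVI, complete sequence. 948066
"""

# entry, accession, free-text description, Roman chromosome number, length
ROW = re.compile(
    r"^(\S+) (\S+) (.+) chromosome ([IVX]+), complete sequence\. (\d+)$"
)


def parse_listing(text):
    """Map Roman chromosome number -> length in nucleotides."""
    lengths = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            lengths[match.group(4)] = int(match.group(5))
    return lengths


lengths = parse_listing(LISTING)
print(lengths["VIII"])        # 562643
print(sum(lengths.values()))  # 12071326
```

The sum covers only the sixteen nuclear chromosomes listed here; the mitochondrial genome is not part of this listing.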
I was assigned chromosome VIII. It is 562643 nucleotides long and contains 297 genes and 11 tRNAs. Below are examples of four genes on this chromosome:
– a gene on the forward strand without introns;
gene 6401..7546
/gene="COS8"
/locus_tag="YHL048W"
/db_xref="GeneID:856337"
mRNA 6401..7546
/gene="COS8"
/locus_tag="YHL048W"
/product="Cos8p"
/transcript_id="NM_001179128.1"
/db_xref="GI:296145324"
/db_xref="GeneID:856337"
CDS 6401..7546
/gene="COS8"
/locus_tag="YHL048W"
/note="Nuclear membrane protein, member of the DUP380
subfamily of conserved, often subtelomerically-encoded
proteins; regulation suggests a potential role in the
unfolded protein response"
/codon_start=1
/product="Cos8p"
/protein_id="NP_011815.1"
/db_xref="GI:6321739"
/db_xref="SGD:S000001040"
/db_xref="GeneID:856337"
/translation="MKENEVKDEKSVDVLSFKQLEFQKTVLPQDVFRNELTWFCYEIY
KSLAFRIWMLLWLPLSVWWKLSSNWIHPLIVSLLVLFLGPFFVLVICGLSRKRSLSKQ
LIQFCKEITEDTPSSDPHDWEVVAANLNSYFYENKTWNTKYFFFNAMSCQKAFKTTLL
EPFSLKKDESAKVKSFKDSVPYIEEALQVYAAGFDKEWKLFNTEKEESPFDLEDIQLP
KEAYRFKLTWILKRIFNLRCLPLFLYYFLIVYTSGNADLISRFLFPVVMFFIMTRDFQ
NMRMIVLSVKMEHKMQFLSTIINEQESGANGWDEIAKKMNRYLFEKKVWNNEEFFYDG
LDCEWFFRRFFYRLLSLKKPMWFASLNVELWPYIKEAQSARNEKPLK"
– a gene on the reverse strand without introns;
gene complement(3726..4541)
/locus_tag="YHL049C"
/db_xref="GeneID:856336"
mRNA complement(3726..4541)
/locus_tag="YHL049C"
/product="hypothetical protein"
/transcript_id="NM_001179129.1"
/db_xref="GI:296145322"
/db_xref="GeneID:856336"
CDS complement(3726..4541)
/locus_tag="YHL049C"
/codon_start=1
/product="hypothetical protein"
/protein_id="NP_011814.1"
/db_xref="GI:6321738"
/db_xref="SGD:S000001041"
/db_xref="GeneID:856336"
/translation="MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVR
SFYEDEKSGLIKVVKFRTGAMDRKRSFEKIVVSVMVGKNVQKFLTFVEDEPDFQGGPI
PSKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIA
SARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPHMFLLLHVDELSIFSAYQ
ASLPGEKKVDTERLKRDLCPRKPTEIKYFSQICNDMMNKKDRLGDVLHVCCPS"
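– a gene on the forward strand with at least one intron (note that the only break in this CDS skips a single nucleotide, 86995, and the record is annotated /ribosomal_slippage, so it reflects a translational frameshift rather than a spliced intron);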
gene 85909..91318
/locus_tag="YHL009W-B"
/db_xref="GeneID:856380"
mRNA join(<85909..86994,86996..>91318)
/locus_tag="YHL009W-B"
/product="gag-pol fusion protein"
/transcript_id="NM_001184404.1"
/db_xref="GI:296145363"
/db_xref="GeneID:856380"
CDS join(85909..86994,86996..91318)
/locus_tag="YHL009W-B"
/EC_number="2.7.7.49"
/EC_number="2.7.7.7"
/EC_number="3.4.23.-"
/EC_number="3.1.26.4"
/ribosomal_slippage
/note="Retrotransposon TYA Gag and TYB Pol genes;
transcribed/translated as one unit; polyprotein is
processed to make a nucleocapsid-like protein (Gag),
reverse transcriptase (RT), protease (PR), and integrase
(IN); similar to retroviral genes"
/codon_start=1
/product="gag-pol fusion protein"
/protein_id="NP_058133.1"
/db_xref="GI:7839180"
/db_xref="SGD:S000007372"
/db_xref="GeneID:856380"
/translation="MATPVRDETRNVIDDNISARIQSKVKTNDTVRQTPSSLRKVSIK
DEQVKQYQRNLNRFKTILNGLKAEEEKLSETDDIQMLAEKLLKLGETIDKVENRIVDL
VEKIQLLETNENNNILHEHIDATGTYYLFDTLTSTNKRFYPKDCVFDYRTNNVENIPI
LLNNFKKFIKKYQFDDVFENDIIEIDPRENEILCKIIKEGLGESLDIMNTNTTDIFRI
IDGLKNKYRSLHGRDVRIRAWEKVLVDTTCRNSALLMNKLQKLVLMEKWIFSKCCQDC
PNLKDYLQEAIMGTLHESLRNSVKQRLYNIPHNVGINHEEFLINTVIETVIDLSPIAD
DQIENSCMYCKSVFHCSINCKKKPNRELGLTRPISQKPIIYKVHRDNNNLSPVQNEQK
SWNKTQKKSNKVYNSKKLVIIDTGSGVNITNDKTLLHNYEDSNRSTRFFGIGKNSSVS
VKGYGYIKIKNGHNNTDNKCLLTYYVPEEESTIISCYDLAKKTKMVLSRKYTRLGNKI
IKIKTKIVNGVIHVKMNELIERPSDDSKINAIKPTSSPGFKLNKRSITLEDAHKRMGH
TGIQQIENSIKHNHYEESLDLIKEPNEFWCQTCKISKATKRNHYTGSMNNHSTDHEPG
SSWCMDIFGPVSSSNADTKRYMLIMVDNNTRYCMTSTHFNKNAETILAQIRKNIQYVE
TQFDRKVREINSDRGTEFTNDQIEEYFISKGIHHILTSTQDHAANGRAERYIRTIVTD
ATTLLRQSNLRVKFWEYAVTSATNIRNCLEHKSTGKLPLKAISRQPVTVRLMSFLPFG
EKGIIWNHNHKKLKPSGLPSIILCKDPNSYGYKFFIPSKNKIVTSDNYTIPNYTMDGR
VRNTQNIYKSHQFSSHNDNEEDQIETVTNLCEALENYEDDNKPITRLEDLFTEEELSQ
IDSNAKYPSPSNNLEGDLDYVFSDVEESGDYDVESELSTTNTSISTDKNKILSNKDFN
SELASTEISISEIDKKGLINTSHIDEDKYDEKVHRIPSIIQEKLVGSKNTIKINDENR
ISDRIRSKNIGSILNTGLSRCVDITDESITNKDESMHNAKPELIQEQFNKTNHETSFP
KEGSIGTNVKFRNTDNEISLKTGDTSLPIKTLESINNHHSNDYSTNKVEKFEKENHHP
PPIEDIVDMSDQTDMESNCQDGNNLKELKVTDKNVPTDNGTNVSPRLEQNIEASGSPV
QTVNKSAFLNKEFSSLNMKRKRKRHDKNNSLTSYELERDKKRSKRNRVKLIPDNMETV
SAQKIRAIYYNEAISKNPDLKEKHEYKQAYHKELQNLKDMKVFDVDVKYSRSEIPDNL
IVPTNTIFTKKRNGIYKARIVCRGDTQSPDTYSVITTESLNHNHIKIFLMIANNRNMF
MKTLDINHAFLYAKLEEEIYIPHPHDRRCVVKLNKALYGLKQSPKEWNDHLRQYLNGI
GLKDNSYTPGLYQTEDKNLMIAVYVDDCVIAASNEQRLDEFINKLKSNFELKITGTLI
DDVLDTDILGMDLVYNKRLGTIDLTLKSFINRMDKKYNEELKKIRKSSIPHMSTYKID
PKKDVLQMSEEEFRQGVLKLQQLLGELNYVRHKCRYDINFAVKKVARLVNYPHERVFY
MIYKIIQYLVRYKDIGIHYDRDCNKDKKVIAITDASVGSEYDAQSRIGVILWYGMNIF
NVYSNKSTNRCVSSTEAELHAIYEGYADSETLKVTLKELGEGDNNDIVMITDSKPAIQ
GLNRSYQQPKEKFTWIKTEIIKEKIKEKSIKLLKITGKGNIADLLTKPVSASDFKRFI
QVLKNKITSQDILASTDY"
– a gene on the reverse strand with at least one intron.
gene complement(445..3311)
/locus_tag="YHL050C"
/db_xref="GeneID:856335"
mRNA complement(join(445..1897,2671..3311))
/locus_tag="YHL050C"
/product="hypothetical protein"
/transcript_id="NM_001179130.1"
/db_xref="GI:296145321"
/db_xref="GeneID:856335"
CDS complement(join(445..1897,2671..3311))
/locus_tag="YHL050C"
/note="hypothetical protein, potential Cdc28p substrate"
/codon_start=1
/product="hypothetical protein"
/protein_id="NP_011813.1"
/db_xref="GI:6321737"
/db_xref="SGD:S000001042"
/db_xref="GeneID:856335"
/translation="MADTPSVAVQAPPGYGKTELFHLPLIALASKGDVEYVSFLFVPY
TVLLANCMIRLGRCGCLNVAPVRNFIEEGCDGVTDLYVGIYDDLASTNFTDRIAAWEN
IVECTFRTNNVKLGYLIVDELHNFETEVYRQSQFGGITNLDFDAFEKAIFLSGTAPEA
VADAALQRIGLTGLAKKSMDINELKRSEDLSRGLSSYPTRMFNLIKEKSKVPLGTNAT
TTASTNVRTSATTTASINVRTSATTTASINVRTSATTTESTNSNTNATTTESTNSSTN
ATTTASTNSSTNATTTESTNASAKEDANKDGNAEDNRFHPVTDINKEPYKRKGSQMVL
LERKKLKAQFPNTSENMNVLQFLGFRSDEIKHLFLYGIDIYFCPEGVFTQYGLCKGCQ
KMFELCVCWAGQKVSYRRMAWEALAVERMLRNDEEYKEYLEDIEPYHGDPVGYLKFFS
VKRREIYSQIQRNYAWYLAITRRRETISVLDSTRGKQGSQVFRMSGRQIKELYYKVWS
NLRESKTEVLQYFLNWDEKKCQEEWEAKDDTVFVEALEKVGVFQRLRSMTSAGLQGPQ
YVKLQFSRHHRQLRSRYELSLGMHLRDQLALGVTPSKVPHWTAFLSMLIGLFYNKTFR
QKLEYLLEQISEVWLLPHWLDLANVEVLAADNTRVPLYMLMVAVHKELDSDDVPDGRF
DIILLCRDSSREVGE"
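The strand and the exon structure of each of the four examples above can be read directly from the CDS location string: complement(...) marks the reverse strand, and join(...) lists several segments. The sketch below is an illustration only; parse_location, gaps and CDS_LOCATIONS are hypothetical names, and the parser handles just the location forms that occur in these four records.

```python
import re


def parse_location(location):
    """Return (strand, segments) for a simple feature location string."""
    strand = "-" if location.startswith("complement(") else "+"
    # Each segment is start..end; '<' and '>' (partial ends) are ignored.
    segments = [
        (int(start), int(end))
        for start, end in re.findall(r"<?(\d+)\.\.>?(\d+)", location)
    ]
    return strand, segments


def gaps(segments):
    """Number of nucleotides skipped between consecutive segments."""
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(segments, segments[1:])]


# CDS locations copied from the four records above.
CDS_LOCATIONS = {
    "YHL048W": "6401..7546",
    "YHL049C": "complement(3726..4541)",
    "YHL009W-B": "join(85909..86994,86996..91318)",
    "YHL050C": "complement(join(445..1897,2671..3311))",
}

for tag, loc in CDS_LOCATIONS.items():
    strand, segs = parse_location(loc)
    print(tag, strand, len(segs), gaps(segs))
```

For YHL050C the gap between the two segments is 773 nucleotides, whereas for YHL009W-B it is a single nucleotide, which agrees with the /ribosomal_slippage qualifier in that record.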
Using my protein's identifier, I found in EMBL the sequence that encodes this protein in the genome of the archaeon Pyrococcus furiosus.
The record for my protein is:
FT   gene            complement(1008047..1008421)
FT                   /locus_tag="PFC_05620"
FT   CDS             complement(1008047..1008421)
FT                   /codon_start=1
FT                   /transl_table=11
FT                   /locus_tag="PFC_05620"
FT                   /product="superoxide reductase"
FT                   /note="COG2033 Desulfoferrodoxin"
FT                   /db_xref="EnsemblGenomes-Gn:PFC_05620"
FT                   /db_xref="EnsemblGenomes-Tr:AFN04065"
FT                   /db_xref="GOA:I6UZH2"
FT                   /db_xref="InterPro:IPR002742"
FT                   /db_xref="UniProtKB/TrEMBL:I6UZH2"
FT                   /protein_id="AFN04065.1"
FT                   /translation="MISETIRSGDWKGEKHVPVIEYEREGELVKVKVQVGKEIPHPNTT
FT                   EHHIRYIELYFLPEGENFVYQVGRVEFTAHGESVNGPNTSDVYTEPIAYFVLKTKKKGK
FT                   LYALSYCNIHGLWENEVTLE"
The record shows that the coding sequence lies on the reverse strand, from nucleotide 1008047 to nucleotide 1008421. I ran the EMBOSS seqret command with the -sask option, which prompts for the start, the end, and whether to reverse the sequence, and obtained the gene sequence.
>AFN04065|AFN04065.1 Pyrococcus furiosus COM1 superoxide reductase
atgattagtgaaaccataagaagtggggactggaaaggagaaaagcacgtccccgttata
gagtatgaaagagaaggggagcttgttaaagttaaggtgcaggttggtaaagaaatcccg
catccaaacaccactgagcaccacatcagatacatagagctttatttcttaccagaaggt
gagaactttgtttaccaggttggaagagttgagtttacagctcacggagagtctgtaaac
ggcccaaacacgagtgatgtgtacacagaacccatagcttactttgtgctcaagactaag
aagaagggcaagctctatgctcttagctactgtaacatccacggcctttgggaaaacgaa
gtcactttagagtga

Last updated: 16.02.2015
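One way to check that the extracted gene really corresponds to the annotated CDS is to translate it and compare the result with the /translation qualifier of the EMBL record. The sketch below is not how seqret works; it is a plain-Python check in which translate, GENE and PROTEIN are hypothetical names. It uses the standard codon assignments, on the assumption that /transl_table=11 assigns the same amino acids to all 64 codons as the standard code and differs only in its permitted start codons.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a coding sequence from its first base up to the first stop."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon: end of the protein
            break
        protein.append(aa)
    return "".join(protein)


# Gene sequence as returned by seqret (already reverse-complemented).
GENE = (
    "atgattagtgaaaccataagaagtggggactggaaaggagaaaagcacgtccccgttata"
    "gagtatgaaagagaaggggagcttgttaaagttaaggtgcaggttggtaaagaaatcccg"
    "catccaaacaccactgagcaccacatcagatacatagagctttatttcttaccagaaggt"
    "gagaactttgtttaccaggttggaagagttgagtttacagctcacggagagtctgtaaac"
    "ggcccaaacacgagtgatgtgtacacagaacccatagcttactttgtgctcaagactaag"
    "aagaagggcaagctctatgctcttagctactgtaacatccacggcctttgggaaaacgaa"
    "gtcactttagagtga"
)

# /translation qualifier from the EMBL record.
PROTEIN = (
    "MISETIRSGDWKGEKHVPVIEYEREGELVKVKVQVGKEIPHPNTT"
    "EHHIRYIELYFLPEGENFVYQVGRVEFTAHGESVNGPNTSDVYTEPIAYFVLKTKKKGK"
    "LYALSYCNIHGLWENEVTLE"
)

# The CDS spans 1008047..1008421 inclusive, so it should be 375 nt long.
print(len(GENE) == 1008421 - 1008047 + 1)
print(translate(GENE) == PROTEIN)
```

Both checks print True: the sequence is 375 nucleotides long (124 codons plus the stop codon tga), and its translation matches the 124-residue protein in the record.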