Nucleotide databases

Quality of a genome assembly

For this task I chose Xanthomonas oryzae. This species comprises two pathovars and causes serious diseases of rice, other grasses and sedges. Xanthomonas oryzae pathovar oryzicola causes bacterial streak. This disease is common in tropical areas and can cause crop losses of up to 32%. Xanthomonas oryzae pathovar oryzae causes bacterial leaf blight, which is one of the most serious diseases of rice. This disease is common in temperate and tropical areas and can cause significant crop loss.

Number of assemblies: with "Complete Genome" level: 52; with "Scaffold" level: 243; with "Scaffold" level: 111
Selected assembly (ID in Assembly DB): GCA_001927875.1
Total length: 4.26725 Mb
Number of contigs and scaffolds: contigs: 412; scaffolds: 412
N50: 21678
L50: 63
Number of annotated proteins: 4031
Publication link
Contig link
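
The N50 and L50 values in the table above follow from the sorted contig lengths: N50 is the length of the contig at which the running total, taken from the longest contig down, first reaches half of the assembly length, and L50 is the number of contigs needed to get there. Below is a minimal Python sketch of that calculation. The function name n50_l50 and the example lengths are illustrative assumptions; they are not taken from GCA_001927875.1.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    # Walk from the longest contig down, accumulating length
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # 'length' is N50, 'count' is L50
            return length, count


# Hypothetical contig lengths, NOT from the selected assembly
example = [80, 70, 50, 40, 30, 20]
n50, l50 = n50_l50(example)
print(n50, l50)  # 70 2
```

For the example list the total is 290, so half is 145; the two longest contigs (80 + 70 = 150) already cover it, which gives N50 = 70 and L50 = 2.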

Feature keys

Key Example Description
polyA_site Oryza sativa oryzacystatin mRNA, complete cds
     polyA_site      837
                     /note="31 A nucleotides"
The polyA site is the position on a transcript at which the poly(A) tail is added. The poly(A) tail plays a key role in mRNA transport, translation and stability. In eukaryotes the tail shortens over the lifetime of the mRNA, and mRNAs with shortened tails are degraded by nucleases.
sig_peptide Rice mRNA for branching enzyme-4,complete cds
    sig_peptide
            129..287.
A signal peptide is a part of a protein that directs it to the appropriate organelle.
protein_bind Oryza sativa Japonica Group bio-material IRIS:GID:2254722 Os08g0535200 gene, promoter region
   protein_bind    204..229
                     /locus_tag="Os08g0535200"
                     /bound_moiety="PthXo1"
This key marks a protein-binding site in DNA; in this example the bound moiety is PthXo1.
3'UTR Oryza sativa (japonica cultivar-group) OsCDPK protein mRNA, complete cds
 	3'UTR           1575..>1907
The 3'UTR is the region of a mature transcript after the coding sequence that is not translated. It usually has a regulatory function.
intron Oryza sativa Japonica Group DNA for glutaredoxin, complete cds
  intron          1583..2354
                     /number=1
An intron is a part of a gene that is transcribed but removed from the RNA during splicing.
variation Oryza sativa Japonica Group Taichung 65 pho1 pseudogene for plastidial starch phosphorylase 1, isolate: BMF136
   variation       2626
                     /gene="pho1"
                     /pseudo
                     /compare=AB441692.1
                     /allele="pho1-1"
                     /replace="g"
The variation feature key marks a naturally occurring polymorphism or mutation.
stem_loop Oryza punctata b/f complex cytochrome f (petA) gene, partial cds; and PsbJ (psbJ) and PSII L protein (psbL) genes, complete cds; chloroplast
 stem_loop       274..289
A stem-loop is a region formed by base pairing within the same DNA or RNA strand. It is usually involved in the regulation of transcription or translation.
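
The locations in the examples above use the feature table notation: a single position (837), a range (204..229), or a range with a partial end marked by > (1575..>1907). As an illustration, here is a small Python sketch that parses these simple forms and computes the feature length. The function names parse_location and feature_length are assumptions made for this example, and operators such as complement() and join() are deliberately not handled.

```python
import re

# Matches "837", "204..229", "1575..>1907" and "<1..200"
_LOCATION = re.compile(r"^(<?)(\d+)(?:\.\.(>?)(\d+))?$")


def parse_location(text):
    """Parse a simple location into (start, end, partial_start, partial_end)."""
    match = _LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported location: {text!r}")
    lt, start, gt, end = match.groups()
    start = int(start)
    # A single position is treated as a range of length 1
    end = int(end) if end is not None else start
    return start, end, lt == "<", gt == ">"


def feature_length(text):
    """Length in bases of the annotated part of the feature."""
    start, end, _, _ = parse_location(text)
    return end - start + 1


print(parse_location("1575..>1907"))  # (1575, 1907, False, True)
print(feature_length("204..229"))     # 26
print(feature_length("1583..2354"))   # 772
```

The partial flags only record that the feature extends beyond the sequenced region; the computed length covers the annotated part alone.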

150 Tomato Genome ReSequencing project

The 150 Tomato Genome ReSequencing project was founded by Wageningen UR, Netherlands, on 4 June 2012. Its main aim is to reveal and explore the genetic variation available in tomato. Tomato was selected as the target crop because it is one of the most economically important species and one of the most important vegetables globally. Genome project itself. In 2012 the project released a public browser for working with pre-publication data, and in 2014 it sequenced 84 genomes of different accessions (last publication). The current status of the project is not stated.

Lentinula edodes (shiitake) mitochondrial genome

Lentinula edodes (shiitake) is an edible mushroom native to East Asia, which is cultivated and consumed in many Asian countries.

Edible mushroom

Lentinula edodes in natural habitat

The search was performed in the ENA database. Search query: tax_tree(5353) AND mol_type="genomic DNA" AND topology="CIRCULAR" AND organelle="mitochondrion". Only one result was found in "Release". AC: AB697988. Here is the Excel file with the table of mitochondrial proteins, and here are the additional Python scripts.
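
One way to build a protein table from such a record is to read the feature table of the EMBL flat file and collect the /product qualifier of every CDS feature. The sketch below shows the idea in Python. It is not the script linked above: the function name cds_products is an assumption, and the SAMPLE record is made up for illustration and is not the content of AB697988.

```python
def cds_products(embl_text):
    """Collect /product values of CDS features from EMBL flat-file text."""
    products = []
    in_cds = False
    current = None  # text of the /product value being read, if any

    def flush():
        nonlocal current
        if current is not None:
            products.append(current.strip('"'))
            current = None

    for line in embl_text.splitlines():
        if not line.startswith("FT"):
            continue  # only feature table lines are of interest
        body = line[5:]
        if body and not body[0].isspace():
            # A feature key in the key column starts a new feature
            flush()
            in_cds = body.split()[0] == "CDS"
            continue
        text = body.strip()
        if text.startswith("/"):
            # A new qualifier ends any value that was being read
            flush()
            if in_cds and text.startswith("/product="):
                current = text[len("/product="):]
        elif current is not None:
            current += " " + text  # wrapped qualifier value


    flush()
    return products


# Made-up record for illustration only
SAMPLE = """\
ID   XX000001; SV 1; circular; genomic DNA; STD; FUN; 600 BP.
FT   source          1..600
FT                   /organelle="mitochondrion"
FT   CDS             10..300
FT                   /gene="cox1"
FT                   /product="cytochrome c oxidase
FT                   subunit 1"
FT   tRNA            310..380
FT                   /product="tRNA-Met"
FT   CDS             400..590
FT                   /product="ATP synthase subunit 6"
//
"""

for name in cds_products(SAMPLE):
    print(name)
```

The tRNA product is skipped because only CDS features are collected, and the wrapped /product value is joined back into one line.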


© Gumerov Ruslan, 2017