de novo assembly

Command table

Command: cat *.fa >> adapters.fa
Function: Create a single file containing all adapter sequences
Result: adapters.fa

Command: java -jar /nfs/srv/databases/ngs/suvorova/trimmomatic/trimmomatic-0.30.jar SE -phred33 SRR4240380.fastq SR_ad.fastq ILLUMINACLIP:adapters.fa:2:7:7
Function: Remove the adapters
Result: Input Reads: 5217318 Surviving: 5119139 (98.12%) Dropped: 98179 (1.88%)

Command: java -jar /nfs/srv/databases/ngs/suvorova/trimmomatic/trimmomatic-0.30.jar SE -phred33 SR_ad.fastq SR_ad_trimmed.fastq TRAILING:20 MINLEN:30
Function: Trim low-quality read ends and drop reads shorter than 30
Result: Input Reads: 5119139 Surviving: 4879707 (95.32%) Dropped: 239432 (4.68%)

Command: velveth SR 29 -short -fastq SR_ad_trimmed.fastq
Function: Split the reads into k-mers (k = 29)
Result: Directory SR with output files

Command: velvetg SR
Function: Assemble contigs from the k-mers using a de Bruijn graph
Result: Output files with statistics, contigs, etc.
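The k-mer and de Bruijn graph steps can be illustrated with a minimal Python sketch. This is not Velvet's implementation: the function names `kmers` and `de_bruijn_edges`, the toy reads and k = 4 are assumptions chosen for readability (the run above used k = 29), and the graph uses the textbook convention of (k-1)-mer nodes joined by k-mer edges.

```python
from collections import defaultdict


def kmers(read, k):
    """Return all overlapping substrings of length k, in read order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return graph


if __name__ == "__main__":
    # Made-up reads for illustration only; they are not from SRR4240380.
    toy_reads = ["ATGGCGT", "GGCGTGC"]
    graph = de_bruijn_edges(toy_reads, k=4)
    for node in sorted(graph):
        print(node, "->", sorted(graph[node]))
```

A read of length L yields L - k + 1 k-mers, so with k = 29 a read at the MINLEN threshold of 30 still contributes two k-mers, while a read shorter than k would contribute none.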

The N50 length was 18128. The three longest contigs were 57469 (coverage 35.820582), 43960 (coverage 36.274227) and 33034 (coverage 35.383968). The median coverage was about 8 reads. Nodes with anomalously high coverage: ID 721 (coverage 951542) and ID 271 (length 1, coverage 5403). Nodes with anomalously low coverage: ID 873 (length 1, coverage 1) and ID 869 (length 4, coverage 1).
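For reference, N50 is the length L such that contigs of length L or longer together hold at least half of the total assembled length. A minimal sketch of that definition follows; the function name `n50` and the toy lengths are assumptions for illustration and are not taken from this assembly.

```python
def n50(lengths):
    """Return the length L such that contigs of length >= L
    cover at least half of the total assembled length."""
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Illustrative values only, not the real contig lengths.
    toy_lengths = [80, 70, 50, 40, 30, 20]
    print(n50(toy_lengths))
```

In practice the lengths would come from the assembler output rather than a hand-written list; how they are read in depends on the file format and is left out here.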

megablast analysis

The three longest contigs were analysed with the megablast tool. The results are presented below.

The 57469 bp contig aligned to region 501913-555905 of the Buchnera aphidicola chromosome.

The 43960 bp contig aligned to several different parts of the genome. This may be the result of an error in de Bruijn graph construction.

The 33034 bp contig aligned to region 451729-480660 of the Buchnera aphidicola chromosome.
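As a quick sanity check, the size of each aligned chromosome region can be compared with the contig length. The sketch below assumes the coordinates are inclusive 1-based positions, as in standard BLAST reports; the helper name `span_length` is hypothetical, and the ratio is only a rough indicator, not the query coverage reported by megablast.

```python
def span_length(start, end):
    """Length of an inclusive 1-based coordinate range."""
    return abs(end - start) + 1


# (contig length in bp, region start, region end), taken from the results above.
hits = [
    (57469, 501913, 555905),
    (33034, 451729, 480660),
]

for contig_len, start, end in hits:
    span = span_length(start, end)
    # Ratio of the aligned chromosome span to the contig length.
    print(contig_len, span, f"{span / contig_len:.1%}")
```

Both aligned regions are shorter than their contigs (53993 and 28932 positions respectively), so part of each contig lies outside the reported region.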


© Gumerov Ruslan, 2017