Genome assembly using Flye
Flye is another assembler for long reads such as PacBio and Oxford Nanopore. In contrast to the minimap and miniasm pipeline, Flye also produces a polished consensus sequence for the assembly, which significantly reduces the error rate (more about consensus sequences and polishing in the next practicals).
Change into the Flye directory in the assembler_practical folder and run Flye on the raw basecalled reads:
flye --nano-raw ~/course_data/precompiled/all_guppy.fastq --genome-size 1m --out-dir ./flye_output
As you can see, Flye requires the input reads (--nano-raw), an output directory (--out-dir), and the expected size of the final assembly (--genome-size), which in this case is set to 1 megabase (1,000,000 bases). Flye writes several files to the output directory, including the assembly in FASTA format (flye_output/assembly.fasta).
When Flye has finished, use assembly-stats to get a first overview of the assembly.
Now align the Flye assembly to the reference chromosome using dnadiff:
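The statistics step can be sketched as follows. This is a minimal example that assumes the assembly was written to flye_output/assembly.fasta, as in the flye command above. The fasta_summary function is a hypothetical helper, not part of Flye or assembly-stats; it is only there to cross-check the number of contigs, the total length, and the N50 with standard tools.

```shell
# Summarise a FASTA file: number of sequences, total length, and N50.
# fasta_summary is a hypothetical helper, not part of Flye or assembly-stats.
fasta_summary() {
    awk '
        /^>/ { if (len > 0) print len; len = 0; next }   # header: emit previous length
        { gsub(/[ \t\r]/, ""); len += length($0) }       # sequence line: accumulate
        END { if (len > 0) print len }
    ' "$1" | sort -rn | awk '
        { lens[NR] = $1; total += $1 }
        END {
            running = 0; n50 = 0
            # N50: length of the contig at which the running sum,
            # taken from longest to shortest, reaches half the total.
            for (i = 1; i <= NR; i++) {
                running += lens[i]
                if (running * 2 >= total) { n50 = lens[i]; break }
            }
            printf "sequences=%d total_length=%d N50=%d\n", NR, total, n50
        }
    '
}

# Tool from the practical; it prints its own summary table.
if command -v assembly-stats >/dev/null 2>&1 && [ -f flye_output/assembly.fasta ]; then
    assembly-stats flye_output/assembly.fasta
fi

# Cross-check with the helper above.
if [ -f flye_output/assembly.fasta ]; then
    fasta_summary flye_output/assembly.fasta
fi
```

A total length close to the 1 megabase given to --genome-size, spread over few contigs, is a first sign that the assembly is in reasonable shape.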
dnadiff -p flye_dnadiff ~/course_data/precompiled/chr17.fasta flye_output/assembly.fasta
Open the flye_dnadiff.report file (e.g. double-click on the file).
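If you prefer the terminal, the headline numbers can also be pulled out of the report with grep. This is a minimal sketch that assumes the report was written as flye_dnadiff.report in the current directory. The report_summary function is a hypothetical helper; the row labels it searches for are the ones dnadiff uses in its report, where the first value column refers to the reference and the second to the query (here, the Flye assembly).

```shell
# Print the headline rows of a dnadiff report.
# report_summary is a hypothetical helper, not part of MUMmer.
report_summary() {
    grep -E '^(TotalBases|AlignedBases|UnalignedBases|AvgIdentity|TotalSNPs|TotalIndels)' "$1"
}

if [ -f flye_dnadiff.report ]; then
    report_summary flye_dnadiff.report
fi
```

AlignedBases shows how much of the reference and of the assembly could be aligned, while AvgIdentity, TotalSNPs and TotalIndels give a first impression of the remaining error rate.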
Now upload the flye_dnadiff.delta to Assemblytics and inspect the dot plots.