Error Correction using Minipolish
Like Racon, Minipolish is written specifically to improve minimap2/miniasm assemblies. In fact, Minipolish uses Racon to polish miniasm assemblies, but in contrast to Racon it writes its output in miniasm’s GFA format instead of FASTA. Additionally, Minipolish was written with a focus on circular replicons, e.g., bacterial genomes, plasmids and plastids, which it tries to circularize cleanly where possible.
Additional features/advantages:
- Includes read coverage in the output
- Rotates circular contigs/unitigs
- Reduces/avoids introduction of gaps into the assembly
Change into the directory minipolish in practicals/error_corection_practicals. Use the provided NanoFilt-trimmed reads and the miniasm assembly to run minipolish:
minipolish -t 2 nanofilt_result.fastq miniasm.gfa > minipolished_assembly.gfa
The above command will run minipolish with 2 CPUs (-t 2) and re-direct (>) the output into the file minipolished_assembly.gfa.
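To get a quick overview of what minipolish wrote, the segment lines of the output GFA can be summarised. The following is a minimal sketch, assuming the usual GFA layout (tab-separated S lines with the name in column 2 and the sequence in column 3). The helper name gfa_summary is made up for illustration, and which optional tags appear from column 4 onward depends on the tool and version.

```shell
# Hypothetical helper: for every segment (S) line of a GFA file, print
# the segment name, the sequence length and any optional tags (columns 4+).
gfa_summary() {
    awk -F'\t' '$1 == "S" {
        tags = ""
        for (i = 4; i <= NF; i++) tags = tags (i > 4 ? " " : "") $i
        print $2 "\t" length($3) "\t" tags
    }' "$1"
}

# Summarise the polished assembly if it is present in the current directory.
if [ -f minipolished_assembly.gfa ]; then
    gfa_summary minipolished_assembly.gfa
fi
```

Running the same helper on miniasm.gfa gives a side-by-side view of how segment lengths and tags differ before and after polishing.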
As mentioned, one of the features of minipolish is that it writes the result into a GFA file, which can be visualised with tools such as Bandage. However, for the moment we will “undo” this advantage and convert the GFA to FASTA to compare Minipolish’s performance to that of a pure Racon run.
Use awk to convert the polished GFA to FASTA:
awk '/^S/{print ">"$2"\n"$3}' minipolished_assembly.gfa > minipolished_assembly.fasta
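To see what the awk one-liner does, it can be tried on a tiny example first. The sketch below wraps the same awk program in a function (gfa_to_fasta is a hypothetical name) and applies it to a made-up GFA with one segment and one link line. Only S lines are turned into FASTA records; all other line types are dropped.

```shell
# Convert GFA segment (S) lines to FASTA: ">name" header, then the sequence.
gfa_to_fasta() {
    awk '/^S/{print ">"$2"\n"$3}' "$1"
}

# Toy input for illustration only: one segment line and one link line.
tmpdir=$(mktemp -d)
printf 'S\tutg000001c\tACGTTGCA\tLN:i:8\nL\tutg000001c\t+\tutg000001c\t+\t0M\n' > "$tmpdir/toy.gfa"

gfa_to_fasta "$tmpdir/toy.gfa" > "$tmpdir/toy.fasta"
cat "$tmpdir/toy.fasta"

# Sanity check: the number of S lines should match the number of FASTA headers.
echo "segments: $(grep -c '^S' "$tmpdir/toy.gfa")  fasta records: $(grep -c '^>' "$tmpdir/toy.fasta")"
```

The same count comparison is a cheap check after converting a real assembly, since a mismatch would point to a malformed GFA or a mistyped awk pattern.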
Use dnadiff and potentially Assemblytics to compare the minipolish results to the raw assembly and Racon results.
- What changed in the minipolish assembly? What got better, and what got worse?
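The comparison can be scripted so that each assembly is evaluated the same way. This is only a sketch: it assumes dnadiff from the MUMmer package is installed, the function name compare_to_reference is hypothetical, and the file names in the commented calls are placeholders that need to be replaced with the files from this practical.

```shell
# Hypothetical wrapper around dnadiff (MUMmer): compare a query assembly
# against a reference sequence and print a few headline numbers from the
# report that dnadiff writes to <prefix>.report.
compare_to_reference() {
    ref=$1; query=$2; prefix=$3
    if ! command -v dnadiff >/dev/null 2>&1; then
        echo "dnadiff not found in PATH" >&2
        return 1
    fi
    dnadiff -p "$prefix" "$ref" "$query" || return 1
    grep -E 'AvgIdentity|TotalSNPs|TotalIndels' "$prefix.report"
}

# Example calls (placeholder file names, uncomment and adapt):
# compare_to_reference reference.fasta miniasm.fasta raw_vs_ref
# compare_to_reference reference.fasta racon.fasta racon_vs_ref
# compare_to_reference reference.fasta minipolished_assembly.fasta minipolish_vs_ref
```

Using a different prefix for each run keeps the reports side by side, which makes it easier to see whether identity went up and the SNP and indel counts went down after polishing.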