Read trimming and filtering using NanoFilt

Trimming parts of a sequence can improve your data. Additionally, it can be beneficial to remove short read especially for whole genome sequencing projects. As usual there are many different tools to approach read trimming/fitlering. As mentioned, a common tool for Illumina data is Trimmomatic. In this tutorial we will use the open-source tool NanoFilt to further trim and filter our MinION reads.

Create directory for your NanoFilt output called nanofilt in the trimming_practical folder, change into it and

mkdir ~/course_data/practicals/trimming_practical/nanofilt

cd ~/course_data/practicals/trimming_practical/nanofilt

NanoFilt –l 500 --headcrop 10 < ../porechop/porechopped.fastq \
 > ./nanofilt_trimmed.fastq


The “\” at the end of each line is only for convenience to write a long command into several lines. It tells the command-line that all lines still belong together although the are separated by “enter” keys. However, if you type all of the command, i.e., paths etc, in one line don’t’ use the backslash at the end of the lines.

NanoFilt does not provide options for input or output files. Therefore we will use the two redirect operators “>” and “<“ to

Use FastQC to check the result file and compare it to the porechopped file and the original guppy fastq. Did NanoFIlt improve the data?

Answer


Options heacrop and –l are just a few of the possible filter options of NanoFilt. It also provides options for filtering based on quality scores or maximum read length. For a complete list of all options type NanoFilt -h