Remove chimeric reads

Gavin Douglas edited this page Feb 17, 2016 · 24 revisions

Our script wraps usearch (v6.1), specifically the uchime algorithm, to remove chimeric reads.

Here is an example command: -type 1 -db /usr/local/db/single_strand/Bacteria_RDP_trainset15_092015.udb fasta_files/*

Here "-type 1" means that both reads clearly called as chimeric and reads that are ambiguous (borderline) are filtered out.
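To make the two output types concrete, here is a minimal Python sketch of the filtering rule. It is not the script's actual code. It assumes uchime's tab-separated output, with the query label in the second field and the call (Y, N or ?) in the last field. The function name `non_chimeric_ids` and the sample rows are hypothetical.

```python
def non_chimeric_ids(uchime_lines, output_type=1):
    """Return the labels of reads kept as non-chimeric.

    output_type 1: keep only reads called "N" (clearly non-chimeric).
    output_type 0: keep reads called "N" or "?" (borderline reads are kept too).
    """
    if output_type not in (0, 1):
        raise ValueError("output_type must be 0 or 1")
    keep_calls = {"N"} if output_type == 1 else {"N", "?"}
    kept = []
    for line in uchime_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        # Assumption: label is the second field, the call is the last field.
        label, call = fields[1], fields[-1]
        if call in keep_calls:
            kept.append(label)
    return kept


# Hypothetical rows, trimmed to score, query label and call for brevity.
example = [
    "0.0000\tread1\tN",
    "1.2500\tread2\tY",
    "0.2100\tread3\t?",
]
print(non_chimeric_ids(example, output_type=1))
print(non_chimeric_ids(example, output_type=0))
```

With type 1 only `read1` survives, while type 0 also keeps the borderline `read3`.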

Note that a reference database file must be provided as well. If you'd like to use the UDB format rather than FASTA, you'll need to convert the database with the "-makeudb_usearch" command of usearch v6.1 (the same usearch version that is used for chimera checking).
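As a rough guide, the conversion command could be assembled as in the sketch below. The binary name `usearch61`, the `-output` flag and the file names are assumptions for illustration, so check them against the help output of your own usearch v6.1 installation before running anything.

```python
import shlex


def makeudb_command(fasta_path, udb_path, usearch_bin="usearch61"):
    """Build the argument list for converting a FASTA database to UDB.

    Assumptions: the usearch binary is called "usearch61" and the output
    file is given with "-output".
    """
    return [usearch_bin, "-makeudb_usearch", fasta_path, "-output", udb_path]


# Hypothetical file names.
cmd = makeudb_command("my_db.fasta", "my_db.udb")
print(" ".join(shlex.quote(part) for part in cmd))
# To actually run the conversion: subprocess.run(cmd, check=True)
```

The resulting UDB file can then be passed to the script with "-db".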

Note that the settings of "mindiv" and "minh" (see the option descriptions below) could in principle have significant effects on results. So far, however, we have found that adjusting these parameters has only a subtle effect on sensitivity and specificity when running chimera checking for 16S sequences.
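To illustrate how the two thresholds interact, the sketch below applies both cut-offs to a few made-up (score, % divergence) pairs. It is a deliberate simplification: it assumes a read is called chimeric only when its score reaches "minh" and its divergence reaches "mindiv", and it ignores uchime's other criteria and the borderline "?" category. The read names and values are hypothetical.

```python
def called_chimeric(score, divergence, minh, mindiv):
    """Simplified rule: both thresholds must be met for a chimeric call."""
    return score >= minh and divergence >= mindiv


# Hypothetical reads: name -> (uchime score, % divergence).
candidates = {
    "readA": (0.25, 2.0),
    "readB": (0.35, 1.0),
    "readC": (0.10, 3.0),
}


def chimeric_reads(minh, mindiv):
    """Names of the hypothetical reads called chimeric at these thresholds."""
    return sorted(
        name
        for name, (score, div) in candidates.items()
        if called_chimeric(score, div, minh, mindiv)
    )


print(chimeric_reads(minh=0.2, mindiv=1.5))   # this script's defaults
print(chimeric_reads(minh=0.28, mindiv=0.8))  # uchime's defaults
```

In this toy data the two default settings flag different reads, which is why it is worth checking how sensitive your own results are to these parameters.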


  • -h, --help
    Displays the entire help documentation.

  • -v, --version
    Displays version number and exits.

  • -type <[0|1]>
    Non-chimeric output type: either only sequences that are clearly non-chimeric (1), or all sequences that are not called as chimeric (0; this includes borderline sequences, marked "?" in the uchime output).

  • -mindiv
    Min % divergence between query and target sequence (default 1.5, note that this differs from the uchime default of 0.8).

  • -minh
    Min score to be called as chimeric (default 0.2, note that this differs from the uchime default of 0.28).

  • -o, --out_dir
    Output directory for filtered FASTA files. Default is "non_chimeras".

  • -thread <# of CPUs>
    Using this option without a value will use all CPUs on the machine, while giving it a value will limit the run to that many CPUs. Without this option, only one CPU is used.

  • -log
    The location to write the log file.

  • -db, --database
    Database of 16S sequences to use as a reference (UDB or FASTA file).
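The three behaviours of the "-thread" option described above can be summarised in a short Python sketch. This is an illustration of the documented rule only, not the script's own implementation, and the function name `resolve_threads` is hypothetical.

```python
import os


def resolve_threads(flag_given, value=None):
    """Return the number of CPUs to use, following the -thread rules.

    No -thread flag        -> 1 CPU
    -thread with no value  -> all CPUs on the machine
    -thread N              -> N CPUs
    """
    if not flag_given:
        return 1
    if value is None:
        # os.cpu_count() can return None, so fall back to a single CPU.
        return os.cpu_count() or 1
    if value < 1:
        raise ValueError("number of CPUs must be at least 1")
    return value


print(resolve_threads(flag_given=False))          # option not used
print(resolve_threads(flag_given=True, value=4))  # -thread 4
print(resolve_threads(flag_given=True))           # -thread with no value
```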
