OTU picking

We use QIIME scripts for OTU picking.

The first step is to create a QIIME mapping file (as described here: http://qiime.org/documentation/file_formats.html).

The bare-bones mapping file can be created with this command:

create_qiime_map.pl non_chimeras/* > map.txt

Then you will need to combine all of the FASTA files into a single file (with sample IDs added to each header line):

add_qiime_labels.py -i non_chimeras/ -m map.txt -c FileInput -o combined_fasta
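Before moving on, it can be worth checking that every sample contributed reads to the combined file. The sketch below assumes that add_qiime_labels.py writes headers of the form ">SampleID_N original_header" (this format is an assumption here, so check a few header lines of your own combined_seqs.fna first). It runs on a small made-up FASTA file, so the sample names and reads are purely illustrative:

```shell
# Build a tiny, made-up FASTA file in the assumed add_qiime_labels.py
# header style (">SampleID_N original_header").
demo=$(mktemp)
printf '>sampleA_0 read1\nACGT\n>sampleA_1 read2\nACGT\n>sampleB_2 read3\nACGT\n' > "$demo"

# Count reads per sample: take the first field of each header line,
# drop the leading ">" and the trailing "_N" counter, then tally.
counts=$(awk '/^>/ { id = substr($1, 2); sub(/_[0-9]+$/, "", id); n[id]++ }
              END  { for (s in n) print s "\t" n[s] }' "$demo" | sort)
echo "$counts"

rm -f "$demo"
```

To run this on real data, replace "$demo" with combined_fasta/combined_seqs.fna. A sample that is missing from the output, or that has far fewer reads than expected, is worth investigating before OTU picking.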

The next script, pick_open_reference_otus.py, takes in parameters from a "parameter file" as well as on the command line. This is because the script is a wrapper for several other QIIME scripts and you can specify the parameters for these other scripts within the parameter file. We usually just specify two options for the pick_otus.py script:

echo "pick_otus:threads 4" >> clustering_params.txt
echo "pick_otus:sortmerna_coverage 0.8" >> clustering_params.txt

These parameters tell pick_otus.py to run on 4 threads and to require that at least 80% of each query sequence aligns to the reference. The coverage setting applies to SortMeRNA, which is selected as the reference picking method in the command below.
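Because the echo commands append with >>, running them a second time leaves duplicate lines in the parameter file. A minimal sketch of one way to avoid this is to empty the file first and then print it to confirm its contents. Each line follows the QIIME parameter file layout of "script_name:option value"; the variable name params is only for illustration:

```shell
# Hypothetical variable name; the file name matches the one used above.
params=clustering_params.txt

# Truncate (or create) the file so that re-running does not duplicate lines.
: > "$params"

echo "pick_otus:threads 4" >> "$params"
echo "pick_otus:sortmerna_coverage 0.8" >> "$params"

# Show the final contents: one "script:option value" pair per line.
cat "$params"
```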

Finally we can run the actual OTU picking step, specifying that we will be using SortMeRNA for reference picking and SUMACLUST for de novo OTU picking (~24 hours):

pick_open_reference_otus.py -i $PWD/combined_fasta/combined_seqs.fna -o $PWD/clustering/ -p $PWD/clustering_params.txt -m sortmerna_sumaclust -s 0.1 -v --min_otu_size 1

After OTU picking it can be difficult to distinguish low-frequency OTUs from noise. We filter out OTUs that are supported by fewer than 0.1% of reads, because Illumina has reported that about 0.1% of reads on a MiSeq are expected to be bleed-through from previous runs. Note that we retained singletons (OTUs supported by a single read) in the previous step by setting --min_otu_size 1, so that the total read count used to calculate the 0.1% threshold is correct.

remove_low_confidence_otus.py -i $PWD/clustering/otu_table_mc1_w_tax_no_pynast_failures.biom -o $PWD/clustering/otu_table_high_conf.biom
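To make the 0.1% cut-off concrete, the sketch below works through the arithmetic for a made-up total read count. The total of 250000 reads is hypothetical, and rounding down with integer division is an assumption; this page does not state how remove_low_confidence_otus.py rounds the threshold.

```shell
# Hypothetical total number of reads in the OTU table. For a real table
# the total is reported by: biom summarize-table -i <table.biom>
total_reads=250000

# 0.1% of the total, rounded down (rounding choice is an assumption).
threshold=$(( total_reads / 1000 ))

echo "0.1% of ${total_reads} reads is ${threshold} reads"
echo "OTUs with fewer than ${threshold} reads would be removed"
```

This also shows why singletons are kept until this point: dropping them earlier would lower the total read count and therefore lower the threshold.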

Contact

Please feel free to post a question on the Microbiome Helper Google group if you have any issues.
General comments or inquiries about Microbiome Helper can be sent to [email protected].
