OTU picking
We use QIIME scripts for OTU picking.
The first step is to create a QIIME mapping file (as described here: http://qiime.org/documentation/file_formats.html).
The bare-bones mapping file can be created with this command:
create_qiime_map.pl non_chimeras/* > map.txt
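Before moving on, it can help to confirm that the mapping file has the layout QIIME expects. The sketch below is only an illustration: it writes a small hypothetical mapping file (demo_map.txt, with made-up sample names) and checks that the header starts with #SampleID and contains the FileInput column that add_qiime_labels.py is pointed at in the next step. The exact columns written by create_qiime_map.pl are an assumption here; run the two checks on your own map.txt to see what it actually contains.

```shell
# Hypothetical example file; replace demo_map.txt with your own map.txt.
map=demo_map.txt
printf '#SampleID\tBarcodeSequence\tLinkerPrimerSequence\tFileInput\n' > "$map"
printf 'sampleA\t\t\tsampleA.fasta\n' >> "$map"
printf 'sampleB\t\t\tsampleB.fasta\n' >> "$map"

# The first header field of a QIIME mapping file must be #SampleID.
first_col=$(head -n 1 "$map" | cut -f 1)

# 1-based position of the FileInput column (empty if the column is missing).
file_col=$(head -n 1 "$map" | tr '\t' '\n' | grep -n -x 'FileInput' | cut -d: -f1)

echo "first column: $first_col"
echo "FileInput is column: ${file_col:-not found}"
```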
Then you will need to combine all of the FASTA files into a single file (with sample IDs added to each header line):
add_qiime_labels.py -i non_chimeras/ -m map.txt -c FileInput -o combined_fasta
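A quick way to check the combined file is to count the reads per sample from the header lines. This sketch assumes that add_qiime_labels.py writes headers of the form >SampleID_N followed by the original header (QIIME's usual labelling). The demo file and sample names are hypothetical; point the fasta variable at combined_fasta/combined_seqs.fna to check real data.

```shell
# Hypothetical stand-in for combined_fasta/combined_seqs.fna.
fasta=demo_combined_seqs.fna
cat > "$fasta" <<'EOF'
>sampleA_0 read1
ACGT
>sampleA_1 read2
ACGA
>sampleB_2 read3
TTGA
EOF

# Total number of sequences (one header line per sequence).
total=$(grep -c '^>' "$fasta")
echo "total sequences: $total"

# Reads per sample: keep the first header field, drop '>' and the trailing _N counter.
per_sample=$(grep '^>' "$fasta" | cut -d ' ' -f 1 | sed -e 's/^>//' -e 's/_[0-9]*$//' | sort | uniq -c)
echo "$per_sample"
```

A sample with far fewer reads than the others at this point is worth investigating before the long clustering step.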
The next script, pick_open_reference_otus.py, takes in parameters from a "parameter file" as well as on the command line. This is because the script is a wrapper for several other QIIME scripts and you can specify the parameters for these other scripts within the parameter file. We usually just specify two options for the pick_otus.py script:
echo "pick_otus:threads 4" >> clustering_params.txt
echo "pick_otus:sortmerna_coverage 0.8" >> clustering_params.txt
These parameters tell pick_otus.py to run on 4 threads and set the minimum percentage of each query sequence that must align to 80% (this coverage setting applies to SortMeRNA, which is selected in the command below).
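Because >> appends, re-running the two echo commands adds duplicate lines to clustering_params.txt. One way to avoid that, sketched here, is to write the file in a single step with a here-document (which overwrites any existing copy) and then print it to confirm the contents. The two options are the same as above.

```shell
# Overwrite (not append to) the parameter file with the two pick_otus options.
cat > clustering_params.txt <<'EOF'
pick_otus:threads 4
pick_otus:sortmerna_coverage 0.8
EOF

# Show the file and count the pick_otus entries as a sanity check.
cat clustering_params.txt
n_params=$(grep -c '^pick_otus:' clustering_params.txt)
echo "pick_otus parameters: $n_params"
```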
Finally, we can run the OTU picking step itself, using SortMeRNA for reference-based picking and SUMACLUST for de novo picking (this takes roughly 24 hours):
pick_open_reference_otus.py -i $PWD/combined_fasta/combined_seqs.fna -o $PWD/clustering/ -p $PWD/clustering_params.txt -m sortmerna_sumaclust -s 0.1 -v --min_otu_size 1
After OTU picking, it can be difficult to distinguish low-frequency OTUs from noise. We filter out OTUs that are supported by fewer than 0.1% of the reads, because Illumina has reported that 0.1% of reads on a MiSeq are expected to be bleed-through from previous runs. Note that we retained singletons (OTUs identified by 1 read) in the previous step so that the 0.1% threshold is calculated from the full read count.
remove_low_confidence_otus.py -i $PWD/clustering/otu_table_mc1_w_tax_no_pynast_failures.biom -o $PWD/clustering/otu_table_high_conf.biom
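To make the 0.1% cutoff concrete, the arithmetic can be sketched as below. The read total is a hypothetical number, and rounding up to a whole read is an assumption; remove_low_confidence_otus.py may derive its total and rounding differently, so treat this only as an illustration of the scale of the filter.

```shell
# Hypothetical total read count; not taken from any real run.
total_reads=50000

# 0.1% is 1/1000; integer ceiling division rounds up to a whole read.
min_reads=$(( (total_reads + 999) / 1000 ))

echo "0.1% of $total_reads reads is about $min_reads reads"
```

This also shows why singletons are kept until this point: dropping them earlier would shrink the total and therefore lower the cutoff.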
- Please feel free to post a question on the Microbiome Helper Google group if you have any issues.
- General comments or inquiries about Microbiome Helper can be sent to [email protected].