Combine files from Hidden State Prediction PICRUSt2‐MPGA database

Robyn Wright edited this page Jan 13, 2025 · 1 revision

This script was added in PICRUSt2 v2.6.0 alongside the new PICRUSt2-MPGA database. You can see full details on the updates made to the PICRUSt2 database here.

This script takes two tables of predicted counts, one for each of two domains, and combines them into a single file that can be used for metagenome prediction.

It can be run for EC numbers and KOs like this:

combine_domains.py --table_dom1 bac_EC_predicted.tsv.gz --table_dom2 arc_EC_predicted.tsv.gz -o combined_EC_predicted.tsv.gz

combine_domains.py --table_dom1 bac_KO_predicted.tsv.gz --table_dom2 arc_KO_predicted.tsv.gz -o combined_KO_predicted.tsv.gz
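If you are combining several function types, the two commands above can be wrapped in a loop. The following is only a sketch: the function name `combine_all` is hypothetical, and it assumes the input files follow the `bac_*_predicted.tsv.gz` / `arc_*_predicted.tsv.gz` naming used in the examples above and sit in the current directory. Function types whose input tables are missing are skipped with a message rather than passed to the script.

```shell
# Hypothetical helper: run combine_domains.py once per function type.
# Assumes the file naming pattern from the examples above.
combine_all() {
    for func in EC KO; do
        bac="bac_${func}_predicted.tsv.gz"
        arc="arc_${func}_predicted.tsv.gz"
        out="combined_${func}_predicted.tsv.gz"

        # Only call the script when both domain tables are present.
        if [ -f "$bac" ] && [ -f "$arc" ]; then
            combine_domains.py --table_dom1 "$bac" --table_dom2 "$arc" -o "$out"
        else
            echo "Skipping ${func}: missing ${bac} or ${arc}" >&2
        fi
    done
}

combine_all
```

Adjust the list `EC KO` and the file name prefixes to match your own outputs.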

The options are:

  • --table_dom1 FUNC_PREDICTED.tsv.gz - Table of predicted functional abundances for all study sequences in the first domain. The order in which the two tables are given does not matter, but by default this would be the table for bacteria.
  • --table_dom2 FUNC_PREDICTED.tsv.gz - As for --table_dom1, but for the second domain (by default this would be the table for archaea).
  • -o FUNC_PREDICTED.tsv.gz - Output file containing the combined predictions for both domains given above.