A Snakemake workflow for the analysis of NGS reads with SKA (Harris, 2018) to identify the (mosquito) species of different samples.

- Make sure to have a tab-separated file that contains your sample name in the first column (`sample`), trimmed forward read paths in the second column (`fq1`) and the trimmed reverse read paths in the third (`fq2`). For an example see the `samples.tsv` file.
- In `config/config.yaml` provide the right path to the genome you want to map your reads to, the tab-separated file with the info on your samples (see above) and the number of threads you want to use.
- Make sure you have snakemake installed. Run the `snakemake` pipeline from the top folder of this repo with: `snakemake`
- The final output is a `ska.distances.tsv` file, which can be used for a cluster analysis to determine the samples' species.

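
Before starting a run it can help to sanity-check the sample sheet. The snippet below is a minimal sketch and not part of the workflow: it assumes the sheet starts with a header row naming the columns `sample`, `fq1` and `fq2`, and the function name `check_sample_sheet` is made up for illustration.

```python
import csv
import io
from pathlib import Path

# Column names as described in the list above; that they appear as a
# header row in the file is an assumption of this sketch.
REQUIRED_COLUMNS = ["sample", "fq1", "fq2"]


def check_sample_sheet(handle, check_files=True):
    """Return a list of problems found in a tab-separated sample sheet.

    Hypothetical helper, not part of the workflow. `handle` is any open
    text file object; set check_files=False to skip the on-disk check.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    if header[:3] != REQUIRED_COLUMNS:
        return [f"expected header starting with {REQUIRED_COLUMNS}, found {header}"]

    problems = []
    seen = set()
    for row in reader:
        line_no = reader.line_num  # line number in the file, header is line 1
        name = (row["sample"] or "").strip()
        if not name:
            problems.append(f"line {line_no}: empty sample name")
        elif name in seen:
            problems.append(f"line {line_no}: duplicate sample name {name!r}")
        else:
            seen.add(name)
        for column in ("fq1", "fq2"):
            # Short rows yield None for missing fields, hence the `or ""`.
            path = (row[column] or "").strip()
            if not path:
                problems.append(f"line {line_no}: missing {column} path")
            elif check_files and not Path(path).is_file():
                problems.append(f"line {line_no}: {column} file not found: {path}")
    return problems


# Small in-memory demonstration with made-up sample names and paths.
example = "sample\tfq1\tfq2\nA\ta_1.fq.gz\ta_2.fq.gz\nA\tb_1.fq.gz\t\n"
problems = check_sample_sheet(io.StringIO(example), check_files=False)
for problem in problems:
    print(problem)
```

To check a real sheet, pass an open file instead, for example `check_sample_sheet(open("samples.tsv", newline=""))`, and leave `check_files` at its default so missing read files are reported as well.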
In `slurm-profile/config.yaml` you can change specific Slurm job settings.
Run the following command from a tmux window: `snakemake --profile slurm-profile/`

Snakemake will handle the submission of jobs and the removal of output from failed jobs.