
LanderDC/SKA-analysis


Snakemake workflow: SKA-analysis


A Snakemake workflow for the analysis of NGS reads with SKA (Harris, 2018) to identify the (mosquito) species of different samples.

Usage

Quick usage

  1. Prepare a tab-separated file with the sample name in the first column (sample), the path to the trimmed forward reads in the second column (fq1) and the path to the trimmed reverse reads in the third column (fq2). See the samples.tsv file for an example.

  2. In config/config.yaml, set the path to the genome you want to map your reads to, the path to the tab-separated sample file described above, and the number of threads you want to use.

  3. Make sure Snakemake is installed, then run the pipeline from the top folder of this repo with:

snakemake
  4. The final output is a ska.distances.tsv file, which can be used for a cluster analysis to determine the samples' species.
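
Before starting a run, the sample sheet from step 1 can be checked with a short script. The following is a minimal sketch, not part of this workflow: it assumes the file starts with a header row naming the columns sample, fq1 and fq2, and the function name `check_sample_sheet` and its `check_files` option are hypothetical.

```python
import csv
from pathlib import Path

# Column names taken from the description in step 1; a header row is an assumption.
EXPECTED_COLUMNS = ["sample", "fq1", "fq2"]


def check_sample_sheet(handle, check_files=False):
    """Return a list of problems found in a tab-separated sample sheet.

    `handle` is any open text file (or file-like object).
    With `check_files=True`, each read path must also exist on disk.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    if header[:3] != EXPECTED_COLUMNS:
        return [f"expected header {EXPECTED_COLUMNS}, found {header}"]

    problems = []
    seen = set()
    for row in reader:
        line_no = reader.line_num  # line number in the input file
        name = (row["sample"] or "").strip()
        if not name:
            problems.append(f"line {line_no}: empty sample name")
        elif name in seen:
            problems.append(f"line {line_no}: duplicate sample '{name}'")
        seen.add(name)

        for col in ("fq1", "fq2"):
            # Short rows yield None for missing columns, hence the `or ""`.
            path = (row[col] or "").strip()
            if not path:
                problems.append(f"line {line_no}: missing {col}")
            elif check_files and not Path(path).is_file():
                problems.append(f"line {line_no}: {col} not found: {path}")
    return problems
```

To use it, open the sample sheet with `open(path, newline="")`, pass the handle to `check_sample_sheet`, and print any returned messages; an empty list means no problems were found.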

Advanced Usage (SLURM)

In slurm-profile/config.yaml you can change specific SLURM job settings.

Run the following command from a tmux window:

snakemake --profile slurm-profile/

Snakemake handles job submission and removes the output of failed jobs.

About

Perform Split K-mer Analysis to identify sample species (based on k-mers in NGS reads).
