SnakeMake-Pairtools-Phased

Workflow to analyze Hi-C data in allelic mode

Requirements

Snakemake 8.x or newer

All tools are installed through a conda environment defined in env.yml, so run Snakemake with the --use-conda flag.

Inputs

Paired-end FASTQ files named with the suffixes _1.fastq.gz and _2.fastq.gz
Genome FASTA compressed with bgzip and indexed (samtools faidx genome.fa.gz)
VCF with parental SNPs, with each parent as a sample; the file must be compressed with bgzip and indexed with tabix
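
The input conventions above can be checked before launching the pipeline. The following is a minimal sketch, not part of the workflow itself: the function names find_fastq_pairs and missing_index_files are hypothetical, and the index file names assume the defaults written by samtools faidx on a bgzip-compressed FASTA (.fai and .gzi) and by tabix (.tbi).

```python
from pathlib import Path

R1_SUFFIX = "_1.fastq.gz"
R2_SUFFIX = "_2.fastq.gz"


def find_fastq_pairs(fastq_dir):
    """Group FASTQ files by sample name using the _1/_2 suffix convention.

    Returns (pairs, unpaired): pairs maps sample name to (R1 path, R2 path),
    unpaired is a sorted list of sample names with only one mate present.
    """
    fastq_dir = Path(fastq_dir)
    pairs = {}
    unpaired = []
    for r1 in sorted(fastq_dir.glob("*" + R1_SUFFIX)):
        sample = r1.name[: -len(R1_SUFFIX)]
        r2 = fastq_dir / (sample + R2_SUFFIX)
        if r2.is_file():
            pairs[sample] = (r1, r2)
        else:
            unpaired.append(sample)
    # Catch R2 files that have no matching R1 file.
    for r2 in sorted(fastq_dir.glob("*" + R2_SUFFIX)):
        sample = r2.name[: -len(R2_SUFFIX)]
        if sample not in pairs:
            unpaired.append(sample)
    return pairs, sorted(unpaired)


def missing_index_files(genome, vcf):
    """List index files expected next to the genome and VCF that are absent."""
    expected = [
        Path(str(genome) + ".fai"),  # written by samtools faidx
        Path(str(genome) + ".gzi"),  # written by samtools faidx for bgzip input
        Path(str(vcf) + ".tbi"),     # written by tabix (default index format)
    ]
    return [p for p in expected if not p.is_file()]
```

An empty unpaired list and an empty result from missing_index_files suggest the inputs follow the naming and indexing conventions; neither function validates file contents.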

Outputs

01_preprocessing: fastp reports and trimmed FASTQ files
02_diplod_genome: diploid genome FASTA and BWA indexes
03_mapping: BAM files
04_pairing: pairtools parsing and deduplication
05_filter: pairtools phasing
06_stats: pairtools stats
07_multiqc: MultiQC reports

Execution

Clone this repository and enter it
Create a custom config.yml file from the template config.yml.template
Activate your SnakeMake environment
Execute the pipeline as: snakemake -c 10 --configfile /path/to/config.yml -d output_dir --use-conda

The pipeline is mostly linear, but it can also run on Slurm or a similar scheduler when given a suitable --profile.
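
When the pipeline is launched from a script, the command shown above can be assembled programmatically. This is a minimal sketch: build_snakemake_command is a hypothetical helper, not part of this repository, and the profile name is whatever you have configured yourself. It uses only the flags from the command above, plus --profile and Snakemake's -n (dry run) option.

```python
import subprocess


def build_snakemake_command(configfile, output_dir, cores=10,
                            profile=None, dry_run=False):
    """Return the snakemake invocation as an argument list."""
    cmd = [
        "snakemake",
        "-c", str(cores),
        "--configfile", str(configfile),
        "-d", str(output_dir),
        "--use-conda",
    ]
    if profile is not None:
        # Assumption: the profile is user-defined (for example, for Slurm).
        cmd += ["--profile", str(profile)]
    if dry_run:
        # -n lists the jobs that would run without executing them.
        cmd.append("-n")
    return cmd


def run_pipeline(configfile, output_dir, **kwargs):
    """Run snakemake and raise CalledProcessError if it exits non-zero."""
    cmd = build_snakemake_command(configfile, output_dir, **kwargs)
    subprocess.run(cmd, check=True)
```

Calling run_pipeline("/path/to/config.yml", "output_dir", dry_run=True) first is a cheap way to confirm that the config file and inputs resolve before committing compute time.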

Workflow

(C) Juan Caballero 2024
