DAPseq_pipeline_nf

A Nextflow workflow for DAP-seq peak calling and related analysis. It modularizes the steps of a DAP-seq analysis to cut down on repetitive work.

The workflow includes the following steps:

Read trimming (trim_galore)
Mapping of cleaned reads (bowtie2)
Peak calling (MACS3)
Motif analysis (MEME Suite)
Peak annotation (HOMER)
FRIP score calculation
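
The FRiP (fraction of reads in peaks) score in the last step is the number of reads overlapping called peaks divided by the total number of mapped reads. The sketch below only illustrates that arithmetic; `frip` is a hypothetical helper, not the pipeline's own script, and the commented samtools/bedtools commands are one assumed way to obtain the two counts.

```shell
# Hypothetical helper: FRiP = reads overlapping peaks / total mapped reads.
frip() {
    # $1 = reads overlapping peaks, $2 = total mapped reads
    if [ "$2" -le 0 ]; then
        echo "total read count must be positive" >&2
        return 1
    fi
    awk -v p="$1" -v t="$2" 'BEGIN { printf "%.4f\n", p / t }'
}

# Assumed way to get the counts (needs samtools and bedtools on PATH):
#   total=$(samtools view -c IP_dedup_Q20_sorted.bam)
#   in_peaks=$(bedtools intersect -u -ubam -a IP_dedup_Q20_sorted.bam \
#                -b IP_peaks.narrowPeak | samtools view -c -)
#   frip "$in_peaks" "$total"

frip 2500 10000   # prints 0.2500
```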

Test the workflow

git clone git@github.com:mmyoung/DAPseq_pipeline_nf.git
nextflow run /path/to/DAPseq_pipeline_nf -params-file params.yml ## YAML file holding all parameters; see the file in the ./test folder for the format

Parameters

--fq_sheet    A tab-delimited file describing the samples, with five columns: sample, fq1, fq2, single_end, control.
--fasta    Genome FASTA file for the species being analyzed.
--gtf    Genome GTF file for the species being analyzed.
--output_dir    Name of the directory where results are saved. Default: ./results
--fq_dir    The folder containing the raw .fastq files.
--gsize    The size of the genome being analyzed.
--bowtie_idx    The bowtie2 index directory. ## built with the bowtie2-build command
--prime5_trim_len    Number of bases to trim from the 5' end of reads.
--prime3_trim_len    Number of bases to trim from the 3' end of reads.
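
As a rough illustration of how these parameters could be collected into a params file, the sketch below writes a `params.yml` whose keys are the parameter names listed above. Every value is a placeholder chosen for illustration, and the layout is an assumption; the file in the ./test folder remains the reference for the actual format.

```shell
# Write a hypothetical params.yml; all values below are placeholders.
cat > params.yml <<'EOF'
fq_sheet: fq_sheet.csv
fq_dir: /path/to/fastq
fasta: /path/to/genome.fa
gtf: /path/to/genome.gtf
bowtie_idx: /path/to/bowtie2_index
gsize: 120000000
prime5_trim_len: 0
prime3_trim_len: 0
output_dir: ./results
EOF

# Then launch the pipeline (requires Nextflow):
#   nextflow run /path/to/DAPseq_pipeline_nf -params-file params.yml
```

Nextflow exposes each top-level key in the file as `params.<key>`, so the same settings can also be given on the command line as `--<key> <value>`.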

Requirements

conda environment: DAPseq_env (with MACS3, HOMER, and the MEME Suite installed)
Indexed genome (bowtie2 index)

Caveats:

Check the scripts to make sure the software paths they reference exist and are executable for the current user.

Input, example

fq_sheet.csv

sample,fq1,fq2,single_end,control
IP,SRR27496336_1.fastq,SRR27496336_2.fastq,0,Input
Input,SRR27496337_1.fastq,SRR27496337_2.fastq,0,
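
A quick consistency check on the sample sheet can catch problems before a run. The sketch below is not part of the pipeline: `check_fq_sheet` is a hypothetical helper that assumes a comma-separated sheet like the example above, that single_end is encoded as 0 or 1, and that a non-empty control must name another sample in the same sheet.

```shell
# Hypothetical helper: sanity-check a comma-separated sample sheet.
check_fq_sheet() {
    awk -F',' '
        NR == 1 { next }    # skip the header row
        NF != 5 { printf "line %d: expected 5 fields, got %d\n", NR, NF; bad = 1; next }
        $4 != "0" && $4 != "1" { printf "line %d: single_end must be 0 or 1\n", NR; bad = 1 }
        { seen[$1] = 1; if ($5 != "") ctrl[NR] = $5 }
        END {
            # every control must match a sample name seen in column 1
            for (n in ctrl)
                if (!(ctrl[n] in seen)) {
                    printf "line %d: unknown control %s\n", n, ctrl[n]
                    bad = 1
                }
            exit bad
        }
    ' "$1"
}

# Recreate the example sheet and check it.
printf '%s\n' \
    'sample,fq1,fq2,single_end,control' \
    'IP,SRR27496336_1.fastq,SRR27496336_2.fastq,0,Input' \
    'Input,SRR27496337_1.fastq,SRR27496337_2.fastq,0,' > fq_sheet.csv
check_fq_sheet fq_sheet.csv && echo "fq_sheet.csv looks consistent"
```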

Results

├── alignment
│   ├── Input.bowtie2.log
│   ├── Input_map_sorted.bam
│   ├── IP.bowtie2.log
│   └── IP_map_sorted.bam
├── bw_output
│   ├── Input_sorted_bam.bw
│   └── IP_sorted_bam.bw
├── coverage_out
│   ├── Input_base.depth
│   ├── Input_depth.pdf
│   ├── IP_base.depth
│   └── IP_depth.pdf
├── deduplicateion_out
│   ├── Input_dedup_Q20_sorted.bam
│   ├── Input_dedup_Q20_sorted.bam.bai
│   ├── IP_dedup_Q20_sorted.bam
│   └── IP_dedup_Q20_sorted.bam.bai
├── macs3_output
│   ├── IP.annotatePeaks.txt
│   ├── IP_peaks.narrowPeak
│   ├── IP_peaks.xls
│   └── IP_summits.bed
├── meme_output
│   ├── Arabidopsis_thaliana.TAIR10.dna.toplevel.fa.fai
│   ├── IP_meme
│   │   ├── logo1.eps
│   │   ├── logo1.png
│   │   ├── logo_rc1.eps
│   │   ├── logo_rc1.png
│   │   ├── meme.html
│   │   ├── meme.txt
│   │   └── meme.xml
│   └── IP.peak.fasta
└── trimm
    ├── Input_val_1.fq.gz
    ├── Input_val_2.fq.gz
    ├── IP_val_1.fq.gz
    ├── IP_val_2.fq.gz
    ├── SRR27496336_1.fastq_trimming_report.txt
    ├── SRR27496336_2.fastq_trimming_report.txt
    ├── SRR27496337_1.fastq_trimming_report.txt
    └── SRR27496337_2.fastq_trimming_report.txt
