Releases: moiexpositoalonsolab/grenepipe
grenepipe v0.13.2
grenepipe v0.13.1
Minor update with some quality-of-life improvements after the last big release, v0.13.0.
Notable Changes
- Add extra snakemake logging to work around Snakemake bug (snakemake/snakemake#2974)
- Add backwards compatibility with the config.yaml of grenedalf pre-v0.13.0
grenepipe v0.13.0
This is a major update of grenepipe, as conda suddenly broke backwards compatibility, and so nothing was working any more. We used this opportunity to update almost all tools in the pipeline to their most recent versions, and also added some new features. Furthermore, we restructured the internal files for compliance with the Snakemake workflow catalog, and restructured the output files of the pipeline for more convenience and clarity.
Major Changes
- Upgrade from Snakemake v6.0.5 to v8.15.2 by default
  - python 3.7.10 → 3.12
  - pandas 1.3.1 → 2.2.2
  - numpy 1.21.2 → 2.0.0
- Upgrade tools:
  - adapterremoval 2.3.1 → 2.3.3
  - bcftools 1.16 → 1.20
  - bowtie2 2.4.1 → 2.5.4
  - bwa 0.7.17 → 0.7.18
  - cutadapt 2.10 → 4.9
  - fastqc 0.11.9 → 0.12.1
  - freebayes 1.3.1 → 1.3.7
  - gatk4 4.1.4.1 → 4.5.0.0
  - mapdamage2 2.2.1 → 2.2.2.2
  - multiqc 1.10.1 → 1.22.3
  - picard 2.27.4 → 3.2.0
  - qualimap 2.2.2d → 2.3.0
  - samtools 1.16.1 → 1.20
  - seqkit 2.2.0 → 2.8.2
  - snpeff 4.3.1t → 5.2
  - vep ensembl 104 → 112
- Restructure all output files for more user convenience. See wiki for details.
New Features
- Add variant calling from bam files workflow (instead of starting from fastq files) #47
- Add automatic download of reference genome and known references #41
- Make trimming tool optional #35
Bugfixes
grenepipe v0.12.2
grenepipe v0.12.1
Notable Changes
- Add single end mode to generate table script
- Add java options to picard tools
- Replace GATK by Picard for CreateSequenceDictionary rule
- Fix minor platform issue with empty cells in samples table
grenepipe v0.12.0
This release restructures the merging of mapped bam files prior to variant calling, see Processing of the mapped reads for details. It has breaking changes in the config file (e.g., renaming the entry for the samples table to data: samples-table, see below), so you will need to use the new config file to start an analysis.
Furthermore, some tool versions were updated (although those should be non-breaking changes), and the overall robustness of the conda environments has been greatly increased, in particular for running on MacOS. All environments now work with both conda and mamba, but we still strongly recommend using mamba; in our tests, conda needs 4h and mamba 15min to install all environments.
Notable Changes
- Change data: samples to data: samples-table in the config file, and change some tool parameter keys
- Rework mapped read merging to occur before all other bam processing
- Fix bam read group ID tag to match sample units. The ID was equal to the sample name before; now with the reworked merging of units, we also use the unit in the read group ID
- Add further config validations of the samples table
- Improved tests and CI, testing on Ubuntu and MacOS now, with conda and mamba
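
For config files written before v0.12.0, the rename of the samples table entry can be sketched as a small migration step. This is a hypothetical helper and not part of grenepipe: the function name `migrate_samples_key` and the example path are assumptions for illustration, and only the key names data: samples and data: samples-table come from the notes above. The other renamed tool parameter keys are not covered here.

```python
import copy


def migrate_samples_key(config: dict) -> dict:
    """Return a copy of `config` with data: samples renamed to data: samples-table.

    Hypothetical helper for illustration; not part of grenepipe.
    """
    migrated = copy.deepcopy(config)
    data = migrated.get("data")
    # Only rename if the old key is present and the new one is not already set.
    if isinstance(data, dict) and "samples" in data and "samples-table" not in data:
        data["samples-table"] = data.pop("samples")
    return migrated


# Example with a made-up path to a samples table.
old_config = {"data": {"samples": "path/to/samples.tsv"}}
new_config = migrate_samples_key(old_config)
print(new_config)
```

The sketch works on a plain dictionary, so it applies to a config loaded with any YAML parser. Starting from the new config file shipped with the release remains the safer route, as recommended above.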
Bug Fixes
- Fix and update several conda environments, in particular for picard and qualimap, which were not solving with conda before. This fixes #25 and #11. As a result, some tools have been upgraded:
  - bcftools 1.13 → 1.16
  - picard 2.20.1 → 2.27.4
  - samtools 1.12 → 1.16.1
  - vcflib 1.0.2 → 1.0.3
- Fix #29, correctly set bwa mem2 threads
grenepipe v0.11.1
Bug Fixes
- Fix temp directory in samtools sort for bwa mem2 and bwa aln mapping tools #28
grenepipe v0.11.0
Getting closer to v1.0.0! For now, this and the next couple of releases will still have some changes in the config.yaml that are not backwards compatible, but we hope to make it future-proof in the long run that way.
Notable Changes
- De-indent reference genome and known variants in config
- Switch to our improved version of HAF-pipe (petrov-lab/HAFpipe-line#9)
- Add HAF-pipe per-sample concatenated tables
- Add flag to turn off bcftools stats, so that the MultiQC report can be obtained without running variant calling
- Add target for per-sample merged bam output
- Add bcftools filter tool for VCF filtering
- Add dbsnp option to all GATK tools, rename GATK config params
- Add interleaved fastq test to generate table script for more robustness
Bug Fixes
- Fix samtools tmp dir creation issues throughout the pipeline
- Fix typo vqrs instead of vqsr
- Fix GATK helper function location
- Fix vep plugin download issue
- Fix mapdamage conda env
- Fix vcf index file duplication
- Fix missing R package for MapDamage
- Add flag files to avoid incomplete job executions
grenepipe v0.10.0
This is the release version accompanying the grenepipe publication:
grenepipe: A flexible, scalable, and reproducible pipeline
to automate variant calling from sequence reads.
Lucas Czech and Moises Exposito-Alonso. Bioinformatics. 2022.
doi:10.1093/bioinformatics/btac600 [pdf]
Notable Changes
- Add HAF-pipe rules for computing haplotype-based allele frequencies
- Add proper 1001g-based known-variants file as an example and for testing
- Add copy samples script
Bug Fixes
- Fix hard filter name duplication in filtered VCF
- Fix bcftools sample sorting order issue
- Ungroup GATK HaplotypeCaller merging
- Ungroup filter merge step
- Add progressbar and termcolor python dependencies to grenepipe env
grenepipe v0.9.0
Notable Changes
- Properly implement GATK Variant Quality Score Recalibration (VQSR)
- Add bcftools call for individual samples instead of combined calling
- Add read clipping with BamUtil
- Add SeqKit for reporting statistics of the reference genome
- Add bcftools stats reporting on the final vcf
- Add settings for keeping/removing intermediate files, to save disk storage
Bug Fixes
- Fix all issues related to samtools sort needing a temporary directory to properly run
- Fix using optional mapping steps in combination with DeDup
- Fix fastqc log output, which was not properly reported