Releases: HKU-BAL/ClairS

v0.4.4

18 Nov 12:49

Updated the ONT and PacBio ssrs models, which are now trained with base quality jittering and with more training samples covering a wider range of tumor/normal coverages and tumor purities. Performance improved consistently compared with v0.4.3.

v0.4.3

09 Jul 04:08

Added parsing of the model_specific_settings.conf file in a model's folder; parameters are set according to its contents. In this version, snv_min_qual= and indel_min_qual= are supported in the configuration file.
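
As a rough illustration of what such a file might look like to a reader, here is a minimal sketch that parses the two supported keys. It assumes a plain key=value layout with one setting per line and optional `#` comments; the function name `parse_model_settings` and the example values are hypothetical and are not taken from the ClairS source.

```python
from typing import Dict

# Keys named in the release note; anything else is ignored in this sketch.
SUPPORTED_KEYS = {"snv_min_qual", "indel_min_qual"}


def parse_model_settings(text: str) -> Dict[str, float]:
    """Parse key=value lines, keeping only the supported keys (assumed format)."""
    settings: Dict[str, float] = {}
    for raw_line in text.splitlines():
        # Assumption: '#' starts a comment; the real file may not allow comments.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in SUPPORTED_KEYS and value:
            settings[key] = float(value)
    return settings


# Hypothetical file contents; the values are placeholders, not ClairS defaults.
example = "snv_min_qual=8\nindel_min_qual=11\n"
print(parse_model_settings(example))
```

Unknown keys are skipped silently here so that the sketch stays compatible with settings that later versions might add.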

v0.4.2

03 Jun 03:03

Added --snv_min_qual and --indel_min_qual options to independently set the minimum QUAL threshold for SNVs and Indels to be marked as 'PASS', while deprecating the legacy --qual option.
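
The idea of separate thresholds can be sketched as follows. This is an illustration only, not the ClairS implementation: the function names, the `LowQual` label, the inclusive comparison, and the threshold values are all assumptions.

```python
def is_indel(ref: str, alt: str) -> bool:
    """Treat any length change between REF and ALT as an Indel."""
    return len(ref) != len(alt)


def filter_label(ref: str, alt: str, qual: float,
                 snv_min_qual: float, indel_min_qual: float,
                 fail_label: str = "LowQual") -> str:
    """Return 'PASS' when QUAL reaches the threshold for the variant's type.

    Assumptions: the comparison is inclusive and the failing label is
    'LowQual'; the tool itself may differ on both points.
    """
    threshold = indel_min_qual if is_indel(ref, alt) else snv_min_qual
    return "PASS" if qual >= threshold else fail_label


# Placeholder thresholds: the same QUAL passes as an SNV but fails as an Indel.
print(filter_label("A", "T", 10.0, snv_min_qual=8.0, indel_min_qual=12.0))
print(filter_label("A", "AT", 10.0, snv_min_qual=8.0, indel_min_qual=12.0))
```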

v0.4.1

01 Dec 03:53

Added ssrs models for the PacBio Revio (hifi_revio_ssrs) and Illumina (ilmn_ssrs) platforms.

v0.4.0

11 Oct 14:38

This version is a major update. The new features and benchmarks are explained in a technical note titled “Improving the performance of ClairS and ClairS-TO with new real cancer cell-line datasets and PoN”. A summary of changes:

  1. Starting from this version, ClairS provides two model types. ssrs is a model trained initially with synthetic samples and then augmented with real samples (e.g., ont_r10_dorado_sup_5khz_ssrs); ss is a model trained from synthetic samples (e.g., ont_r10_dorado_sup_5khz_ss). The ssrs model provides better performance and fits most usage scenarios. The ss model can be used when a cancer type missing from model training is a concern. In v0.4.0, four real cancer cell-line datasets (HCC1937/BL, HCC1954/BL, H1437/BL, and H2009/BL) covering two cancer types (breast cancer, lung cancer) published by Park et al. were used for ssrs model training.
  2. Added BQ jittering in model training to address the difference in BQ distribution between the training and calling datasets, which leads to a performance drop.
  3. Added the --indel_min_af option and adjusted the default minimum allelic fraction requirement to 0.1 for Indels on the ONT platform.
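
To make the minimum allelic fraction requirement concrete, here is a minimal sketch of such a check. Only the 0.1 default for ONT Indels comes from the note above; the function name, the use of read counts, and the inclusive comparison are assumptions for illustration.

```python
# Default from the release note: minimum allelic fraction for Indels on ONT.
ONT_INDEL_MIN_AF = 0.1


def passes_min_af(alt_count: int, depth: int, min_af: float) -> bool:
    """Check whether the alternative allele fraction reaches min_af.

    Assumption: AF is computed as alt-supporting reads over total depth,
    and the comparison is inclusive.
    """
    if depth <= 0:
        return False
    return alt_count / depth >= min_af


# 3 of 40 reads (AF 0.075) falls below 0.1; 5 of 40 reads (AF 0.125) does not.
print(passes_min_af(3, 40, ONT_INDEL_MIN_AF))
print(passes_min_af(5, 40, ONT_INDEL_MIN_AF))
```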

v0.3.1

16 Aug 14:43
54e0d7a

  1. Added four options: i. --use_heterozygous_snp_in_tumor_sample_and_normal_bam_for_intermediate_phasing, ii. --use_heterozygous_snp_in_normal_sample_and_normal_bam_for_intermediate_phasing, iii. --use_heterozygous_snp_in_tumor_sample_and_tumor_bam_for_intermediate_phasing, and iv. --use_heterozygous_snp_in_normal_sample_and_tumor_bam_for_intermediate_phasing. Option iii is equivalent to --use_heterozygous_snp_in_tumor_sample_for_intermediate_phasing added in v0.2.0. Option iv is equivalent to --use_heterozygous_snp_in_normal_sample_for_intermediate_phasing added in v0.2.0. Using the normal BAM for intermediate phasing was requested by @Sergey Aganezov. When the coverage of the normal and tumor samples is similar, using the normal BAM for intermediate phasing made a negligible difference compared with using the tumor BAM in our experiments with HCC1395/BL.
  2. Added --haplotagged_tumor_bam_provided_so_skip_intermediate_phasing_and_haplotagging to use the haplotype information provided in the tumor BAM directly and skip intermediate phasing and haplotagging. This option is useful when ClairS runs in a pipeline in which the tumor BAM is phased before ClairS is run. BAMs haplotagged by WhatsHap or LongPhase are accepted.
  3. Bumped up Clair3 dependency to version 1.0.10, LongPhase to version 1.7.3.
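
Before relying on a pre-haplotagged tumor BAM, it can help to check that reads actually carry haplotype tags. The sketch below works on SAM-format text lines (for example, the output of `samtools view -h`) and assumes haplotypes are stored in an integer `HP` tag, as WhatsHap's haplotag command writes them. The helper names are hypothetical, and this is not how ClairS itself validates its input.

```python
from typing import Iterable, Optional


def haplotype_of(sam_line: str) -> Optional[int]:
    """Return the value of the HP:i tag in a SAM record, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # SAM records have 11 mandatory fields; optional tags start at index 11.
    for tag in fields[11:]:
        if tag.startswith("HP:i:"):
            return int(tag[len("HP:i:"):])
    return None


def haplotagged_fraction(sam_lines: Iterable[str]) -> float:
    """Fraction of alignment records that carry an HP tag."""
    total = tagged = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        total += 1
        if haplotype_of(line) is not None:
            tagged += 1
    return tagged / total if total else 0.0


# Two made-up records: the first is haplotagged, the second is not.
records = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tHP:i:1\tPS:i:100",
    "r2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0",
]
print(haplotagged_fraction(records))
```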

v0.3.0

08 Jul 02:19

  1. Added a module called “verdict” (option --enable_verdict) to statistically classify a called variant as a germline, somatic, or subclonal somatic variant based on the CNV profile and tumor purity estimation.
  2. Improved model training speed, reducing model training time about threefold.

v0.2.0

04 May 08:16

  1. Added the --use_heterozygous_snp_in_normal_sample_for_intermediate_phasing/--use_heterozygous_snp_in_tumor_sample_for_intermediate_phasing options to support using heterozygous SNPs in either the normal sample or the tumor sample for intermediate phasing. Previous versions used in_tumor_sample for phasing. In this version, when testing with ONT 4khz HCC1395/BL and using in_normal_sample for intermediate phasing, SNV precision improved by ~2% while recall remained unchanged. in_normal_sample becomes the default from this version. However, if the coverage of the normal sample is low, please consider switching back to in_tumor_sample (#22, idea contributed by the longphase team @sloth-eat-pudding).
  2. Added --use_heterozygous_indel_for_intermediate_phasing to include high-quality heterozygous Indels for intermediate phasing. With this option, the number of haplotagged tumor reads increased by ~3% in ONT 4khz HCC1395/BL. The option becomes the default from this version.
  3. Added a model that might provide a slightly better performance for liquid tumor. In this release, only ONT Dorado 5khz HAC for liquid tumor (-p ont_r10_dorado_hac_5khz_liquid) is provided. The model was trained with slightly higher normal contamination. We are testing out the new model with collaborator.
  4. Added the --use_longphase_for_intermediate_haplotagging option to replace WhatsHap haplotagging with LongPhase haplotagging, which speeds up read haplotagging. The option becomes the default from this version.
  5. Bumped up Clair3 dependency to version 1.0.7, LongPhase to version 1.7.

v0.1.7

26 Jan 07:50
d1c5096

  1. Added ONT Dorado 5khz HAC (-p ont_r10_dorado_hac_5khz) and Dorado 4khz HAC (-p ont_r10_dorado_hac_4khz) models and renamed all ONT Dorado SUP models; check here for more details.
  2. Enabled somatic variant calling in sex chromosomes.
  3. Added FAU, FCU, FGU, FTU, RAU, RCU, RGU, and RTU tags.

v0.1.6

18 Sep 11:45

  1. Fixed an output bug that caused no VCF output if no Indel candidate was found (contributor @Khi Pin).
  2. Fixed an incorrect reference allele depth being shown at deletion regions.
  3. Added PacBio HiFi quick demo.