explain major and minor cn for pyclone input #40
Comments
Thanks to the authors for the great tool! I would also love some clarification on the correct inputs for regions with no detected somatic CNV. Should these perhaps be filtered, so that a user can manually supply CCF = VAF*normal_cn/tumor_purity?

I am not an author, but in case it's useful, here are my 2 cents on the above issue. I think major_cn = 1, minor_cn = 1 is the correct approach for autosomes when there is no detected CNV. When you give minor_cn = 0 for autosomes, you are implying that loss of heterozygosity has happened in this region. This means that with 100% tumor purity you would expect a VAF of 1 for a fully clonal mutation present on every copy. You didn't specify your tumor purity here, but I suspect it's very high. I'm guessing pyclone is struggling to reconcile VAF ~ 0.5 (for the first mutation) with the high tumor purity in this scenario, hence the high CCF uncertainty (cellular_prevalence_std).

In contrast, when you give minor_cn = 1, you get a much more confident (and, I think, correct) answer with a much lower standard deviation, because pyclone is no longer confused by the low VAFs combined with the high tumor purity. For tumor purity ~ 100% you get cellular prevalence ~ VAF*2, which is what you are seeing in all of your major_cn = 1, minor_cn = 1 results.
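The reasoning above can be sketched with a simple point-estimate model relating VAF, purity, and copy number. This is only an illustration, not PyClone's actual implementation (as far as I understand, PyClone fits a Bayesian model that weighs several possible genotypes rather than inverting a single formula). The function names, the `mutated_copies` parameter, and the purity of 1.0 are assumptions made for the example.

```python
def expected_vaf(ccf, purity, major_cn, minor_cn, mutated_copies, normal_cn=2):
    """Expected VAF of a somatic mutation under a simple tumour/normal mixture.

    mutated_copies is the number of copies carrying the mutation in each
    mutated tumour cell (hypothetical parameter name for this sketch).
    """
    total_cn = major_cn + minor_cn
    # Average number of copies of the locus per cell in the bulk sample.
    mean_copies = purity * total_cn + (1.0 - purity) * normal_cn
    # Mutated copies come only from tumour cells that carry the mutation.
    return purity * ccf * mutated_copies / mean_copies


def ccf_from_vaf(vaf, purity, major_cn, minor_cn, mutated_copies, normal_cn=2):
    """Invert expected_vaf to get a point estimate of the cancer cell fraction."""
    total_cn = major_cn + minor_cn
    mean_copies = purity * total_cn + (1.0 - purity) * normal_cn
    return vaf * mean_copies / (purity * mutated_copies)


vaf = 0.4852941176470588  # chr19:13054571, taken from the issue
purity = 1.0              # assumption: purity was not stated in the issue

# Heterozygous diploid locus: the mutation can only sit on one copy.
ccf_het = ccf_from_vaf(vaf, purity, major_cn=1, minor_cn=1, mutated_copies=1)

# Copy-neutral LOH: the mutation may sit on one or on both copies,
# and the two readings give very different CCF estimates.
ccf_loh_one = ccf_from_vaf(vaf, purity, major_cn=2, minor_cn=0, mutated_copies=1)
ccf_loh_two = ccf_from_vaf(vaf, purity, major_cn=2, minor_cn=0, mutated_copies=2)

# A fully clonal mutation on both copies under LOH gives the VAF of 1
# mentioned above.
vaf_clonal_loh = expected_vaf(1.0, purity, major_cn=2, minor_cn=0, mutated_copies=2)

print(f"major=1, minor=1:           CCF ~ {ccf_het:.3f}")
print(f"major=2, minor=0, 1 copy:   CCF ~ {ccf_loh_one:.3f}")
print(f"major=2, minor=0, 2 copies: CCF ~ {ccf_loh_two:.3f}")
```

Under these assumptions, major_cn = 1, minor_cn = 1 admits a single reading (CCF = 2*VAF), whereas major_cn = 2, minor_cn = 0 admits two readings (CCF = 2*VAF or CCF = VAF). That ambiguity may be part of why the reported cellular_prevalence_std is larger in the major_cn = 2 run.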
Hi ajw2329,

From a biological perspective, I fully agree with your point. If a somatic variant occurs in a fully diploid locus with tumor purity set to 1 (on a scale from 0 to 1), the cancer cell fraction (CCF) should indeed be twice the variant allele frequency (VAF). To address this, I've already attempted to adjust inputs in regions without copy number variation (CNV) by applying the formula VAF * normal copy number * tumor purity, assuming both VAF and tumor purity range from 0 to 1. For some samples I lack specific information on tumor purity, but since these are derived from leukemic blasts, assuming a tumor purity near 1 may be a reasonable approximation.

My focus, however, is on reconstructing clonal architectures from bulk targeted DNA sequencing, where accurate cluster assignment matters most. Given this, I'm confused about cases where, in a fully diploid locus (minor copy number = 1 and major copy number = 1), two high-VAF variants and one low-VAF variant are grouped into the same cluster. Setting minor copy number = 0 and major copy number = 2 seems to represent this scenario better, and these are the default values used in a paper I read.

Another question: what types of CNVs are suitable for reconstructing clonal architecture? My data come from a targeted panel of 45 genes, so the small CNVs detectable in it might not be ideal for inferring genomic DNA gains or losses. I am also unclear on how PyClone uses copy number information to adjust VAF and infer clone assignment.

Thanks a lot,
Best Regards,
Alessio
Hi, I'm new to pyclone and I would like to use it to infer AML clonal architecture from tumor-only targeted DNA sequencing samples.
I'm not sure how to obtain the major and minor cn values required to run pyclone. I tried the following workflow on one sample to set up the pipeline:
First, I called SNVs and used cnvkit to perform copy number analysis. There is no alteration in the CN profile in the regions of the SNVs previously identified.
So I set major_cn = 1 and minor_cn = 1, even though it is difficult to infer allele-specific CN without a matched normal sample. In fact, in some papers I saw major_cn = 2 and minor_cn = 0. What's the difference?
I ran both and got different results:
- major_cn=2; minor_cn=0
mutation_id     sample_id           cluster_id  cellular_prevalence  cellular_prevalence_std  variant_allele_frequency
chr19:13054571  pyclone.M2m0.input  1           0.7557546548022801   0.23054355560178952      0.4852941176470588
chr1:43815008   pyclone.M2m0.input  1           0.5475871189192091   0.160503139600136        0.3264183561213264
chr20:31022288  pyclone.M2m0.input  0           0.06789026894828525  0.027398939127064446     0.03896961690885073
- major_cn=1; minor_cn=1
mutation_id     sample_id      cluster_id  cellular_prevalence  cellular_prevalence_std  variant_allele_frequency
chr19:13054571  pyclone.input  0           0.9646000732189531   0.023390848092490125     0.4852941176470588
chr1:43815008   pyclone.input  0           0.6535034427688614   0.028133163070667364     0.3264183561213264
chr20:31022288  pyclone.input  0           0.0770851350192494   0.011144088857009335     0.03896961690885073
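As a quick sanity check on the two result tables, the ratio of cellular_prevalence to variant_allele_frequency can be computed per mutation. The sketch below uses only the numbers reported above; the variable names are made up for illustration, and the interpretation assumes a tumor purity close to 1, which was not stated.

```python
# (mutation_id, VAF, cellular_prevalence with major=1/minor=1,
#  cellular_prevalence with major=2/minor=0), copied from the tables above.
rows = [
    ("chr19:13054571", 0.4852941176470588, 0.9646000732189531, 0.7557546548022801),
    ("chr1:43815008", 0.3264183561213264, 0.6535034427688614, 0.5475871189192091),
    ("chr20:31022288", 0.03896961690885073, 0.0770851350192494, 0.06789026894828525),
]

# Ratio of estimated cellular prevalence to VAF for each run.
ratios_m1m1 = [cp_11 / vaf for _, vaf, cp_11, _ in rows]
ratios_m2m0 = [cp_20 / vaf for _, vaf, _, cp_20 in rows]

for (mutation_id, _, _, _), r11, r20 in zip(rows, ratios_m1m1, ratios_m2m0):
    print(f"{mutation_id}: major=1/minor=1 ratio {r11:.2f}, "
          f"major=2/minor=0 ratio {r20:.2f}")
```

With major_cn = 1, minor_cn = 1 every ratio is close to 2, which is what a heterozygous mutation in a diploid region would give at purity near 1. With major_cn = 2, minor_cn = 0 the ratios fall between 1 and 2, consistent with the estimate being pulled between the "one mutated copy" and "two mutated copies" interpretations.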
Can someone explain the difference, and how I can calculate major and minor cn with this type of data?
Thanks,
Best regards