Hi AlexandrovLab,
I was working on some samples when I got some unexpected results. One of these samples seems to have some T/A insertions at homopolymer regions. However, when I used SigProfilerMatrixGenerator to compute the mutational matrix, no mutation was assigned to the 1:Ins:T:5 class. I looked closely in IGV, and it definitely looks like these indels should be classified as 1:Ins:T:5.
I therefore went to TCGA to check whether other samples behave the same way, and the same issue appeared. I tried computing the mutational matrix for the sample TCGA-DM-A1D8. According to the IGV view of the cBioPortal data, this sample contains 2 insertions that look like they should be classified as 1:Ins:T:5.
But neither of them is assigned to that class when computing the mutational matrix. I tried different versions of SigProfilerMatrixGenerator (v1.1, v1.2 and v1.2.31) as well as the SigProfilerAssignment web tool (https://cancer.sanger.ac.uk/signatures/assignment/app/), and all give the same result.
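To make the expected classification concrete, here is a minimal Python sketch of how an ID-83-style label for a 1 bp insertion can be derived from the reference context. It is an illustration only and not SigProfilerMatrixGenerator's implementation: the function name `classify_1bp_insertion`, the 0-based `left_index` convention, and the choice to count the homopolymer run on both sides of the insertion point are assumptions made for this example. It follows the usual ID-83 convention of reporting the pyrimidine base (so an A insertion is labelled T) and capping the repeat count at 5.

```python
# Hypothetical helper, not part of SigProfilerMatrixGenerator.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def classify_1bp_insertion(ref_seq: str, left_index: int, inserted_base: str) -> str:
    """Return an ID-83-style label for a 1 bp insertion.

    ref_seq       -- reference sequence around the event (0-based string)
    left_index    -- index of the reference base immediately LEFT of the
                     insertion point (assumed convention for this sketch)
    inserted_base -- the single inserted nucleotide
    """
    ref_seq = ref_seq.upper()
    base = inserted_base.upper()

    run = 0
    # Count copies of the inserted base to the right of the insertion point.
    i = left_index + 1
    while i < len(ref_seq) and ref_seq[i] == base:
        run += 1
        i += 1
    # Count copies to the left as well, because an insertion inside a
    # homopolymer can be placed anywhere along the run.
    i = left_index
    while i >= 0 and ref_seq[i] == base:
        run += 1
        i -= 1

    # Report purine insertions on the complementary (pyrimidine) strand.
    if base in ("A", "G"):
        base = COMPLEMENT[base]

    # The last field is capped at 5, meaning "5 or more" repeats.
    return f"1:Ins:{base}:{min(run, 5)}"


if __name__ == "__main__":
    # A T inserted just before a run of six Ts.
    print(classify_1bp_insertion("GCATTTTTTGC", 2, "T"))  # 1:Ins:T:5
```

Because this sketch counts the run in both directions, the label does not depend on where inside the homopolymer the insertion is anchored. A classifier that scans in only one direction would be sensitive to a 1 bp difference in the reported position.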
I find this odd, because about 2 years ago I computed the mutational matrices for this exact TCGA sample and these 2 indels were classified as 1:Ins:T:5 mutations. Moreover, according to the literature, this should be one of the most prevalent indel types across cohorts, yet it is completely absent from some of the mutational matrices that I have recently computed.
As a side note, I manually added 1 bp to the start and end positions of these insertions, and they were then called as 1:Ins:T:5 mutations. I am worried this could affect other insertions, but please let me know whether you think this would be a viable workaround.
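For anyone who wants to reproduce that manual adjustment, the following is a minimal sketch of the shift applied programmatically. It assumes a tab-separated MAF with the standard `Start_Position`, `End_Position` and `Variant_Type` columns and insertions marked as `INS`; the function name `shift_insertions` is hypothetical. It only moves insertion rows and leaves every other variant untouched. Whether such a shift is correct for all insertions is exactly the open question above, so treat this as a diagnostic aid rather than a fix.

```python
import csv
import io


def shift_insertions(maf_text: str, offset: int = 1) -> str:
    """Return MAF text with insertion coordinates shifted by `offset` bp.

    Assumes tab-separated columns named Start_Position, End_Position and
    Variant_Type (value "INS" for insertions). Comment lines starting
    with '#' are copied to the top of the output unchanged.
    """
    lines = maf_text.splitlines()
    comments = [line for line in lines if line.startswith("#")]
    body = [line for line in lines if line and not line.startswith("#")]

    reader = csv.DictReader(body, delimiter="\t")
    out = io.StringIO()
    for line in comments:
        out.write(line + "\n")

    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        if row["Variant_Type"] == "INS":
            # Shift both coordinates so the insertion keeps its width.
            row["Start_Position"] = str(int(row["Start_Position"]) + offset)
            row["End_Position"] = str(int(row["End_Position"]) + offset)
        writer.writerow(row)
    return out.getvalue()
```

Comparing the matrices generated from the original and the shifted file should show which insertions change class and which do not.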
Please let me know if I am doing something wrong. If not, I would appreciate it if you could tell me whether there is a quick fix, or a specific version of the package without this potential issue that I could use in the meantime.
Thank you very much in advance.
For reproducibility, this is the code I used to compute the mutational matrix (tested in Google Colab and on a Linux-based HPC):
pip install SigProfilerMatrixGenerator
from SigProfilerMatrixGenerator.scripts import SigProfilerMatrixGeneratorFunc as matGen
from SigProfilerMatrixGenerator import install as genInstall
genInstall.install('GRCh37', bash=True)
matrices = matGen.SigProfilerMatrixGeneratorFunc("test", "GRCh37", "/content/test/", plot=False, exome=False, bed_file=None, chrom_based=False, tsb_stat=False, seqInfo=False, cushion=100)
Here is the input file (I added a .txt extension because otherwise I could not upload it):
Attachment: Mutational_Profile_ID.pdf
Attachment: TCGA-DM-A1D8_SigProfiler_input.maf.txt