Description
I'm trying to run somaticseq_parallel.py on per-sample VCFs to produce the AI consensus calls.
I'm using SomaticSeq v3.7.3 with XGBoost 2.0.2.
After running all the mutation callers, I ran the following command on the resulting VCF files:
somaticseq_parallel.py --classifier-snv /scratch4/nsobrei2/ggama1/training/somaticseq/ai_model_titration_ffpe_wgs_synth/SNV_model.classifier --classifier-indel /scratch4/nsobrei2/ggama1/training/somaticseq/ai_model_titration_ffpe_wgs_synth/INDEL_model.classifier --output-directory /scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/consensus_AI/kids_first/BH12847_1_TUMOR --genome-reference /scratch4/nsobrei2/references/ncbi_grch38_cipher/GRCh38_full_analysis_set_plus_decoy_hla.fa -dbsnp /scratch4/nsobrei2/references/dbsnp/138_cipher/Homo_sapiens_assembly38.dbsnp138.vcf.gz --threads 38 paired --tumor-bam-file /scratch4/nsobrei2/ggama1/germline-tumor/bams/BH12847_1_TUMOR.bam --normal-bam-file /scratch4/nsobrei2/ggama1/germline-tumor/bams/BH12847_1_GERMLINE.bam --mutect2-vcf /scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.MuTect2.vcf.gz --vardict-vcf /scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.VarDict.vcf.gz --somaticsniper-vcf /scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.SomaticSniper.vcf.gz --muse-vcf /scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.MuSE.vcf.gz --strelka-snv /scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.Strelka.snv.vcf.gz --strelka-indel /scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.Strelka.indel.vcf.gz --varscan-snv /scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.VarScan2.snv.vcf.gz --varscan-indel 
/scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.VarScan2.indel.vcf.gz --lofreq-snv /scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.LoFreq.snv.vcf.gz --lofreq-indel /scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.LoFreq.indel.vcf.gz
This is the output, including the error:
INFO 2024-01-29 21:25:59,514 SomaticSeq SomaticSeq Input Arguments: output_directory=/scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/consensus_AI/kids_first/BH12847_1_TUMOR, genome_reference=/scratch4/nsobrei2/references/ncbi_grch38_cipher/GRCh38_full_analysis_set_plus_decoy_hla.fa, truth_snv=None, truth_indel=None, classifier_snv=/scratch4/nsobrei2/ggama1/training/somaticseq/ai_model_titration_ffpe_wgs_synth/SNV_model.classifier, classifier_indel=/scratch4/nsobrei2/ggama1/training/somaticseq/ai_model_titration_ffpe_wgs_synth/INDEL_model.classifier, pass_threshold=0.5, lowqual_threshold=0.1, algorithm=xgboost, homozygous_threshold=0.85, heterozygous_threshold=0.01, minimum_mapping_quality=1, minimum_base_quality=5, minimum_num_callers=0.5, dbsnp_vcf=/scratch4/nsobrei2/references/dbsnp/138_cipher/Homo_sapiens_assembly38.dbsnp138.vcf.gz, cosmic_vcf=None, inclusion_region=None, exclusion_region=None, threads=38, somaticseq_train=False, seed=0, tree_depth=12, iterations=None, features_excluded=[], extra_hyperparameters=None, keep_intermediates=False, tumor_bam_file=/scratch4/nsobrei2/ggama1/germline-tumor/bams/BH12847_1_TUMOR.bam, normal_bam_file=/scratch4/nsobrei2/ggama1/germline-tumor/bams/BH12847_1_GERMLINE.bam, tumor_sample=TUMOR, normal_sample=NORMAL, mutect_vcf=None, indelocator_vcf=None, mutect2_vcf=/scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.MuTect2.vcf.gz, varscan_snv=/scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.VarScan2.snv.vcf.gz, varscan_indel=/scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.VarScan2.indel.vcf.gz, jsm_vcf=None, somaticsniper_vcf=/scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.SomaticSniper.vcf.gz, 
vardict_vcf=/scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.VarDict.vcf.gz, muse_vcf=/scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.MuSE.vcf.gz, lofreq_snv=/scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.LoFreq.snv.vcf.gz, lofreq_indel=/scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.LoFreq.indel.vcf.gz, scalpel_vcf=None, strelka_snv=/scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.Strelka.snv.vcf.gz, strelka_indel=/scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/vcf_per_sample/extracted_vcf/kids_first/unsorted/BH12847_1_TUMOR.Strelka.indel.vcf.gz, tnscope_vcf=None, platypus_vcf=None, arbitrary_snvs=[], arbitrary_indels=[], which=paired
***** WARNING: File /scratch4/nsobrei2/ggama1/germline-tumor/cavatica/somaticseq/consensus_AI/kids_first/BH12847_1_TUMOR/38.th.input.bed has inconsistent naming convention for record:
HLA-A*01:01:01:01 0 3503
[The same warning and record were repeated 19 more times.]
2024-01-29 21:29:24,802 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:29:24,802 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:29:43,208 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:29:43,208 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:29:55,957 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:29:55,957 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:29:57,641 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:29:57,641 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:00,880 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:00,880 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:03,324 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:03,324 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:05,665 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:05,665 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:05,670 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:05,670 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:06,451 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:06,451 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:07,968 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:07,968 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:08,179 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:08,179 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:08,784 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:08,784 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:17,032 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:17,032 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:17,879 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:17,879 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:17,993 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:17,993 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:18,751 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:18,751 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:23,687 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:23,687 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:24,247 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:24,247 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:24,306 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:24,306 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:25,604 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:25,604 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:26,489 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:26,489 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:26,632 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:26,632 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:27,644 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:27,644 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:27,884 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:27,884 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:28,425 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:28,425 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:28,616 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:28,616 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:29,069 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:29,069 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:29,767 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:29,767 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:30,179 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:30,179 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:30,292 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:30,292 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:30,705 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:30,705 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:30,930 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:30,930 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:31,435 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:31,435 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:31,742 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:31,742 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:31,956 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:31,956 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:32,202 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:32,202 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:34,058 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:34,058 somatic_vcf2tsv.py NO RE-SCALING
2024-01-29 21:30:36,062 - somatic_vcf2tsv.py - INFO - NO RE-SCALING
INFO 2024-01-29 21:30:36,062 somatic_vcf2tsv.py NO RE-SCALING
INFO 2024-01-29 22:26:09,775 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 22:26:09,775 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 22:43:11,993 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 22:43:11,993 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 22:44:54,332 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 22:44:54,332 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 22:53:46,696 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 22:53:46,696 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 22:57:05,534 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 22:57:05,534 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 22:57:38,264 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 22:57:38,264 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 22:58:31,952 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 22:58:31,952 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:00:51,089 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:00:51,089 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:03:32,194 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:03:32,194 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:04:09,206 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:04:09,206 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:07:29,075 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:07:29,075 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:08:31,220 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:08:31,220 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:08:58,106 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:08:58,106 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:09:43,311 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:09:43,311 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:10:37,123 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:10:37,123 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:11:15,132 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:11:15,132 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:14:03,066 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:14:03,066 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:15:32,899 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:15:32,900 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:18:17,799 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:18:17,799 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:18:40,118 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:18:40,118 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:19:20,634 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:19:20,634 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:21:47,766 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:21:47,766 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:30:41,076 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:30:41,076 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:31:09,867 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:31:09,868 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:31:36,892 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:31:36,892 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:32:03,360 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:32:03,361 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:33:42,153 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:33:42,153 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:34:06,125 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:34:06,126 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:45:14,909 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:45:14,909 xgboost_predictor Number of trees to use = 100
INFO 2024-01-29 23:54:18,994 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-29 23:54:18,994 xgboost_predictor Number of trees to use = 100
INFO 2024-01-30 00:02:30,329 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-30 00:02:30,329 xgboost_predictor Number of trees to use = 100
INFO 2024-01-30 00:02:43,281 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-30 00:02:43,281 xgboost_predictor Number of trees to use = 100
INFO 2024-01-30 00:04:03,375 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-30 00:04:03,375 xgboost_predictor Number of trees to use = 100
INFO 2024-01-30 00:07:53,272 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-30 00:07:53,272 xgboost_predictor Number of trees to use = 100
INFO 2024-01-30 00:20:54,193 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-30 00:20:54,193 xgboost_predictor Number of trees to use = 100
INFO 2024-01-30 00:26:38,802 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-30 00:26:38,802 xgboost_predictor Number of trees to use = 100
INFO 2024-01-30 00:29:20,286 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-30 00:29:20,286 xgboost_predictor Number of trees to use = 100
INFO 2024-01-30 00:38:09,574 xgboost_predictor Columns removed for prediction: CHROM,POS,ID,REF,ALT,Strelka_QSS,Strelka_TQSS,if_COSMIC,COSMIC_CNT,TrueVariant_or_False
INFO 2024-01-30 00:38:09,575 xgboost_predictor Number of trees to use = 100
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/ggama1/.conda/envs/somaticseq/lib/python3.11/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/home/ggama1/.conda/envs/somaticseq/lib/python3.11/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
           ^^^^^^^^^^^^^^^^
  File "/home/ggama1/programs/somaticseq/somaticseq/somaticseq_parallel.py", line 84, in runPaired_by_region
    run_somaticseq.runPaired(
  File "/home/ggama1/programs/somaticseq/somaticseq/run_somaticseq.py", line 169, in runPaired
    modelPredictor(ensembleSnv, classifiedSnvTsv, algo, classifier_snv, iterations=iterations, features_to_exclude=features_excluded)
  File "/home/ggama1/programs/somaticseq/somaticseq/run_somaticseq.py", line 87, in modelPredictor
    somatic_xgboost.predictor(classifier, input_file, output_file, non_features, iterations)
  File "/home/ggama1/programs/somaticseq/somaticseq/somatic_xgboost.py", line 173, in predictor
    scores = xgb_model.predict(dtest, ntree_limit=iterations)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Booster.predict() got an unexpected keyword argument 'ntree_limit'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/ggama1/.conda/envs/somaticseq/bin/somaticseq_parallel.py", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/home/ggama1/programs/somaticseq/somaticseq/somaticseq_parallel.py", line 308, in <module>
    subdirs = pool.map(runPaired_by_region_i, bed_splitted)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ggama1/.conda/envs/somaticseq/lib/python3.11/multiprocessing/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ggama1/.conda/envs/somaticseq/lib/python3.11/multiprocessing/pool.py", line 774, in get
    raise self._value
TypeError: Booster.predict() got an unexpected keyword argument 'ntree_limit'
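A note on the likely cause, with a possible workaround sketch: as far as I can tell, XGBoost 2.x no longer accepts the `ntree_limit` keyword in `Booster.predict()` and expects `iteration_range=(0, n)` instead, which would explain the TypeError with XGBoost 2.0.2. The helper below is only an illustration of how the call in `somatic_xgboost.py` could be made version-tolerant. The name `predict_kwargs` and the two stand-in functions are hypothetical and are not part of SomaticSeq or XGBoost.

```python
import inspect


def predict_kwargs(predict_fn, iterations):
    """Build keyword arguments that restrict prediction to the first `iterations` trees.

    Hypothetical helper: it inspects the signature of the given predict function
    and picks whichever tree-limiting keyword that function accepts.
    """
    if iterations is None:
        # No limit requested: let the model use all of its trees.
        return {}
    params = inspect.signature(predict_fn).parameters
    if "ntree_limit" in params:
        # Older XGBoost releases still accept ntree_limit.
        return {"ntree_limit": iterations}
    # Newer releases: half-open range of boosting rounds [0, iterations).
    return {"iteration_range": (0, iterations)}


# Stand-ins that mimic only the relevant part of the old and new signatures.
def old_style_predict(data, ntree_limit=0):
    return ntree_limit


def new_style_predict(data, iteration_range=(0, 0)):
    return iteration_range


print(predict_kwargs(old_style_predict, 100))
print(predict_kwargs(new_style_predict, 100))
```

With something like this in place, the failing line could presumably become `scores = xgb_model.predict(dtest, **predict_kwargs(xgb_model.predict, iterations))`. Installing an XGBoost release older than 2.0 in the conda environment might also avoid the error, though I have not verified that against this model file.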
The training run that created the AI model used in the command above produced this output:
INFO 2024-01-27 08:52:51,190 xgboost_builder Columns removed before training: CHROM, POS, ID, REF, ALT, Strelka_QSS, Strelka_TQSS, if_COSMIC, COSMIC_CNT, TrueVariant_or_False
INFO 2024-01-27 08:52:51,190 xgboost_builder Number of boosting rounds = 1000
INFO 2024-01-27 08:52:51,191 xgboost_builder Hyperparameters: max_depth=8, nthread=48, objective=binary:logistic, seed=0, tree_method=hist, grow_policy=lossguide
/home/ggama1/.conda/envs/somaticseq/lib/python3.11/site-packages/xgboost/core.py:160: UserWarning: [09:07:04] WARNING: /workspace/src/c_api/c_api.cc:1240: Saving into deprecated binary model format, please consider using `json` or `ubj`. Model format will default to JSON in XGBoost 2.2 if not specified.
  warnings.warn(smsg, UserWarning)