Inconsistent HGNC:ID results between single-thread and multi-thread in vep #1759
Comments
Hi @karlestira, Kind regards, |
The VCF is from VarDict, with some pre-processing applied. cmd: I used the VEP cache downloaded from the FTP site (sorry, I forgot the URL), named homo_sapiens_merged_vep_112_GRCh37.tar.gz. then: On my system, the diff reports line 62 (the command line, which is expected to differ) and lines 4892, 4893, 4987, and 4988 (these four lines differ in HGNC ID where the transcript is from RefSeq). |
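For anyone who wants to repeat the line-level comparison described above, here is a minimal Python sketch. It assumes both VEP runs wrote VCF output with the variants in the same order, and it skips `##` meta lines so that the recorded command line does not show up as a difference. The function name `differing_lines` is made up for illustration and is not part of VEP.

```python
def differing_lines(single_text, forked_text):
    """Return 1-based line numbers at which two VEP VCF outputs differ.

    Assumption: both outputs list the same variants in the same order.
    Lines starting with '##' in both files are skipped, because the
    header records the command line and is expected to differ.
    """
    lines_a = single_text.splitlines()
    lines_b = forked_text.splitlines()
    diffs = []
    for number, (a, b) in enumerate(zip(lines_a, lines_b), start=1):
        if a.startswith("##") and b.startswith("##"):
            continue
        if a != b:
            diffs.append(number)
    # Lines present in only one of the two outputs also count as differences.
    shorter, longer = sorted((len(lines_a), len(lines_b)))
    diffs.extend(range(shorter + 1, longer + 1))
    return diffs


if __name__ == "__main__":
    import sys

    # Usage: python compare_lines.py single_thread.vcf forked.vcf
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
        print(differing_lines(fa.read(), fb.read()))
```

If the two outputs do not contain the same number of lines, a positional comparison like this stops being meaningful after the first extra line, and a per-variant comparison is the safer choice.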
I have encountered the same issue, using VEP version 111.0. |
Hi @likhitha-surapaneni, thanks for your help :) I am seeing this issue as well w/ VEP version 111.0. Were you able to replicate with the data from @karlestira, or would it be helpful to provide another example? Even before a patch is applied, it would be awesome if you could comment on what might be causing this issue so that we know if it's specific to the HGNC annotations (which I'm not super concerned about) or a more general issue with forking that may result in other more serious discrepancies? |
@likhitha-surapaneni, thanks for looking into this! After a more thorough comparison between VEP runs with and without forking, I am starting to notice more serious issues than just dropped HGNC identifiers (which affect roughly 1 out of 10,000 SNPs). In particular, about 1 out of 4 structural variants (SVs) is annotated differently between single-thread and multi-thread VEP. The vast majority of these differences (>90%) are instances where multi-thread VEP drops one of several entire CSQ annotations for a variant. Here's an example:
multi-thread VEP: |
Hi @TimD1 , @christopher-hardy, @karlestira Kind regards, |
Describe the issue
VEP gives different results when run with multiple threads (--fork).
problem:
Some genes (such as ENSG00000169047 or ENSG00000168769) lose the HGNC ID on their RefSeq entries (the field next to EntrezGene) when --fork is used, although the ID is present in the single-thread result.
Additional information
The inconsistency depends on the thread setting: the same thread count gives the same results across runs, but different thread counts lead to different results.
I believe this is a multi-threading consistency bug, and I think it occurs widely: any WES VCF annotated with the VEP merged cache can reproduce the problem, and no specific input is needed.
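To check this on another dataset, the following is a minimal Python sketch that compares the annotations per variant rather than per line. It assumes VCF output in which the `##INFO=<ID=CSQ...` header line lists the field names after `Format: `, with CSQ entries separated by commas and fields by `|`. All function names here are made up for illustration; none of them belong to VEP.

```python
def csq_fields(vcf_text):
    """Read the CSQ field names from the header line ('Format: A|B|C')."""
    for line in vcf_text.splitlines():
        if line.startswith("##INFO=<ID=CSQ"):
            fmt = line.split("Format: ", 1)[1]
            return fmt.strip().rstrip('">').split("|")
    raise ValueError("no CSQ header line found")


def csq_by_variant(vcf_text):
    """Map (CHROM, POS, REF, ALT) to the set of CSQ entries of that variant."""
    result = {}
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        key = (cols[0], cols[1], cols[3], cols[4])
        entries = result.setdefault(key, set())
        for item in cols[7].split(";"):  # INFO column
            if item.startswith("CSQ="):
                for raw in item[4:].split(","):
                    entries.add(tuple(raw.split("|")))
    return result


def only_in_single(single_text, forked_text):
    """List CSQ entries found in the single-thread run but not in the forked run.

    An entry is reported both when it was dropped entirely and when any
    of its fields (for example the HGNC ID) changed.
    """
    fields = csq_fields(single_text)
    single = csq_by_variant(single_text)
    forked = csq_by_variant(forked_text)
    report = []
    for key, entries in single.items():
        for entry in sorted(entries - forked.get(key, set())):
            report.append((key, dict(zip(fields, entry))))
    return report


if __name__ == "__main__":
    import sys

    # Usage: python compare_csq.py single_thread.vcf forked.vcf
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
        for key, entry in only_in_single(fa.read(), fb.read()):
            print(key, entry)
```

Comparing sets of entries per variant means the order of CSQ entries within a record does not matter, so only real content differences are reported. Swapping the two arguments shows entries that appear only in the forked run.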
System
Full VEP command line
Full error message
No error message.