Problem with VEP query stuck without results (without offline mode) #1806

Open
Taalouane opened this issue Nov 28, 2024 · 6 comments

@Taalouane

Hello,

I am contacting you regarding an issue encountered while using VEP to query the Ensembl database.

The query remains stuck, with no results returned and no error message. The process seems to run indefinitely without reaching a final result.

The same query previously worked fine for me, several times. Additionally, the MySQL connection to the server is functioning normally, as shown by the following command:

mysql --host=ensembldb.ensembl.org --user=anonymous
The connection is established without any issue and gives me access to the database, confirming that the connection to the server is working. However, the query via VEP remains stuck.

I would appreciate your help in identifying whether this problem could be related to a recent configuration change in VEP, or whether other factors could explain this behavior after the system restart.

Thank you in advance for your help.

                 singularity exec \
                 $path_SIF/VEPv109.3.sif /opt/vep/src/ensembl-vep/vep \
                 --input_file  ${sample_name}.txt \
                 --output_file  ${sample_name}.vcf \
                 --format hgvs \
                 --cache \
                 --dir_cache $path_cache_dir \
                 --vcf --fork 8 \
                 --buffer_size 1000 \
                 --assembly GRCh38 \
                 --total_length \
                 --no_stats \
                 --sift b \
                 --polyphen b \
                 --hgvs \
                 --hgvsg \
                 --symbol \
                 --no_escape \
                 --numbers \
                 --domains \
                 --regulatory \
                 --protein \
                 --biotype \
                 --variant_class \
                 --check_existing \
                 --pubmed \
                 --canonical \
                 --af \
                 --af_1kg \
                 --dir_plugins $path_plugins_dir \
                 --pick \
                 --quiet  \
                 --force_overwrite \
                 --plugin dbNSFP,$path_resources/dbNSFP_V2a/dbNSFP4.2a.txt.gz,GTEx_V8_tissue,GTEx_V8_gene,MetaLR_score,MetaLR_pred,MetaRNN_score,MetaRNN_pred,MutationTaster_score,MutationTaster_pred,FATHMM_score,FATHMM_pred,PROVEAN_score,PROVEAN_pred,MetaSVM_score,MetaSVM_pred,PrimateAI_score,PrimateAI_pred,ClinPred_score,ClinPred_pred \
                 --plugin ExACpLI \
                 --plugin REVEL,$path_resources/REVEL_v1.3/new_tabbed_revel_grch38.tsv.gz \
                 --plugin ExAC,$path_resources/ExAC_V0.3/ExAC.0.3.GRCh38.vcf.gz \
                 --plugin CADD,$path_resources/CADD_v1.6/whole_genome_SNVs.tsv.gz,${path_resources}/CADD_v1.6/gnomad.genomes.r3.0.indel.tsv.gz \
                 --plugin SpliceAI,snv=$path_resources/SpliceAI_scores_v1.3/spliceai_scores.raw.snv.hg38.vcf.gz,indel=$path_resources/SpliceAI_scores_v1.3/spliceai_scores.raw.indel.hg38.vcf.gz \
                 --custom $path_resources/gnomADg_V3.1.1/gnomad.genomes.v3.1.2.sites.vcf.gz,gnomAD_genomes,vcf,exact,0,AF,popmax,AF_popmax,AF_XX,AF_XY,AF_oth,AF_ami,AF_sas,AF_fin,AF_eas,AF_afr,AF_asj,AF_amr,AF_mid,AF_nfe,nhomalt \
                 --custom $path_resources/gnomADe_V2.1.1/gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf.bgz,gnomAD_exomes,vcf,exact,0,AF,popmax,AF_popmax,AF_female,AF_male,AF_afr,AF_amr,AF_asj,AF_eas,AF_fin,AF_nfe,AF_oth,AF_sas,nhomalt \
                 --custom $path_resources/clinvar_v20230702/clinvar_20230702.vcf.gz,CLINVAR,vcf,exact,0,ALLELEID,CLNSIG,CLNDN,CLNREVSTAT,CLNDISDB \
                 --custom $path_resources/PhyloP/hg38.phyloP100way.bw,phylop100verts,bigwig,exact,0 \
                 --custom $path_resources/PhastCons/hg38.phastCons100way.bw,phastcons100verts,bigwig,exact,0 \
                 --custom $path_resources/PhyloP/hg38.phyloP30way.bw,phylop30mams,bigwig,exact,0 \
                 --custom $path_resources/PhastCons/hg38.phastCons30way.bw,phastcons30mams,bigwig,exact,0 \
                 --custom $path_resources/PhyloP/hg38.phyloP17way.bw,phyloP17primates,bigwig,exact,0 \
                 --custom $path_resources/PhastCons/hg38.phastCons17way.bw,phastcons17primates,bigwig,exact,0 \
                 2> VEP.log
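
Since the run produces neither results nor an error, one way to make a stalled job visible is to cap its runtime. The following is a minimal bash sketch, assuming GNU coreutils `timeout` is available on the host; `run_with_limit` is a hypothetical helper name, not part of VEP or Singularity.

```shell
# Hypothetical helper: run a command under a time limit and flag a likely hang.
# Assumes GNU coreutils `timeout`, which exits with status 124 on timeout.
run_with_limit() {
    local limit="$1"
    shift
    local status=0
    # Run the remaining arguments as the command, capturing its exit status.
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "command exceeded ${limit}; it may be stuck" >&2
    fi
    return "$status"
}
```

For example, prefixing the command above with `run_with_limit 2h` would stop the job after two hours and return exit status 124, which a pipeline can check instead of waiting indefinitely. The two-hour limit is an arbitrary placeholder.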
@dglemos dglemos self-assigned this Nov 28, 2024
@dglemos
Contributor

dglemos commented Nov 28, 2024

Hi @Taalouane,
The --cache option still connects to the database, which slows down VEP.
Can you please use --offline instead?

Also, to run in offline mode with HGVS options, you should provide a FASTA file with --fasta (documentation).
We also recommend that you download the indexed cache files. You can find them here for the current version, 113: https://ftp.ensembl.org/pub/current_variation/indexed_vep_cache/

@dglemos
Contributor

dglemos commented Nov 28, 2024

Another possible reason for the query being slower than usual is the ongoing issues with the MySQL connection, which could impact job performance.
If your input file is supported offline, I would recommend you always use --offline instead of --cache.

@Taalouane
Author

It is not possible to use offline mode for input in HGVS format; it's not a VCF, it's a list of HGVS variants. I get this error:
MSG: ERROR: Cannot use HGVS format in offline mode
I added the FASTA file, but the problem persists.

The cache file is indexed, and I have it for the same version of VEP: v109. Do you think I should switch to the new version of VEP and indexed cache?

@Taalouane
Author

Additional information: My command was working without any issues before. Is there any problem or maintenance on your server side?

@dglemos
Contributor

dglemos commented Nov 28, 2024

> It is not possible to use offline mode for input with HGVS format, it's not a VCF, it's a list of HGVS variants.

For HGVS input, you have to keep running with --cache, which establishes a connection to the database.
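
The rule discussed in this thread can be sketched as a small shell helper that picks the mode flag from the input format. This is only an illustration: `vep_mode_flag` is a hypothetical function name, and it encodes just the case stated here (HGVS input needs `--cache`). Other formats are assumed to be supported offline, which should be checked against the VEP documentation for the format in question.

```shell
# Hypothetical helper: choose the VEP mode flag from the --format value.
# Only the HGVS case comes from this thread; the fallback is an assumption.
vep_mode_flag() {
    case "$1" in
        hgvs) echo "--cache" ;;    # HGVS input requires a database connection
        *)    echo "--offline" ;;  # assumed to work offline; verify per format
    esac
}
```

A script could then pass `$(vep_mode_flag hgvs)` on the VEP command line in place of a hard-coded `--cache` or `--offline`. When the result is `--offline` and HGVS output options are in use, a FASTA file would still need to be supplied with `--fasta`.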

> Do you think I should switch to the new version of VEP and indexed cache?

We recommend that everyone use the latest VEP code and indexed cache.

> Is there any problem or maintenance on your server side?

There are a few ongoing issues with the MySQL connection, which could impact job performance. Please let us know if they persist over the next few days.
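
While the connection issues last, a job that fails outright can be retried automatically. The following is a minimal bash sketch; `retry` is a hypothetical helper, and the attempt count and delay are placeholders. It only helps when the command exits with an error, so a job that hangs without exiting would also need a time limit.

```shell
# Hypothetical helper: retry a command, doubling the delay after each failure.
# Usage: retry <max_attempts> <initial_delay_seconds> <command> [args...]
retry() {
    local attempts="$1" delay="$2"
    shift 2
    local i=1
    while true; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        if [ "$i" -ge "$attempts" ]; then
            echo "giving up after ${attempts} attempts" >&2
            return 1
        fi
        echo "attempt ${i} failed; retrying in ${delay}s" >&2
        sleep "$delay"
        delay=$((delay * 2))
        i=$((i + 1))
    done
}
```

For instance, `retry 3 60 <command>` would try the command up to three times, waiting 60 seconds and then 120 seconds between attempts.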

@Taalouane
Author

Thank you @dglemos for your response. I will get back to you.

Could you please forward the information regarding the slow connection to your team responsible for the MySQL server?

Thank you again.
