Thanks for your great work.
I have been testing this tool for the last couple of days and was wondering whether there is an optimal set of sites to select/subset for the input VCF.
A large VCF requires a lot of memory, so a minimal VCF input containing only the most informative markers would be great.
I would appreciate your advice on this. I believe another user asked a similar question in the issues.
Thanks again!
Hi @aymanm! That is an interesting idea, and one I don't believe we've explored thoroughly (@audrey-bollas correct me if I'm wrong). I think you could technically pull this off, but you'd need to compute feature importance on the training data, take the top-n features (e.g. 100 variants) for each population, and filter your VCFs down to those sites. The model would likely still be performant, especially with WGS data. With WES you might lose some accuracy, as you'd be at the mercy of the WES kit having probes that cover those variants.
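One way the "rank variants, keep the top n per population" step above could be sketched, as a rough illustration. This is not the tool's actual pipeline: the per-population allele-frequency spread used here is a simplified stand-in for a trained model's real feature importances, and all names (`top_informative_sites`, etc.) are hypothetical:

```python
import numpy as np

def top_informative_sites(genotypes, labels, n_top=100):
    """Rank variants by a simple importance proxy and return the
    indices of the n_top most informative ones.

    genotypes : (n_samples, n_variants) array of allele counts (0/1/2)
    labels    : (n_samples,) array of population labels

    Proxy used here: variance of the per-population allele frequency
    across populations (sites that differ most between populations
    score highest). A real workflow would substitute feature
    importances from the trained classifier.
    """
    pops = np.unique(labels)
    # Per-population allele frequency at each variant site.
    freqs = np.stack([genotypes[labels == p].mean(axis=0) / 2.0 for p in pops])
    importance = freqs.var(axis=0)
    # Indices of the top-n sites, most informative first.
    return np.argsort(importance)[::-1][:n_top]

# Tiny synthetic example: 6 samples x 5 variants; variant 2 perfectly
# separates populations A and B, so it should rank first.
genotypes = np.zeros((6, 5), dtype=int)
genotypes[:3, 2] = 2
labels = np.array(["A", "A", "A", "B", "B", "B"])
print(top_informative_sites(genotypes, labels, n_top=3))
```

The resulting site indices could then be mapped back to chromosome/position and written to a tab-separated regions file, so the input could be reduced up front with something like `bcftools view -R sites.txt input.vcf.gz`, keeping memory use proportional to the subset rather than the full VCF.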
Hello @andreirajkovic! I had a similar question to @aymanm. I have a very large VCF (914 GB) and was wondering if there is a suggested course of action for this. I have a system with 126 GB of memory and was still not able to get it to run.