Feature importance? #13

Open
timchu90 opened this issue Mar 19, 2024 · 2 comments

Comments

@timchu90

Hi, the SNVstory paper has this line:
“SNVstory includes a feature-importance scheme, unique among open-source ancestral tools, which allows the user to track the ancestral signal broadcast by a given gene or locus.”
I can't find how to do this with the snvstory package. Is there a separate package for this gene/locus analysis?
Thanks!

@audrey-bollas
Collaborator

Hi! Feature importance is an analysis you can run yourself with your input data and the gnomAD model provided in the resource directory. Since these models are built with XGBoost, you can use SHAP's TreeExplainer to do this. In our paper we highlighted a gene-based and a cytolocation-based feature importance analysis. This is not part of the Docker package, but I will add the scripts and package environment so you can run it. I'll get this up in a day or two. Thanks!

@audrey-bollas
Collaborator

Okay, I've added scripts to do the analysis we performed in the paper. The program should export the gene/locus features as a NumPy array for you to work with. It can optionally generate the plots from the paper as well. Let me know if you get it working. :)
