
Memory issue with current implementation #7

Open
malachig opened this issue Aug 10, 2020 · 1 comment · May be fixed by wustl-oncology/analysis-wdls#159
@malachig
I was trying to use this to count a large number of sites (~6 million potential germline SNP positions). Even with 10GB of memory requested, my jobs were being killed for exceeding their memory allocation.

I believe the issue is that it loads the whole ROI into memory before writing out the files needed by bam-readcount?

@chrisamiller
Collaborator

Yeah, that sounds about right. As a temporary workaround, I imagine you could split by chromosome.
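The split-by-chromosome workaround could be sketched roughly like this: stream the ROI file line by line and write one file per chromosome, so the whole region list is never held in memory. The file naming and the assumption of a tab-delimited, BED-like input are illustrative, not how this repo necessarily formats its ROI.

```python
def split_by_chromosome(roi_path):
    """Split a tab-delimited, BED-like ROI file into one file per chromosome.

    Streams the input line by line, so memory use stays constant regardless
    of how many sites the ROI contains. Output names are hypothetical.
    """
    handles = {}
    try:
        with open(roi_path) as roi:
            for line in roi:
                chrom = line.split("\t", 1)[0]
                if chrom not in handles:
                    handles[chrom] = open(f"{chrom}.roi.bed", "w")
                handles[chrom].write(line)
    finally:
        for fh in handles.values():
            fh.close()
```

Each per-chromosome file could then be fed to a separate job, which also parallelizes the work as a side effect.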

One way to fix this in the code would be to set a maximum number of lines to process at a time: read that many lines, run bam-readcount on them, and loop until the whole file is done.
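That loop could look something like the sketch below, which caps memory at `chunk_size` lines per iteration. The `bam-readcount -f <ref> -l <site_list>` invocation matches the tool's real flags, but the wrapper's actual command line, output file, and argument names here are assumptions.

```python
import itertools
import subprocess
import tempfile

def chunked(lines, chunk_size):
    """Yield successive lists of at most chunk_size lines from an iterable."""
    it = iter(lines)
    while True:
        chunk = list(itertools.islice(it, chunk_size))
        if not chunk:
            return
        yield chunk

def readcount_in_chunks(sites_path, bam_path, ref_fasta, chunk_size=10000):
    """Run bam-readcount on the site list in bounded-size chunks.

    Only chunk_size lines are in memory at once; each chunk is written to a
    temporary site-list file and the results are appended to one output.
    Output path and exact invocation are illustrative.
    """
    with open(sites_path) as sites:
        for chunk in chunked(sites, chunk_size):
            with tempfile.NamedTemporaryFile(
                "w", suffix=".sites", delete=False
            ) as tmp:
                tmp.writelines(chunk)
                tmp_name = tmp.name
            with open("readcounts.tsv", "a") as out:
                subprocess.run(
                    ["bam-readcount", "-f", ref_fasta, "-l", tmp_name, bam_path],
                    stdout=out,
                    check=True,
                )
```

Since the output is appended per chunk, the combined file ends up equivalent to a single full-file run, just produced incrementally.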
