
Memory issue with current implementation #7

Open
malachig opened this issue Aug 10, 2020 · 1 comment · May be fixed by wustl-oncology/analysis-wdls#159
@malachig
I was trying to use this to count a large number of sites (~6 million potential germline SNP positions). Even with 10GB of memory requested, my jobs were being killed for exceeding their memory allocation.

I believe the issue is that it loads the whole ROI into memory before writing out the files needed by bam-readcount?

@chrisamiller
Collaborator

Yeah, that sounds about right. As a temporary workaround, I imagine you could split by chromosome.
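The split-by-chromosome workaround could be sketched roughly like this: stream the ROI file line by line and write one file per chromosome, so the whole region list is never held in memory. The file naming and the assumption of a tab-delimited, BED-like input are illustrative, not how this repo necessarily formats its ROI.

```python
def split_by_chromosome(roi_path):
    """Split a tab-delimited, BED-like ROI file into one file per chromosome.

    Streams the input line by line, so memory use stays constant regardless
    of how many sites the ROI contains. Output names are hypothetical.
    """
    handles = {}
    try:
        with open(roi_path) as roi:
            for line in roi:
                chrom = line.split("\t", 1)[0]
                if chrom not in handles:
                    handles[chrom] = open(f"{chrom}.roi.bed", "w")
                handles[chrom].write(line)
    finally:
        for fh in handles.values():
            fh.close()
```

Each per-chromosome file could then be fed to a separate job, which also parallelizes the work as a side effect.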

One way to fix this in the code would be to set a maximum number of lines to process at a time: read that many lines, run bam-readcount on them, and loop until the whole file is done.
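That loop could look something like the sketch below, which caps memory at `chunk_size` lines per iteration. The `bam-readcount -f <ref> -l <site_list>` invocation matches the tool's real flags, but the wrapper's actual command line, output file, and argument names here are assumptions.

```python
import itertools
import subprocess
import tempfile

def chunked(lines, chunk_size):
    """Yield successive lists of at most chunk_size lines from an iterable."""
    it = iter(lines)
    while True:
        chunk = list(itertools.islice(it, chunk_size))
        if not chunk:
            return
        yield chunk

def readcount_in_chunks(sites_path, bam_path, ref_fasta, chunk_size=10000):
    """Run bam-readcount on the site list in bounded-size chunks.

    Only chunk_size lines are in memory at once; each chunk is written to a
    temporary site-list file and the results are appended to one output.
    Output path and exact invocation are illustrative.
    """
    with open(sites_path) as sites:
        for chunk in chunked(sites, chunk_size):
            with tempfile.NamedTemporaryFile(
                "w", suffix=".sites", delete=False
            ) as tmp:
                tmp.writelines(chunk)
                tmp_name = tmp.name
            with open("readcounts.tsv", "a") as out:
                subprocess.run(
                    ["bam-readcount", "-f", ref_fasta, "-l", tmp_name, bam_path],
                    stdout=out,
                    check=True,
                )
```

Since the output is appended per chunk, the combined file ends up equivalent to a single full-file run, just produced incrementally.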
