Reason for low coverage output for deepseq data #85
Comments
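According to this issue, specifying a region (and reference) is a workaround for this problem: #3 (Counting every base in the entire reference is going to be somewhat slow anyway, so splitting may sometimes be a good idea.)
What happens if you run bam-readcount -w0 -f mm9.fa -d1000000000 aligned_sorted_UNGKO_stim.bam chr1:1-999999999 (or actually substitute the length of the chr)?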
Hey Chris,
I tried selecting a defined region but it did not generate any output. Output was 0 bytes.
How should I proceed?
-Yasha
On Saturday, September 25, 2021 12:44 AM, Chris Miller wrote:
According to this issue, specifying a region (and reference) is a workaround for this problem: #3 (Counting every base in the entire reference is going to be somewhat slow anyway, so splitting may sometimes be a good idea)
What happens if you run bam-readcount -w0 -f mm9.fa -d1000000000 aligned_sorted_UNGKO_stim.bam chr1:1-999999999 (or actually substitute the length of the chr)
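For substituting the real chromosome lengths, one option is to read them from the FASTA index. This is only a sketch: `regions_from_fai` is a hypothetical helper, not part of bam-readcount, and it assumes an index file such as mm9.fa.fai already exists (for example, created with `samtools faidx mm9.fa`). The output file name in the commented loop is likewise just an illustration.

```shell
# regions_from_fai: hypothetical helper, not part of bam-readcount.
# Reads a FASTA index (.fai), whose first two tab-separated columns are
# the sequence name and its length, and prints one whole-sequence region
# per line in name:1-length form.
regions_from_fai() {
    awk -F '\t' '{ print $1 ":1-" $2 }' "$1"
}

# Possible use, one bam-readcount run per chromosome (assumed file names):
# regions_from_fai mm9.fa.fai | while read -r region; do
#     bam-readcount -w0 -f mm9.fa -d1000000000 aligned_sorted_UNGKO_stim.bam "$region" \
#         > "BRC_${region%%:*}.txt"
# done
```

Running per chromosome also makes it easier to see which region, if any, produces empty output.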
Do you have a small example bam file that you can post, along with a command that reproduces the problem?
Commands that I tried are as follows:
bam-readcount -w0 -f mm9.fa -d1000000000 aligned_sorted_UNGKO_stim.bam chr12:114663740-114595637 |awk -F ":|\t|=" 'BEGIN {OFS = "\t"}; {print $1, $2, $3 , $4, $21 , $35, $49 , $63}' > BRC_UNGKO_stimtest.txt
bam-readcount -w0 -f mm9.fa -d1000000000 aligned_sorted_UNGKO_stim.bam chr12:114663740-114595637 > BRC_UNGKO_stimtest1.txt
Following is the link to the bam file:
I am using mouse mm9 genome as the reference and the coordinates of interest for deepseq analysis are chr12:114663740-114595637.
Best regards,
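One detail in the region above may be worth checking: the start (114663740) is larger than the end (114595637). The thread does not establish whether that is related to the empty output, but it is cheap to rule out. Below is a minimal sketch of a check that could run before calling bam-readcount. `normalize_region` is a hypothetical helper, not a bam-readcount feature, and it assumes regions of the form chr:start-end.

```shell
# normalize_region: hypothetical helper, not part of bam-readcount.
# Prints the region with start <= end, swapping the two if needed.
# Returns 1 (with a message on stderr) if the region is malformed.
normalize_region() {
    region=$1
    case "$region" in
        *:*-*) ;;
        *) echo "malformed region: $region" >&2; return 1 ;;
    esac
    chrom=${region%:*}     # everything before the last colon
    range=${region##*:}    # everything after the last colon
    start=${range%-*}
    end=${range#*-}
    # Both coordinates must be non-empty and contain digits only.
    for n in "$start" "$end"; do
        case "$n" in
            ''|*[!0-9]*) echo "malformed region: $region" >&2; return 1 ;;
        esac
    done
    if [ "$start" -gt "$end" ]; then
        echo "note: start > end in $region, swapping" >&2
        printf '%s:%s-%s\n' "$chrom" "$end" "$start"
    else
        printf '%s\n' "$region"
    fi
}
```

For the region in this comment, the helper would print chr12:114595637-114663740, which could then be passed to the same bam-readcount command to see whether the output is still empty.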
Okay, I can reproduce this error. Thanks for the bug report - we'll look into it.
@apldx - can you take a look and see what's up? FWIW, the tmp.fa in there is just mm9 with "chr" prefixes added. I dropped bam and fasta to use on the local filesystem here for debugging purposes:
Hi
I am running the following command on my deepseq data to extract BAM read counts. My expected coverage is between 100,000 and 350,000, but the output shows much lower coverage at the deepseq genomic positions, around 9,000 at most. Why am I getting this low coverage? I even set -d to 1000000000.
bam-readcount -w0 -f mm9.fa -d1000000000 aligned_sorted_UNGKO_stim.bam |awk -F ":|\t|=" 'BEGIN {OFS = "\t"}; {print $1, $2, $3 , $4, $21 , $35, $49 , $63}' > BRC_UNGKO_stim.txt
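The awk call above selects counts by fixed field position ($21, $35, $49, $63), which only works if every per-base column has the same number of colon-separated fields. A sketch that looks the counts up by base name instead is shown here. It assumes the output layout of tab-separated columns (chromosome, position, reference base, depth) followed by per-base columns that begin with base:count. `extract_counts` is a hypothetical name, not something bam-readcount provides.

```shell
# extract_counts: hypothetical helper, not part of bam-readcount.
# Reads bam-readcount output on stdin and prints, tab-separated:
# chromosome, position, reference base, depth, then A, C, G, T counts.
extract_counts() {
    awk -F '\t' 'BEGIN { OFS = "\t" }
    {
        split("", count)                 # clear counts from the previous line
        for (i = 5; i <= NF; i++) {
            n = split($i, part, ":")     # per-base column: base:count:...
            if (n >= 2) count[part[1]] = part[2]
        }
        # "+ 0" turns a missing base into 0 instead of an empty string.
        print $1, $2, $3, $4, count["A"] + 0, count["C"] + 0, count["G"] + 0, count["T"] + 0
    }'
}

# Possible use with the command from this post:
# bam-readcount -w0 -f mm9.fa -d1000000000 aligned_sorted_UNGKO_stim.bam | extract_counts > BRC_UNGKO_stim.txt
```

Comparing the depth column with the sum of the four counts is a quick way to confirm the right fields are being picked up.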