Phase #53
Comments
Hi @duhuipeng ,
So in your case, search for Mat_hap2 in the 4th column. That is the locus where it got picked up as hap2.
See https://github.com/marbl/merqury/wiki/3.-Phasing-assessment-with-hap-mers#5-phased-block-statistics-and-switch-error-rates for more details.
The last column here is the number of k-mers found from the haplotype named in col. 4. So in your example, 1582 hap2 k-mers were found with no hap1 k-mers in one block, and 2234 hap2 k-mers with no hap1 k-mers in the other. -Arang
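As a rough illustration of the lookup described above, here is a minimal Python sketch that keeps the rows whose 4th column matches a haplotype label and reads the k-mer count from the last column. Only those two column positions come from the comment above; the function name `blocks_for_haplotype`, the sample rows, and every other column are hypothetical placeholders, not real Merqury output.

```python
def blocks_for_haplotype(lines, label):
    """Return (fields, kmer_count) for rows whose 4th column equals `label`.

    Assumes whitespace-separated rows where column 4 is the haplotype
    label and the last column is an integer k-mer count.
    """
    hits = []
    for line in lines:
        fields = line.split()
        # Skip short rows, header rows, and rows for other haplotypes.
        if len(fields) < 4 or fields[3] != label:
            continue
        hits.append((fields, int(fields[-1])))
    return hits


if __name__ == "__main__":
    # Hypothetical rows: only column 4 and the last column are meaningful here.
    example = [
        "ctg_x 0 100 Mat_hap1 900",
        "ctg_x 100 200 Mat_hap2 1582",
        "ctg_y 300 400 Mat_hap2 2234",
    ]
    for fields, count in blocks_for_haplotype(example, "Mat_hap2"):
        print(fields[0], count)
```

To use it on a real file, pass an open file handle instead of the list, since the function only iterates over lines.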
Dear author, if hap2 appears in hap1 and the num. switches is 0, then in theory these hap2 blocks should appear in the second phase, not in the first.
Merqury defines a phase block based on how many hap1 or hap2 k-mers are found. Note that the switch here is defined relative to a phased block, not the assembly. In a lot of assemblies, especially pseudo-haplotype assemblies, it is hard to assume that a given fasta was assembled from one haplotype. Therefore, Merqury reports the switch error rate based on what has been defined as a phase block, given how many haplotype-specific k-mers are seen. As your assembly seems to be completely phased already, it may be of interest to report the hamming error rate, which measures how often k-mers from the unexpected haplotype occur across the entire assembly rather than within phase blocks. It's easy to calculate that from *.hapmers.count.
So you can grep all lines from the hap1 assembly (given by the fasta name in the 1st col.) and count how many switches occurred with SUM(hap2) / SUM(total). That will give you the hamming error rate.
In most cases though, I feel the per-block switch error rate is more important. As shown in your case, there were only two small blocks that were entirely switched. The hamming error rate does not account for 'where' the switches are. Switches spread all over the assembly can be more problematic than switches confined to one block. When they are confined to a block, the chance of corrupting a gene through chimeric haplotype joins goes down.
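The SUM(hap2) / SUM(total) calculation described above can be sketched in Python as follows. This is only a sketch under assumptions: the comment confirms that the fasta name is in the 1st column, but the positions of the hap1 and hap2 count columns (indices 2 and 3 here) and the choice of total = hap1 + hap2 are guesses to check against your own *.hapmers.count file. The function name `hamming_error_rate` and the sample rows are hypothetical.

```python
def hamming_error_rate(lines, assembly, expected_col=2, unexpected_col=3):
    """Fraction of hap-mers from the unexpected haplotype in one assembly.

    `expected_col` and `unexpected_col` are 0-based column indices and are
    assumptions about the file layout; adjust them to match your file.
    """
    expected = unexpected = 0
    for line in lines:
        fields = line.split()
        if len(fields) <= max(expected_col, unexpected_col):
            continue  # too few columns to read both counts
        if fields[0] != assembly:
            continue  # row belongs to another fasta (1st column)
        try:
            e = int(fields[expected_col])
            u = int(fields[unexpected_col])
        except ValueError:
            continue  # non-numeric row, e.g. a header
        expected += e
        unexpected += u
    total = expected + unexpected
    return unexpected / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical rows: assembly name, contig, hap1 count, hap2 count.
    example = [
        "hap1_asm ctg1 1000 10",
        "hap1_asm ctg2 500 0",
        "hap2_asm ctg3 5 800",
    ]
    print(hamming_error_rate(example, "hap1_asm"))
```

For the hap2 assembly the expected and unexpected columns swap, so you would call it with `expected_col=3, unexpected_col=2`.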
What you understood is correct.
What you are asking is beyond what Merqury does. Merqury is a tool designed to detect these kinds of switch errors and to help you localize where the error is.
I got it.
Dear author,
I would like to ask about the place in the picture where the color is inconsistent (where the contig changes). Which generated file tells me which contig it is?
Personally, judging from the picture, this switch error looks somewhat serious.
Last question:
Why does the generated switches.txt show only the switch error results for one phase? Where can I see the other phase?