Identify linked mutations from mapped reads #357

spleonard1 · 2023-09-08T23:35:18Z

I'm butchering breseq's intended use case: I'm identifying gene mutants that arose during high-throughput gene variant synthesis and tracking their abundance over a short, selective time course (< 48 hours). In many cases there are multiple sets of linked mutations, which can clearly be seen in the read-mapping evidence.

Is it possible to identify which mutations occur together on a single read? Does breseq keep track of which unique reads support particular mutation calls? Right now I am using some frequency correlations to loosely link mutations, but it would be nice to parse which reads support which mutations to confidently link them.

I have attached a couple representative pictures. Not a bug, just a discussion / feature request. Thanks!

jeffreybarrick · 2023-09-09T14:22:07Z

breseq does not track linkage of mutations by read—not even in simple cases where there are base substitutions side-by-side (which is annoying). I can imagine a post-processing step that could go back and do this, at least in simple clear-cut cases like this.

If someone wanted to add this to breseq, they could pilot the step with a program that parses the output reference.bam file, looks at the read alignment columns referred to by RA evidence items that are within one read length of one another, and counts how many times the mutations are and are not within the same read. There could be some new field in the output GD file like "haplotype=XXXX" that could be used to group linked mutations.
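As a rough illustration of the counting step described above (not breseq code, and not an existing breseq feature), here is a minimal Python sketch. It assumes the per-read bases at the RA positions have already been extracted from reference.bam by some BAM-reading library. The function name `count_linkage`, the input layout, and the category labels are all hypothetical, and only base substitutions are handled.

```python
from collections import Counter
from itertools import combinations


def count_linkage(reads, variants, read_length):
    """Count how often pairs of nearby mutations share a read.

    reads       -- iterable of dicts, one per read, mapping a reference
                   position to the base that read shows there (assumed to
                   have been pulled from the BAM file beforehand)
    variants    -- dict mapping reference position -> mutant base, e.g.
                   taken from RA evidence items (hypothetical layout)
    read_length -- only pairs closer together than this are considered
    """
    # Keep only pairs of sites that a single read could span.
    pairs = [
        (a, b)
        for a, b in combinations(sorted(variants), 2)
        if b - a < read_length
    ]
    counts = {pair: Counter() for pair in pairs}

    for bases in reads:
        for a, b in pairs:
            if a not in bases or b not in bases:
                continue  # this read does not cover both sites
            has_a = bases[a] == variants[a]
            has_b = bases[b] == variants[b]
            if has_a and has_b:
                counts[(a, b)]["both"] += 1
            elif has_a:
                counts[(a, b)]["only_a"] += 1
            elif has_b:
                counts[(a, b)]["only_b"] += 1
            else:
                counts[(a, b)]["neither"] += 1
    return counts


# Toy data, invented for illustration only.
toy_variants = {100: "T", 130: "G", 900: "A"}
toy_reads = [
    {100: "T", 130: "G"},  # carries both mutations
    {100: "T", 130: "G"},  # carries both mutations
    {100: "C", 130: "A"},  # carries neither
    {100: "T", 130: "A"},  # carries only the first
    {100: "T"},            # does not reach position 130, so it is skipped
]
toy_counts = count_linkage(toy_reads, toy_variants, read_length=150)
for pair, tally in toy_counts.items():
    print(pair, dict(tally))
```

Pairs with many "both" and "neither" reads and few "only" reads would be candidates for grouping under a shared haplotype label. How many discordant reads to tolerate is a judgment call that this sketch leaves open.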

Since this is unlikely to happen in the near future, maybe you could look into haplotype reconstruction programs used for virus genomes (and mixtures of those) to see if any of them can give you this kind of output?

spleonard1 · 2023-09-09T16:12:54Z

Oooh that’s a good idea re virus haplotyping approaches, thanks!

jeffreybarrick added the feature-request label Sep 11, 2023
