Feature request: support deduplication based on [contig]:[R1 coordinate] only #22

Open
ijhoskins opened this issue Jul 8, 2020 · 3 comments

Comments

@ijhoskins

I am glad to see this tool for generating consensus reads. Unfortunately, it does not work with some types of data, for example where the library prep chemistry allows multiple read pairs with differing template lengths (differing R1-R2 spans) to be read off the same fragment. One example of such chemistry is Anchored Multiplex PCR: https://www.nature.com/articles/nm.3729

To support this, you would cluster on the start position of R1 alone and leave the right_pos of the pair out of the clustering key. Is this something that could be easily supported via a command line flag?
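
As a rough illustration of the requested behaviour, here is a minimal Python sketch. It is not gencore's code: `ReadPair`, `cluster_key`, `cluster_pairs`, and the `use_right_pos` switch are hypothetical names, and including the UMI in the key is an assumption. It only shows how dropping right_pos from the key merges pairs that share an R1 start but differ in template length.

```python
from collections import defaultdict
from typing import NamedTuple


class ReadPair(NamedTuple):
    """Hypothetical, simplified view of an aligned read pair."""
    name: str
    contig: str
    r1_start: int   # start coordinate of R1
    right_pos: int  # far end of the pair (R2 side)
    umi: str


def cluster_key(pair, use_right_pos=True):
    """Build the grouping key; right_pos is optional (assumed behaviour)."""
    if use_right_pos:
        return (pair.contig, pair.r1_start, pair.right_pos, pair.umi)
    return (pair.contig, pair.r1_start, pair.umi)


def cluster_pairs(pairs, use_right_pos=True):
    """Group read pair names by the chosen key."""
    clusters = defaultdict(list)
    for pair in pairs:
        clusters[cluster_key(pair, use_right_pos)].append(pair.name)
    return dict(clusters)


# Pairs "a" and "b" share contig, R1 start, and UMI but differ in right_pos.
pairs = [
    ReadPair("a", "chr1", 100, 250, "ACGT"),
    ReadPair("b", "chr1", 100, 310, "ACGT"),
    ReadPair("c", "chr1", 100, 250, "TTTT"),
]

with_span = cluster_pairs(pairs, use_right_pos=True)
r1_only = cluster_pairs(pairs, use_right_pos=False)
```

With right_pos in the key, "a" and "b" land in separate clusters; with the R1-only key they fall into one cluster, which is the grouping this request asks for.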

@TomaszSuchan

I upvote this. If I understand correctly, it would make gencore behave the same as Picard does for single-end reads.

@SPPearce

I would also be very interested in this. My data has slightly different template lengths, so I would like to make consensus reads with respect to just the UMI and R1.

@mel9320107

I agree, this would be great. I'm working with PacBio sequencing of a short sequence with barcoded (UMI) variants, and I think this tool would work well for building a consensus sequence of the variant for each barcode, giving a barcode-to-variant lookup table.
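
A minimal Python sketch of such a lookup table, under strong assumptions: reads arrive as hypothetical `(barcode, sequence)` tuples, and all sequences for a barcode are already the same length and in register, so a column-wise majority vote is enough. Real PacBio reads contain indels and would need alignment first. The names `consensus` and `barcode_lookup` are illustrative and not part of gencore.

```python
from collections import Counter, defaultdict


def consensus(seqs):
    """Column-wise majority vote.

    Assumes equal-length, pre-aligned sequences; zip() silently truncates
    to the shortest one otherwise.
    """
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def barcode_lookup(reads):
    """Map each barcode to the consensus of its variant sequences."""
    by_barcode = defaultdict(list)
    for barcode, seq in reads:
        by_barcode[barcode].append(seq)
    return {bc: consensus(seqs) for bc, seqs in by_barcode.items()}


# Toy input: three reads for one barcode (one carrying an error), one for another.
reads = [
    ("AAAC", "ACGT"),
    ("AAAC", "ACGT"),
    ("AAAC", "ACTT"),
    ("GGGT", "TTGA"),
]

table = barcode_lookup(reads)
```

Here the single mismatching base in the third "AAAC" read is outvoted, so that barcode maps to "ACGT".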
