Problems in gff #52

Open
mpovidlov1 opened this issue Apr 4, 2022 · 6 comments

Comments

@mpovidlov1

I was looking at the gene annotation files, in particular http://courtyard.gi.ucsc.edu/~mhauknes/T2T/t2t_Y/annotation_set/CHM13.v2.0.gff3
The file appears to contain multiple problems, mostly consecutive exons that touch each other, which leaves an intron of size 0 between them.
I can send examples.

@mpovidlov1
Author

@snurk ?

@skoren
Member

skoren commented Apr 8, 2022

The annotations come from liftoff/CAT, so this is more a question for @mhaukness-ucsc or @diekhans. Are these similar to the issues raised in #31 and #37?

@mpovidlov1
Author

Thanks. The other issues mention other problems with earlier versions of the annotation files. Mine is quite specific. The records define exons like this (start end):
100 200
201 300

which means that the intron between them has size 0: the second exon starts on the base immediately after the first one ends.
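
To make the arithmetic concrete: GFF3 coordinates are 1-based and inclusive, so the intron between two consecutive exons has length `next_start - prev_end - 1`. Here is a minimal sketch of that check. The function name `intron_lengths` is only for illustration and is not part of liftoff, CAT, or any other tool mentioned in this thread.

```python
def intron_lengths(exons):
    """Return the intron lengths between consecutive exons.

    `exons` is a list of (start, end) pairs in GFF3 coordinates
    (1-based, inclusive). Exons are sorted by start before comparing.
    """
    ordered = sorted(exons)
    return [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


# The two exons from the example above touch, so the intron has length 0.
print(intron_lengths([(100, 200), (201, 300)]))  # [0]
```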

@mhaukness-ucsc

Hi @mpovidlov1, could you please provide some examples? I think this is likely a result of errors present in the original GENCODE annotations, but I'll look into it.

@mpovidlov1
Author

Here is an example of the first problematic gene, which starts on line 12 (columns: start, end, feature type, strand):

[problems.txt](https://github.com/marbl/CHM13/files/8455851/problems.txt)
111903 112896 transcript -
111903 112498 exon -
111940 112498 CDS -
111940 111942 stop_codon -
112499 112896 exon -
112499 112877 CDS -
112875 112877 start_codon -

I have a list of more than 200 problematic genes, referenced by line number (attached).
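
For anyone who wants to reproduce such a list, here is a rough sketch of a scan for touching exons. It assumes a standard 9-column, tab-separated GFF3 file in which each exon carries a `Parent=<transcript id>` attribute; I have not verified that assumption against every record in this file. The function name `find_touching_exons`, the sequence name `chrN`, and the transcript ID `tx1` in the demo are placeholders; only the coordinates come from the example above.

```python
import re
from collections import defaultdict


def find_touching_exons(gff_lines):
    """Yield (transcript_id, prev_exon, next_exon) for each 0-length intron.

    Exons are grouped by (seqid, Parent) and compared in coordinate order.
    """
    exons = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        match = re.search(r"(?:^|;)Parent=([^;]+)", cols[8])
        if match is None:
            continue
        # A Parent attribute may list several transcripts, comma-separated.
        for parent in match.group(1).split(","):
            exons[(cols[0], parent)].append((int(cols[3]), int(cols[4])))

    for (_, parent), spans in exons.items():
        spans.sort()
        for prev, nxt in zip(spans, spans[1:]):
            # 1-based inclusive coordinates: intron length = start - end - 1
            if nxt[0] - prev[1] - 1 == 0:
                yield parent, prev, nxt


# Demo using the two exons from the example above (IDs are made up).
demo = [
    "##gff-version 3\n",
    "chrN\tCAT\texon\t111903\t112498\t.\t-\t.\tID=e1;Parent=tx1\n",
    "chrN\tCAT\texon\t112499\t112896\t.\t-\t.\tID=e2;Parent=tx1\n",
]
for hit in find_touching_exons(demo):
    print(*hit)  # tx1 (111903, 112498) (112499, 112896)
```

To run it on a real file, pass an open file handle instead of `demo`, since the function only iterates over lines.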

@diekhans

Issue moved to CAT repo:
ComparativeGenomicsToolkit/Comparative-Annotation-Toolkit#285


4 participants