
URI validation: error with comma and semicolon #7

Open
knit-bee opened this issue Feb 1, 2023 · 5 comments


knit-bee commented Feb 1, 2023

Hi @hartwork ,
I noticed that rnv doesn't validate URIs that contain a comma or semicolon.
(This has been reported as a bug on the SourceForge page before.)
I was wondering if there is a chance that this behavior will be fixed in the (near) future?

Cheers, Luise

Owner

hartwork commented Feb 1, 2023

Hi @knit-bee! Proper URI validation à la RFC 3986 takes something like https://github.com/uriparser/uriparser . While I happen to be the main developer of uriparser, my role in rnv has mostly been build system fixes, I have other duties, and rnv has been in maintenance mode without truly active development since roughly 2006. I see these ways forward:

  • a) Someone provides a clean and well-tested pull request integrating uriparser,
  • b) Someone sponsors this feature, resulting in a clean and well-tested pull request integrating uriparser, or
  • c) Users replace rnv with something that is actively developed rather than in maintenance mode.

Best, Sebastian

PS: My GitHub profile has my e-mail contact for anything that would not fit a public reply on GitHub.

Author

knit-bee commented Feb 2, 2023

Hi @hartwork ,
thank you for the quick reply and for clarifying the situation!

The ways forward you propose all seem reasonable to me. Unfortunately, I'm not proficient in C; otherwise I would try to integrate uriparser myself.

Maybe we can keep this issue open in the hope that someone else stumbles upon it who has the time and skill to tackle it.

Best, Luise

Owner

hartwork commented Feb 2, 2023

@knit-bee Sure, let's keep it open. Is my understanding correct that this is about false negatives, i.e. you have a well-formed URI that contains a comma or semicolon, and it should be accepted but is instead rejected as invalid?

Author

knit-bee commented Feb 3, 2023

Yes, exactly!


ulf1 commented Apr 18, 2023

Is this the broken regex pattern?

rnv/xsd.c, line 298 in 2a7236d:

#define PAT_ANY_URI "(([a-zA-Z][0-9a-zA-Z+\\-\\.]*:)?/{0,2}[0-9a-zA-Z;/?:@&=+$\\.\\-_!~*'()%]+)?(#[0-9a-zA-Z;/?:@&=+$\\.\\-_!~*'()%]+)?"
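
One way to probe that question without running rnv is to check characters against the character class exactly as it is written in the macro. The sketch below does only that: it copies the punctuation listed in the class and reports the first character of a string that falls outside it. The names `URI_CLASS_PUNCT`, `in_uri_class` and `first_rejected_char` are made up for illustration and are not part of rnv. It assumes ASCII, ignores the scheme and `#fragment` structure of the pattern, and does not model rnv's own regex engine, so it says nothing certain about how rnv treats `;` at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Punctuation members of the character class in PAT_ANY_URI, in addition to
 * 0-9, a-z and A-Z. Copied from the macro quoted above with the C and regex
 * escaping removed. Hypothetical helper data, not taken from rnv. */
const char URI_CLASS_PUNCT[] = ";/?:@&=+$.-_!~*'()%";

/* Returns 1 if c is a member of the class as written, 0 otherwise. */
int in_uri_class(char c) {
    if ((c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
        (c >= 'A' && c <= 'Z'))
        return 1;
    /* strchr would also match the terminating NUL, so exclude it. */
    return c != '\0' && strchr(URI_CLASS_PUNCT, c) != NULL;
}

/* Returns a pointer to the first character of s that is outside the class,
 * or NULL if every character is inside it. */
const char *first_rejected_char(const char *s) {
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (!in_uri_class(*s))
            return s;
    }
    return NULL;
}
```

Calling `first_rejected_char("a,b")` returns a pointer to the comma, while `first_rejected_char("a;b")` returns `NULL`. Read literally, then, the class lists `;` but not `,`, which would explain the rejected commas; the reported semicolon failures would need a different explanation, for example in how the pattern is interpreted by rnv.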
