Add syntax tests for codepoint escaping. #151

Open
wants to merge 1 commit into
base: main

kasei
Contributor

@kasei kasei commented Oct 28, 2024

Adds new tests for some interesting cases of Unicode codepoint escaping, addressing w3c/sparql-query#164. Two tests (codepoint-esc-01 and codepoint-esc-10) are marked in the manifest with TODO markers because they depend on decisions about how systems should handle invalid escape sequences. I believe the others accurately test the existing spec text of SPARQL 1.1.

I think many of these cases should also be turned into evaluation tests, to ensure the unescaping is being performed correctly, but I'll leave that for another PR (or a subsequent update to this PR).

@kasei kasei requested a review from afs October 28, 2024 18:07
@gkellogg
Member

If you create the branch for the PR in the rdf-tests repo, the automatic report generation should work properly. It's conceivable that there is a different package that allows pushing the changes to a remote repo, or some filter that would prevent running that action if the repo is not local.

@afs
Contributor

afs commented Oct 30, 2024

Turtle handles Unicode escape sequences differently - it has UCHAR in the grammar and it can occur only in strings and URIs. Personally, I think this is a better design - a more common pattern, and it makes it clear what happens when the codepoint itself is meaningful near an escape sequence. I believe this should be "good practice".

The fact that obfuscated queries can be written in SPARQL is not good.
\u0041\u0053\u004B\u0020\u007B\u007D (codepoint-esc-09.rq) (that's ASK {})
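To make the decoding concrete, here is a minimal Python sketch of a pre-parse pass that replaces `\uXXXX` and `\UXXXXXXXX` sequences anywhere in the query text. The function name `unescape_codepoints` and the single left-to-right pass are assumptions made for illustration, not wording taken from the spec, and the sketch deliberately ignores invalid escapes (the open question behind the TODO-marked tests).

```python
import re

# Matches \uXXXX (4 hex digits) or \UXXXXXXXX (8 hex digits).
_CODEPOINT_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def unescape_codepoints(query: str) -> str:
    """Replace codepoint escapes in one left-to-right pass (hypothetical helper).

    Replaced text is not rescanned. Values above U+10FFFF make chr() raise
    ValueError; how such input should be treated is left open here.
    """
    def _replace(match: re.Match) -> str:
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))

    return _CODEPOINT_ESCAPE.sub(_replace, query)


# The text of codepoint-esc-09.rq as quoted above.
obfuscated = r"\u0041\u0053\u004B\u0020\u007B\u007D"
print(unescape_codepoints(obfuscated))  # ASK {}
```

Because the pass runs over the raw text before tokenizing, keywords and punctuation can be spelled entirely with escapes, which is what makes the obfuscated form possible.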

And it is bad for streaming (SPARQL Update more than SPARQL Query).

The text of section 19.2, Codepoint Escape Sequences, isn't precise about how replacement happens. These are errata that need to be addressed in the spec.
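As one illustration of why the replacement procedure needs to be pinned down, the Python sketch below compares two plausible strategies on an input whose escape decodes to a backslash. This is an assumed example of the kind of ambiguity involved, not necessarily the specific gap meant here, and the names `replace_single_pass` and `replace_until_stable` are hypothetical.

```python
import re

# Only the 4-digit form is needed for this illustration.
_ESCAPE_4 = re.compile(r"\\u([0-9A-Fa-f]{4})")


def replace_single_pass(text: str) -> str:
    """Decode escapes once; text produced by a replacement is not rescanned."""
    return _ESCAPE_4.sub(lambda m: chr(int(m.group(1), 16)), text)


def replace_until_stable(text: str) -> str:
    """Decode repeatedly until a pass changes nothing."""
    while True:
        result = replace_single_pass(text)
        if result == text:
            return result
        text = result


# \u005C is a backslash, so decoding it yields text that looks like a new escape.
source = r"\u005Cu0041"
print(replace_single_pass(source))   # \u0041
print(replace_until_stable(source))  # A
```

The two strategies disagree on this input, so implementations that read the section differently could accept or interpret the same query differently.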

We could split tests into two: "what we want", that is good practice (to be agreed), and "full spec".

Surveying existing systems:

  • Codemirror/YASGUI for SPARQL does not seem to support this.

@gkellogg gkellogg added the SPARQL label Nov 7, 2024