ICU-23314 UnicodeSet: extended name escapes #3850
aryanraj45 wants to merge 1 commit into unicode-org:main
Conversation
    e.getMessage().contains("out of range"));
    }

    // Test that standard \N{name} still works (backward compatibility)
As noted on JIRA, this PR is moot, but since I am looking at it: This isn’t for backward compatibility; these just serve different purposes. Sometimes you just want to refer to the character LATIN CAPITAL LETTER A and you don’t actually care what the code point is; the name alone is more readable.
In other cases, you care about the code point (what motivated that is tooling used to develop the Unicode Character Database; for new characters we absolutely want to check that we are putting them in the right place).
Likewise for the hex:literal:name version: sometimes you might want to illustrate what the character is, other times you might not need to (or it might be impractical, e.g., for control characters).
Thanks for the explanation! I understand now: since the parser is being rewritten, my changes would conflict with that work.
I appreciate you taking the time to review and explain the context. I'll close this PR and look for other issues where I can contribute more effectively.
Sorry for not catching this earlier!
Implements support for extended name escapes in UnicodeSet patterns as specified in ICU-23314.
### Changes
This PR adds support for the `\N{hex:name}` syntax in UnicodeSet patterns, allowing users to specify both the hexadecimal code point and its Unicode name for validation purposes.
### Implementation Details
- `applyPropertyPattern()` method in `UnicodeSet.java` to parse the new `hex:name` format
- `\N{...}`, the format is parsed as `hex:name`
- `IllegalArgumentException` is thrown
- `\N{name}` syntax continues to work
### Testing
- `UnicodeSetTest.TestExtendedNameEscapes()`
### Example Usage
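As a rough illustration of the `hex:name` check described above (not the code from this PR, and not ICU API), the sketch below resolves the body of a `\N{...}` escape using only the JDK. The class `ExtendedNameEscapeSketch` and its `resolve` method are hypothetical names, and the exact error messages are assumptions; it relies on `Character.getName(int)` and `Character.codePointOf(String)`, so it assumes Java 9 or later and the JDK's own Unicode name data rather than ICU's.

```java
// Hypothetical helper, not part of ICU: resolves the text between the
// braces of \N{...}, accepting either "name" or "hex:name".
class ExtendedNameEscapeSketch {

    static int resolve(String body) {
        int colon = body.indexOf(':');
        if (colon < 0) {
            // Plain \N{name}: the name alone identifies the character.
            // Character.codePointOf throws IllegalArgumentException
            // when the name is unknown.
            return Character.codePointOf(body.trim());
        }

        String hex = body.substring(0, colon).trim();
        String name = body.substring(colon + 1).trim();

        int codePoint;
        try {
            codePoint = Integer.parseInt(hex, 16);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid hex code point: " + hex, e);
        }
        if (codePoint < 0 || codePoint > Character.MAX_CODE_POINT) {
            // Message wording is an assumption made for this sketch.
            throw new IllegalArgumentException("Code point out of range: " + hex);
        }

        // Validate that the stated name matches the stated code point.
        String actual = Character.getName(codePoint);
        if (actual == null || !actual.equalsIgnoreCase(name)) {
            throw new IllegalArgumentException(
                "Name mismatch for U+" + hex + ": expected " + actual + ", got " + name);
        }
        return codePoint;
    }
}
```

With this sketch, `resolve("41:LATIN CAPITAL LETTER A")` and `resolve("LATIN CAPITAL LETTER A")` both yield U+0041, while `resolve("42:LATIN CAPITAL LETTER A")` throws because the name does not belong to U+0042. That is the placement check the reviewer describes: the name stays readable, and the hex prefix guards against the character sitting at the wrong code point.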
### Checklist