Unsupport Unicode #1077

Open
reverofevil opened this issue Nov 27, 2024 · 0 comments
reverofevil commented Nov 27, 2024

Supporting Unicode in the Tact grammar with security and IDE support in mind is not a task we want to spend any time on. There are not that many use cases for Unicode in contracts in the first place. Worse, JS doesn't have native UTF-8/16 support, and even ohm.js doesn't correctly handle surrogate pairs.
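To illustrate the surrogate-pair point: JavaScript strings are indexed by UTF-16 code units, so a single character outside the Basic Multilingual Plane occupies two positions. The sketch below only demonstrates that language-level behavior; the helper name `utf16CodeUnits` is made up for illustration, and nothing here reproduces ohm.js internals.

```typescript
// Hypothetical helper: list the UTF-16 code units of a string.
function utf16CodeUnits(s: string): number[] {
    const units: number[] = [];
    for (let i = 0; i < s.length; i++) {
        units.push(s.charCodeAt(i));
    }
    return units;
}

// U+1F600 written as its surrogate pair.
const face = "\uD83D\uDE00";

console.log(face.length);          // 2: one character, two code units
console.log(utf16CodeUnits(face)); // [0xd83d, 0xde00], printed in decimal
```

Any tool that walks a string one index at a time sees two separate "characters" here, which is where column numbers and single-character matches tend to go wrong.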

Goal:

  • ban Unicode (non-ASCII) characters everywhere in the grammar
  • but allow them in strings and comments
  • but even in strings and comments, ban all characters that can change code layout: all line breaks except \n, and all RTL/LTR (bidirectional control) characters
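The rules above could be sketched roughly as two checks, one for ordinary code and one for the contents of strings and comments. This is only an illustration under assumptions: every name below is hypothetical, the exact sets of "line break" and "RTL/LTR" characters are a guess at what the goal means, and splitting source text into code versus string/comment regions is left to the parser.

```typescript
// Hypothetical names throughout; this is not code from the Tact compiler.
interface Violation {
    offset: number;   // index in UTF-16 code units
    codeUnit: number; // the offending code unit
    reason: string;
}

// Assumed set of line-breaking characters other than "\n" (U+000A):
// VT, FF, CR, NEL, LINE SEPARATOR, PARAGRAPH SEPARATOR.
const FORBIDDEN_LINE_BREAKS: ReadonlyArray<number> = [
    0x000b, 0x000c, 0x000d, 0x0085, 0x2028, 0x2029,
];

// Assumed set of bidirectional formatting characters:
// ALM, LRM, RLM, LRE, RLE, PDF, LRO, RLO, LRI, RLI, FSI, PDI.
const BIDI_CONTROLS: ReadonlyArray<number> = [
    0x061c, 0x200e, 0x200f,
    0x202a, 0x202b, 0x202c, 0x202d, 0x202e,
    0x2066, 0x2067, 0x2068, 0x2069,
];

// Outside strings and comments: reject anything beyond ASCII.
function firstViolationInCode(text: string): Violation | null {
    for (let i = 0; i < text.length; i++) {
        const codeUnit = text.charCodeAt(i);
        if (codeUnit > 0x7f) {
            return { offset: i, codeUnit, reason: "non-ASCII character in code" };
        }
    }
    return null;
}

// Inside strings and comments: allow non-ASCII, but reject characters
// that can change how the code is laid out on screen.
function firstViolationInStringOrComment(text: string): Violation | null {
    for (let i = 0; i < text.length; i++) {
        const codeUnit = text.charCodeAt(i);
        if (FORBIDDEN_LINE_BREAKS.indexOf(codeUnit) !== -1) {
            return { offset: i, codeUnit, reason: "line break other than \\n" };
        }
        if (BIDI_CONTROLS.indexOf(codeUnit) !== -1) {
            return { offset: i, codeUnit, reason: "bidirectional control character" };
        }
    }
    return null;
}
```

All of the forbidden characters listed here sit in the Basic Multilingual Plane, so comparing individual UTF-16 code units is enough and the check never has to decode surrogate pairs.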

For reference, there is also the Unicode Source Code Handling technical standard (UTS #55): https://www.unicode.org/reports/tr55.


3 participants