- Identifiers
  - 💯: follows UAX-31.
  - ✓: allows Unicode in some form.
  - ✗: does not allow Unicode.
- Char type
  - 💯: can represent all emoji (e.g. the language does not distinguish between characters and strings).
  - ✓: can represent all code points.
  - ✗: cannot represent all code points.
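
The three "Char type" grades can be made concrete by counting how many units one piece of text needs under each model. The following is a minimal sketch in Python; the helper name `utf16_code_units` and the sample strings are chosen for illustration and are not part of the comparison itself.

```python
# Count the units that different "char type" models need for the same text.

def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units needed to encode `text` as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


grinning = "\U0001F600"                  # 😀, one code point outside the BMP
thumbs_up_dark = "\U0001F44D\U0001F3FF"  # 👍🏿, base emoji plus skin-tone modifier

# Python strings are sequences of code points, so len() counts code points.
print(len(grinning), utf16_code_units(grinning))              # 1 2
print(len(thumbs_up_dark), utf16_code_units(thumbs_up_dark))  # 2 4
```

A 16-bit char (✗) cannot hold 😀, because it needs two code units. A char that stores one code point (✓) holds 😀 but not 👍🏿, which is two code points. Only a type that can hold a whole sequence, such as a string or a grapheme cluster, covers every emoji (💯).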
Language | Identifiers | Char type |
C# | ✓ L|Nl|'_' (L|Pc|Nd|Nl|Mn|Mc|Cf)* | ✗ (UTF-16 code unit) |
Haskell | ✓ Ll|Lu|Lt (Ll|Lu|Lt|Nd|'_'|''')* | ✓ |
Java | ✓ L|Nl|Sc|Pc (L|Sc|Pc|Nd|Nl|Mn|Mc|Cf|Cc)* | ✗ (UTF-16 code unit) |
Python 3 | 💯 XID_Start XID_Continue* | 💯 |
Swift | ✓ (see link) | ✓ (can represent extended grapheme clusters as well) |
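
The identifier rules in the table are patterns over Unicode general categories, so they can be checked mechanically. Below is a rough sketch in Python: `str.isidentifier()` applies Python's own rule, while `matches_csharp_row` is a hypothetical helper that transcribes the C# row exactly as written above. It works per code point rather than per UTF-16 code unit, and it is not a reimplementation of the C# compiler's lexer.

```python
import unicodedata

# General categories abbreviated as "L" in the table, plus Nl.
START = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
# Categories allowed after the first character in the C# row.
CONTINUE = START | {"Pc", "Nd", "Mn", "Mc", "Cf"}


def matches_csharp_row(name: str) -> bool:
    """Check `name` against L|Nl|'_' (L|Pc|Nd|Nl|Mn|Mc|Cf)* from the table."""
    if not name:
        return False
    first, rest = name[0], name[1:]
    if first != "_" and unicodedata.category(first) not in START:
        return False
    return all(unicodedata.category(ch) in CONTINUE for ch in rest)


for candidate in ["naïve_2", "名前", "2naïve", "a-b"]:
    # Left: Python's rule. Right: the C# row as transcribed.
    print(candidate, candidate.isidentifier(), matches_csharp_row(candidate))
```

Note that `str.isidentifier()` only checks the lexical shape, so it also returns `True` for reserved words such as `class`.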