@zamderax zamderax commented Dec 25, 2025

Emojis were treated as width 1, but they should be treated as width 2. The old behavior shifted a table row containing an emoji off by one space.
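To illustrate the misalignment described above, here is a minimal sketch (not the code in this PR). The `pad` helper and the two width closures are hypothetical names introduced only for this example, and the "wide" rule is simplified to scalars that default to emoji presentation.

```swift
/// Hypothetical helper: pads `text` with trailing spaces up to `columns`,
/// measuring each character with the supplied width function.
func pad(_ text: String, to columns: Int, width: (Character) -> Int) -> String {
    let used = text.reduce(0) { $0 + width($1) }
    return text + String(repeating: " ", count: max(0, columns - used))
}

// Old assumption: every character occupies one column.
let narrowWidth: (Character) -> Int = { _ in 1 }

// Simplified new assumption: emoji-presentation characters occupy two columns.
let wideWidth: (Character) -> Int = { character in
    character.unicodeScalars.contains { $0.properties.isEmojiPresentation } ? 2 : 1
}

let cell = "\u{2705} ok"  // check-mark emoji, space, "ok"

// With width 1 the cell gets one padding space too many, so the closing
// border lands one column too far right on terminals that draw the emoji
// two columns wide.
print(pad(cell, to: 8, width: narrowWidth) + "|")
print(pad(cell, to: 8, width: wideWidth) + "|")
```

The two padded strings differ by exactly one space, which is the one-column shift the table rows showed.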

@zamderax zamderax requested a review from a team as a code owner December 25, 2025 20:27
@zamderax zamderax requested review from cschmatzler and fortmarek and removed request for a team December 25, 2025 20:27
@dosubot dosubot bot added the size:XS (This PR changes 0-9 lines, ignoring generated files.) and bug (Something isn't working) labels Dec 25, 2025
@zamderax zamderax changed the title from "Fix emoji width detection for table alignment" to "fix: correct emoji width for table alignment" Dec 25, 2025
///
/// There is no standard for this, but it seems like most terminals treat
/// emojis and ideographs as double width.
public var displayWidth: Int {

@pepicrft pepicrft Dec 26, 2025

I'd suggest adjusting the implementation to use wcwidth, a POSIX C function that returns the number of columns a character occupies in a terminal. Terminals, shells, and CLI tools like ls and vim use it:

#if canImport(Darwin)
import Darwin
#elseif canImport(Glibc)
import Glibc
#endif

extension Character {
    public var displayWidth: Int {
        unicodeScalars.reduce(0) { total, scalar in
            let w = wcwidth(wchar_t(scalar.value))
            // wcwidth returns -1 for non-printable, treat as 0
            return total + max(0, Int(w))
        }
    }
}

Contributor Author

Here are the problems I ran into with wcwidth:

  1. Emoji sequences - Characters like 👨‍👩‍👦 are actually multiple Unicode scalars joined by Zero-Width Joiners (ZWJ):
    👨 + ZWJ + 👩 + ZWJ + 👦 = 👨‍👩‍👦
     wcwidth sees each piece separately and sums them (about 6 here if each emoji counts as 2), but terminals render the sequence as one 2-column character.
  2. Variation selectors - ✓ (U+2713) vs ✓️ (U+2713 + U+FE0F). The second has a variation selector that tells the terminal "render this as emoji" (width 2), but wcwidth ignores it.
  3. Flag emoji - 🇺🇸 is two "regional indicator" characters. wcwidth returns -1 (unknown) for each, but terminals show a single 2-column flag.

What do you think we should do?
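One possible direction, sketched here under stated assumptions rather than as the PR's actual change: measure the whole grapheme cluster (a Swift `Character`) instead of summing its scalars. The property name `sketchDisplayWidth` is hypothetical. The sketch deliberately ignores ideographs, combining marks, and control characters, and it assumes the terminal draws every emoji cluster two columns wide.

```swift
extension Character {
    /// Hypothetical sketch: width of one grapheme cluster, covering only
    /// the three emoji cases listed above.
    var sketchDisplayWidth: Int {
        // U+FE0F (variation selector-16) requests emoji presentation.
        let hasEmojiSelector = unicodeScalars.contains { $0.value == 0xFE0F }
        // Scalars that default to emoji presentation; in Unicode's emoji
        // data this set includes the regional indicators used for flags.
        let hasEmojiPresentation = unicodeScalars.contains {
            $0.properties.isEmojiPresentation
        }
        // The cluster is counted once, so a ZWJ sequence is 2, not a sum.
        return (hasEmojiSelector || hasEmojiPresentation) ? 2 : 1
    }
}

// Scalars are written as escapes so the joiners and selectors are visible.
let family: Character = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F466}"  // ZWJ sequence
let textCheck: Character = "\u{2713}"                                  // plain check mark
let emojiCheck: Character = "\u{2713}\u{FE0F}"                         // check mark + VS16
let flag: Character = "\u{1F1FA}\u{1F1F8}"                             // two regional indicators

print(family.sketchDisplayWidth, textCheck.sketchDisplayWidth,
      emojiCheck.sketchDisplayWidth, flag.sketchDisplayWidth)
```

Because Swift already groups ZWJ sequences, variation selectors, and regional-indicator pairs into a single `Character`, the per-cluster check sidesteps the per-scalar summing that trips up wcwidth. Terminals still differ in how they render these sequences, so the fixed width of 2 is an approximation.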

@pepicrft pepicrft left a comment

Left a minor comment.
