-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Render our own font stack #801
Conversation
Ligatures are affected by the context.textRendering property on Chrome. We should probably set this regardless of whether it fixes the problem, as leaving it on |
Huge step in the right direction! Well done. The name of Providence in Hindi is प्रोविडेंस. A couple issues here:
Similar issues appear in a variety of South and Southeast Asian scripts. Seems that complex text layout is not being handled. |
Can we determine whether the noted Brahmic script ligature issues are newly introduced or an upstream issue that we need to pursue in other libraries? Perhaps something along the lines of shieldtest could be generated for font scripts. |
This is a longstanding limitation of GL JS. The homegrown text layout engine in GL JS doesn’t have any data or logic around text shaping: maplibre/maplibre-native#778 mapbox/mapbox-gl-native#7774 mapbox/mapbox-gl-native#7528. To the extent the OHM fontstack appears to support Indic text shaping, it’s because of a hack that relies on negative glyph metrics. This is commonplace for Khmer, but for Indic, I think it might explain why the glyphs look like 8-bit and have haphazard spacing. But even it doesn’t address the specific points in #801 (comment). |
Where did you find that? The Wikidata entry for Providence spells it a different way, प्रोविडेंस. |
I took that screen grab off the Sanskrit Wikipedia page for Providence: |
I'll pull the string direct from the database so we're doing a proper comparison. |
I think it depends on how they decided to transliterate the word.. both of them sound correct phonetically. Then again, I am not a native Hindi speaker. |
f985ece
to
26b1c44
Compare
I think you just sold me on trying chatGPT... that almost looks correct. "canvas.createCanvas" was the major glaring error I can see. |
A question for @1ec5 , would sticking to tinysdf( or equivalent ) on both server and client lead to better mixing of CJK and non CJK fonts? Which approach has more/correct information needed for label placement |
In principle, both are capable of font metrics for non-complex text. TinySDF can even handle variable-width (non-CJK) glyphs as of mapbox/tiny-sdf#24, but there are still some limitations, for example mapbox/tiny-sdf#34. Mixing CJK and non-CJK is mostly a nonissue, apart from getting the baselines to match. But I’m not sure the status quo does that well, and anyways a good style would increase the text size of any CJK for legibility. |
Don't get excited, supervising an AI to write code to the point of it working is actually more work than you might imagine 😆 |
With the change in d611a08, this should solve the cross-platform issues. From a strategic perspective, pbf font stacks are a "temporary" technology until maplibre-gl can support a more robust mechanism that can properly support ligatures etc. This is a high priority for the maplibre project: maplibre/maplibre#193 Despte the bounty, it may be some time before a change of this complexity is implemented. In the meantime, we can at least have our broken ligatured scripts at least have a nice font 😬 and work under the expectation that in the future we'll be able to directly link web fonts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, my concern about build times was limited to the overhead of downloading the crates.io index (for a single Rust dependency) and downloading the individual Noto fonts from scratch on each build. It wasn’t specifically about the node-fontnik pipeline; whatever works works, at least until GL JS has a better story around fonts.
I think we’ve solved the crates.io overhead. As for the Noto overhead, it’s only when running the build
script, but not when running the start
script. My suggestion below addresses that issue too.
I added some suggestions of Noto fonts that are analogous to Noto Sans Italic that we could use as a quick fix. However, Google Fonts doesn’t provide analogues in each of the writing systems we’re trying to support. We’d either need to swap in a font from a different source or synthesize an oblique font.
"Gothic A1": ["regular", "700"], | ||
"M PLUS Rounded 1c": ["regular", "700"], | ||
"Noto Sans": ["regular", "700", "italic", "700italic"], | ||
"Noto Sans Arabic": ["regular", "700"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider Noto Naskh Arabic as an analogue to Noto Sans Italic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added this font to the download, but whenever I try to use it, the script errors out with a duplicate code point error. I've verified that I don't have any overlapping ranges so I'm not sure at this point how to move it forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code point range at issue is within the [1536, 1791]
range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't even run the Naskh Arabic font through the pipeline by itself. There seems to be something fundamentally wrong with this font that breaks this workflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully we aren’t running into that classic Mapbox assumption that there’s a one-to-one correspondence between Unicode codepoints and glyph indices. We could compare this font against Noto Sans Arabic in a font editor and spot-check whether there are any discrepancies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does appear to be the case, though in this instance, the issue is manifesting in the fonteditor-core
library.
"bundle-font-stacks": { | ||
"Noto Sans HK": ["regular", "700"], | ||
"Noto Sans JP": ["regular", "700"], | ||
"Noto Sans KR": ["regular", "700"], | ||
"Noto Sans SC": ["regular", "700"], | ||
"Noto Sans TC": ["regular", "700"] | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m pretty sure the user will never see these fonts, because GL JS is configured to use the browser’s default “sans-serif” CJK font by default: #613.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I included these in order to allow a developer working on Han Unification to have multiple CJK fonts already rendered to work with. They are not bundled into any of the Americana-* fontstacks, they're each rendered out to their own fontstack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, these are by far the largest fonts that would be downloaded when installing the package, but I guess we can live with that as long as it only happens once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A solution regarding Han Unification should probably retain the current client-side font solution rather than switching to server-side fontstacks, which carry significant performance penalties. (Aside from wasteful bandwidth usage, it’s very easy to hog memory and overflow buffers just by looking at East Asia when using server-side CJK fontstacks.)
The client-side font infrastructure is already mostly ready for deunified CJK fonts, except that GL JS only maintains a single instance of TinySDF that it initializes as soon as it needs to render any glyphs, and TinySDF only applies fonts to the canvas once at initialization. I suspect that addressing either limitation would be trivial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's not useful, I can drop these fonts.
I've added the additional scripts that were requested in the code reviews (this necessitated a planet rebuild with new languages added). I believe this satisfies all issues with the exception of whether to also include Han Unification fonts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Piggybacking on @jleedev’s original approval, I think we’ve since made good strides in the build system aspects of this PR, as well as the internationalization aspects. It’s good enough to ship, but let’s ticket out the various larger issues that were raised during review. For example, our support for Indic languages is not as embarrassing as before but still embarrassing – hopefully it’s enough to show promise though!
|
||
In order to add fontstack support, modify the `fonts.json` file as follows: | ||
|
||
1. Add font-family and variant information to the `font-families` section. The font-family is the name of the font as listed on Google Fonts, e.g. "Noto Sans". Use `gfi download "<name of font>"` to get a full list of the variants. The variant is everything to the right of the dash in the filename, so if a file is named `NotoSans-700.ttf`, the variant is `700`, though it will be listed in Google as something like "Bold 700". The `gfi` command requires you to install the `google-font-installer` package into npm with `npm install -g google-font-installer`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it should go here or in CONTRIBUTING.md, but we could document some editorial guidelines to aid in choosing fonts in non-Western writing systems. For example, once we get around to implementing custom CJK fonts, either on the server side via a fontstack or (preferably) on the client side via standard Web fonts, we’d want to pair Western regular sans-serif with gothic and Western regular serif with Ming, and Western italic sans-serif with regular, based on what Western and CJK texts do in analogous situations.
That said, one of the outgrowths of this PR may end up being a separate project specifically about fonts, where we’d naturally explore and elaborate on these ideas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've captured three tail-work issues below - are there any that I might have missed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracking Unicode ligatures and combining characters in #827
This PR adds the ability to generate our own fonts and customize which glyphs are serviced by which fonts. Supports 173 languages.
Note that: