Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Render our own font stack #801

Merged
merged 30 commits into from
Mar 20, 2023
Merged

Render our own font stack #801

merged 30 commits into from
Mar 20, 2023

Conversation

ZeLonewolf
Copy link
Member

@ZeLonewolf ZeLonewolf commented Feb 19, 2023

This PR adds the ability to generate our own fonts and customize which glyphs are serviced by which fonts. Supports 173 languages.

Note that:

  • CJK glyphs will fall back to system defaults
  • Han Unification:
  • Ligatures still don't seem to render how they should, but this is beyond the scope of this PR and may be an upstream issue.

image

image

ACKNOWLEDGEMENTS.md Outdated Show resolved Hide resolved
scripts/fonts.json Outdated Show resolved Hide resolved
src/layer/water.js Show resolved Hide resolved
scripts/fonts.json Show resolved Hide resolved
@ZeLonewolf ZeLonewolf marked this pull request as ready for review February 19, 2023 12:49
@quincylvania
Copy link
Contributor

Ligatures are affected by the context.textRendering property on Chrome. We should probably set this regardless of whether it fixes the problem, as leaving it on auto could lead to inconsistent results across browsers.

@claysmalley claysmalley linked an issue Feb 19, 2023 that may be closed by this pull request
CONTRIBUTING.md Outdated Show resolved Hide resolved
@claysmalley
Copy link
Member

Huge step in the right direction! Well done.

Screenshot from 2023-02-19 08-36-57

The name of Providence in Hindi is प्रोविडेंस. A couple issues here:

  • Ligatures are not forming. प्र (U+092A, U+094D, U+0930) is showing up in its non-combined form, प्‌र.
  • Combining characters are appearing in the wrong visual order. व (U+0935) plus the vowel "i" forms वि (U+0935, U+093F). The combining vowel U+093F is joining with the following character instead.

Similar issues appear in a variety of South and Southeast Asian scripts. Seems that complex text layout is not being handled.

@ZeLonewolf
Copy link
Member Author

Can we determine whether the noted Brahmic script ligature issues are newly introduced or an upstream issue that we need to pursue in other libraries?

Perhaps something along the lines of shieldtest could be generated for font scripts.

@ZeLonewolf
Copy link
Member Author

Looks like newly introduced:
Screenshot_20230219_121734_Chrome

@1ec5
Copy link
Member

1ec5 commented Feb 19, 2023

Can we determine whether the noted Brahmic script ligature issues are newly introduced or an upstream issue that we need to pursue in other libraries?

This is a longstanding limitation of GL JS. The homegrown text layout engine in GL JS doesn’t have any data or logic around text shaping: maplibre/maplibre-native#778 mapbox/mapbox-gl-native#7774 mapbox/mapbox-gl-native#7528.

To the extent the OHM fontstack appears to support Indic text shaping, it’s because of a hack that relies on negative glyph metrics. This is commonplace for Khmer, but for Indic, I think it might explain why the glyphs look like 8-bit and have haphazard spacing. But even it doesn’t address the specific points in #801 (comment).

@ZeLonewolf
Copy link
Member Author

Okay, so the expected script is

image

Therefore the old way is still wrong.

@claysmalley
Copy link
Member

Where did you find that? The Wikidata entry for Providence spells it a different way, प्रोविडेंस.

@ZeLonewolf
Copy link
Member Author

@ZeLonewolf
Copy link
Member Author

I'll pull the string direct from the database so we're doing a proper comparison.

@ZeLonewolf
Copy link
Member Author

Database:
image

Noto Sans Devanagari:
image

Fontspace.com unicode inspector:
image

@ramSeraph
Copy link

Where did you find that? The Wikidata entry for Providence spells it a different way, प्रोविडेंस.

I think it depends on how they decided to transliterate the word.. both of them sound correct phonetically. Then again, I am not a native Hindi speaker.

@ZeLonewolf
Copy link
Member Author

CONTRIBUTING.md Outdated Show resolved Hide resolved
@ZeLonewolf ZeLonewolf requested a review from 1ec5 February 27, 2023 21:25
@ramSeraph
Copy link

Sample code below courtesy ChatGPT.

Is this what you had in mind?

const canvas = require('canvas');
const tinySDF = require('tiny-sdf');
const fontnik = require('fontnik');
const fs = require('fs');

// Load the TTF font file using node-canvas
const font = new canvas.Font('MyFont', './MyFont.ttf');

// Set up the tiny-sdf options
const options = {
  size: 24, // size of each glyph
  buffer: 3, // amount of padding around each glyph
  radius: 8, // radius of the SDF
  cutoff: 0.25, // cutoff value for the SDF
  fontFamily: 'MyFont', // name of the font family
};

// Generate signed distance field glyphs for the font using tiny-sdf
const glyphs = tinySDF(options);

// Create a new canvas using node-canvas
const canvas = canvas.createCanvas(options.size, options.size);

// Loop through each glyph and draw it onto the canvas
for (const glyph of glyphs) {
  // Clear the canvas
  canvas.clearRect(0, 0, options.size, options.size);

  // Set the font and fill style
  canvas.font = `${options.size}px ${options.fontFamily}`;
  canvas.fillStyle = '#fff';

  // Draw the glyph onto the canvas
  canvas.fillText(glyph.char, glyph.x, glyph.y);

  // Export the canvas as a PNG image
  const buffer = canvas.toBuffer();

  // Convert the PNG image to a .pbf fontstack using fontnik
  const fontstack = fontnik.fromPng(buffer);

  // Save the .pbf fontstack file to disk
  fs.writeFileSync(`${glyph.char}.pbf`, fontstack);
}

I think you just sold me on trying chatGPT... that almost looks correct. "canvas.createCanvas" was the major glaring error I can see.

@ramSeraph
Copy link

ramSeraph commented Mar 15, 2023

A question for @1ec5 , would sticking to tinysdf( or equivalent ) on both server and client lead to better mixing of CJK and non CJK fonts? Which approach has more/correct information needed for label placement tinysdf or node-fontnik?

@jleedev jleedev self-requested a review March 15, 2023 22:34
@jleedev jleedev dismissed their stale review March 15, 2023 22:34

Consensus seems against merging this

@1ec5
Copy link
Member

1ec5 commented Mar 15, 2023

would sticking to tinysdf( or equivalent ) on both server and client lead to better mixing of CJK and non CJK fonts? Which approach has more/correct information needed for label placement tinysdf or node-fontnik?

In principle, both are capable of font metrics for non-complex text. TinySDF can even handle variable-width (non-CJK) glyphs as of mapbox/tiny-sdf#24, but there are still some limitations, for example mapbox/tiny-sdf#34. Mixing CJK and non-CJK is mostly a nonissue, apart from getting the baselines to match. But I’m not sure the status quo does that well, and anyways a good style would increase the text size of any CJK for legibility.

@ZeLonewolf
Copy link
Member Author

I think you just sold me on trying chatGPT... that almost looks correct.

Don't get excited, supervising an AI to write code to the point of it working is actually more work than you might imagine 😆

@ZeLonewolf
Copy link
Member Author

ZeLonewolf commented Mar 16, 2023

With the change in d611a08, this should solve the cross-platform issues.

From a strategic perspective, pbf font stacks are a "temporary" technology until maplibre-gl can support a more robust mechanism that can properly support ligatures etc. This is a high priority for the maplibre project: maplibre/maplibre#193

Despte the bounty, it may be some time before a change of this complexity is implemented. In the meantime, we can at least have our broken ligatured scripts at least have a nice font 😬 and work under the expectation that in the future we'll be able to directly link web fonts.

Copy link
Member

@1ec5 1ec5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, my concern about build times was limited to the overhead of downloading the crates.io index (for a single Rust dependency) and downloading the individual Noto fonts from scratch on each build. It wasn’t specifically about the node-fontnik pipeline; whatever works works, at least until GL JS has a better story around fonts.

I think we’ve solved the crates.io overhead. As for the Noto overhead, it’s only when running the build script, but not when running the start script. My suggestion below addresses that issue too.

I added some suggestions of Noto fonts that are analogous to Noto Sans Italic that we could use as a quick fix. However, Google Fonts doesn’t provide analogues in each of the writing systems we’re trying to support. We’d either need to swap in a font from a different source or synthesize an oblique font.

scripts/fonts.json Show resolved Hide resolved
package.json Outdated Show resolved Hide resolved
scripts/fonts.json Show resolved Hide resolved
"Gothic A1": ["regular", "700"],
"M PLUS Rounded 1c": ["regular", "700"],
"Noto Sans": ["regular", "700", "italic", "700italic"],
"Noto Sans Arabic": ["regular", "700"],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider Noto Naskh Arabic as an analogue to Noto Sans Italic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added this font to the download, but whenever I try to use it, the script errors out with a duplicate code point error. I've verified that I don't have any overlapping ranges so I'm not sure at this point how to move it forward.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code point range at issue is within the [1536, 1791] range.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't even run the Naskh Arabic font through the pipeline by itself. There seems to be something fundamentally wrong with this font that breaks this workflow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully we aren’t running into that classic Mapbox assumption that there’s a one-to-one correspondence between Unicode codepoints and glyph indices. We could compare this font against Noto Sans Arabic in a font editor and spot-check whether there are any discrepancies.

/ref maplibre/maplibre-native#778 (reply in thread)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does appear to be the case, though in this instance, the issue is manifesting in the fonteditor-core library.

Comment on lines +24 to +30
"bundle-font-stacks": {
"Noto Sans HK": ["regular", "700"],
"Noto Sans JP": ["regular", "700"],
"Noto Sans KR": ["regular", "700"],
"Noto Sans SC": ["regular", "700"],
"Noto Sans TC": ["regular", "700"]
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m pretty sure the user will never see these fonts, because GL JS is configured to use the browser’s default “sans-serif” CJK font by default: #613.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I included these in order to allow a developer working on Han Unification to have multiple CJK fonts already rendered to work with. They are not bundled into any of the Americana-* fontstacks, they're each rendered out to their own fontstack.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, these are by far the largest fonts that would be downloaded when installing the package, but I guess we can live with that as long as it only happens once.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A solution regarding Han Unification should probably retain the current client-side font solution rather than switching to server-side fontstacks, which carry significant performance penalties. (Aside from wasteful bandwidth usage, it’s very easy to hog memory and overflow buffers just by looking at East Asia when using server-side CJK fontstacks.)

The client-side font infrastructure is already mostly ready for deunified CJK fonts, except that GL JS only maintains a single instance of TinySDF that it initializes as soon as it needs to render any glyphs, and TinySDF only applies fonts to the canvas once at initialization. I suspect that addressing either limitation would be trivial.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not useful, I can drop these fonts.

@ZeLonewolf
Copy link
Member Author

I've added the additional scripts that were requested in the code reviews (this necessitated a planet rebuild with new languages added). I believe this satisfies all issues with the exception of whether to also include Han Unification fonts.

@ZeLonewolf ZeLonewolf requested a review from 1ec5 March 19, 2023 21:36
Copy link
Member

@1ec5 1ec5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Piggybacking on @jleedev’s original approval, I think we’ve since made good strides in the build system aspects of this PR, as well as the internationalization aspects. It’s good enough to ship, but let’s ticket out the various larger issues that were raised during review. For example, our support for Indic languages is not as embarrassing as before but still embarrassing – hopefully it’s enough to show promise though!


In order to add fontstack support, modify the `fonts.json` file as follows:

1. Add font-family and variant information to the `font-families` section. The font-family is the name of the font as listed on Google Fonts, e.g. "Noto Sans". Use `gfi download "<name of font>"` to get a full list of the variants. The variant is everything to the right of the dash in the filename, so if a file is named `NotoSans-700.ttf`, the variant is `700`, though it will be listed in Google as something like "Bold 700". The `gfi` command requires you to install the `google-font-installer` package into npm with `npm install -g google-font-installer`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it should go here or in CONTRIBUTING.md, but we could document some editorial guidelines to aid in choosing fonts in non-Western writing systems. For example, once we get around to implementing custom CJK fonts, either on the server side via a fontstack or (preferably) on the client side via standard Web fonts, we’d want to pair Western regular sans-serif with gothic and Western regular serif with Ming, and Western italic sans-serif with regular, based on what Western and CJK texts do in analogous situations.

That said, one of the outgrowths of this PR may end up being a separate project specifically about fonts, where we’d naturally explore and elaborate on these ideas.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've captured three tail-work issues below - are there any that I might have missed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracking Unicode ligatures and combining characters in #827

src/layer/water.js Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OpenHistoricalMap font stack doesn't support comma-separated lists
6 participants