C/J vertical metrics requirements #8911

Open
aaronbell opened this issue Jan 14, 2025 · 17 comments

@aaronbell
Collaborator

aaronbell commented Jan 14, 2025

Over the years, Google has established clear guidance for setting font vertical metrics to ensure consistent rendering across a variety of environments (with web as a particular focus).

As a short summary, this guidance makes use of two sets of vertical metric values. sTypo (sTypoAscender / sTypoDescender / sTypoLineGap) establishes the typographic metrics (which are set for an ideal line height for reading), and usWin (usWinAscent / usWinDescent) establishes the clipping metrics (MS Word will not display any glyph paint that extends beyond them). The USE_TYPO_METRICS flag is set to ensure that each is used for its intended purpose. Because that flag is set, the hhea metrics follow the sTypo values for consistency on Mac. If you need further explanation of these metrics and how to set them, I suggest reading the spec at the link above.

Interestingly, whereas the metrics requirements for other scripts are responsive to the design of the typeface, in the case of CJK they are locked to specific values based on the metrics established by Ken Lunde for the Source Han Sans / Noto CJK project:

| Attrib | Value | Example using 1000upm font |
| --- | --- | --- |
| OS/2.sTypoAscender | 0.88 * font upm | 880 |
| OS/2.sTypoDescender | -0.12 * font upm | -120 |
| OS/2.sTypoLineGap | 0 | 0 |
| hhea.ascender | Set to look comfortable (~1.16 * upm) | 1160 |
| hhea.descender | Set to look comfortable (~0.288 * upm) | -288 |
| hhea.lineGap | 0 | 0 |
| OS/2.usWinAscent | Same as hhea.ascender | 1160 |
| OS/2.usWinDescent | abs(value) of hhea.descender | 288 |
| OS/2.fsSelection bit 7 (Use_Typo_Metrics) | Do not set / disabled | 0 |

In this case, since the USE_TYPO_METRICS flag is disabled, the usWin metrics act as both typographic vertical metrics and also clipping metrics.
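As a rough illustration of how the fixed values in the table above scale with the em size, here is a minimal Python sketch. The function name and the dictionary keys are made up for this example; only the multipliers come from the table.

```python
def current_gf_cjk_metrics(upm: int) -> dict:
    """Scale the fixed CJK vertical metrics from the table to a given UPM."""
    hhea_ascender = round(1.16 * upm)
    hhea_descender = -round(0.288 * upm)
    return {
        "sTypoAscender": round(0.88 * upm),
        "sTypoDescender": -round(0.12 * upm),
        "sTypoLineGap": 0,
        "hheaAscender": hhea_ascender,
        "hheaDescender": hhea_descender,
        "hheaLineGap": 0,
        # usWin mirrors hhea; usWinDescent is stored as a positive number.
        "usWinAscent": hhea_ascender,
        "usWinDescent": abs(hhea_descender),
        "useTypoMetrics": False,
    }


metrics = current_gf_cjk_metrics(1000)
print(metrics["sTypoAscender"], metrics["sTypoDescender"])  # 880 -120
# With USE_TYPO_METRICS off, this sum is the default line height.
print(metrics["usWinAscent"] + metrics["usWinDescent"])  # 1448
```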

The setting for the sTypo values is of particular note, and comes from the MS OpenType Spec requirements for the sTypo values of CJK fonts.

For CJK (Chinese, Japanese, and Korean) fonts that are intended to be used for vertical (as well as horizontal) layout, the required value for sTypoAscender is that which describes the top of the ideographic em-box. For example, if the ideographic em-box of the font extends from coordinates 0,-120 to 1000,880 (that is, a 1000 × 1000 box set 120 design units below the Latin baseline), then the value of sTypoAscender must be set to 880. Failing to adhere to these requirements will result in incorrect vertical layout.

For CJK (Chinese, Japanese, and Korean) fonts that are intended to be used for vertical (as well as horizontal) layout, the required value for sTypoDescender is that which describes the bottom of the ideographic em-box. For example, if the ideographic em-box of the font extends from coordinates 0,-120 to 1000,880 (that is, a 1000 × 1000 box set 120 design units below the Latin baseline), then the value of sTypoDescender must be set to -120. Failing to adhere to these requirements will result in incorrect vertical layout.

There are several problems with this approach.

  1. It is specific to Source Han Sans. As NightFurySL2001 points out, the positioning and ratios of the ideographic-em-box in relation to Latin glyphs can differ from font to font and foundry to foundry. So the ratio used for establishing the position of the box should have flexibility to adjust: 0.88:0.12, 0.85:0.15, 0.80:0.20. As long as it still sums to the ideographic-em-box, then any specific ratio should be fine.
  2. usWin metrics are not reflective of what is in the font. Like (1), the design of a given font can differ quite significantly, so by setting the usWin metrics to specific values instead of being responsive to the design, the metric might be too small to avoid clipping, or excessively large for the design.
  3. Because USE_TYPO_METRICS is disabled, the usWin metrics are used for default line spacing when the font is displayed in the browser and in applications like MS Word. Since usWin metrics are primarily clipping metrics, in most cases the line gap will likely be too big for a given design. There are even complaints about this for Source Han Sans / Noto Sans CJK.
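To make point 1 concrete, the sketch below derives the em-box edges from an ascent share. The helper name `embox_from_ratio` is hypothetical; the three ratios are the ones listed above. Whatever ratio is chosen, top minus bottom stays equal to the em size.

```python
def embox_from_ratio(upm: int, ascent_share: float) -> tuple[int, int]:
    """Return (top, bottom) of an ideographic em-box for a given ascent share."""
    top = round(upm * ascent_share)
    bottom = top - upm  # the box is always exactly one em tall
    return top, bottom


for share in (0.88, 0.85, 0.80):
    top, bottom = embox_from_ratio(1000, share)
    assert top - bottom == 1000
    print(share, top, bottom)
```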

The BASE table
Stepping back from the guidance in the MS OT Spec, Ken Lunde, and Google’s specification, these requirements raise some key questions:

  • If USE_TYPO_METRICS is disabled, why does setting sTypo metrics matter?
  • Why can’t we use sTypo/USE_TYPO_METRICS in the same way as in non-CJK fonts to establish ideal default line spacing?

On this subject the OT Spec directs readers to Baseline tags and to the BASE table. The BASE table is part of the OpenType code of a font that provides information regarding the ideographic-em-box. Below is a sample of the ideographic-em-box and the corresponding BASE table.

Image
table BASE {
  HorizAxis.BaseTagList     icfb  icft  ideo  romn;
  HorizAxis.BaseScriptList  
                DFLT 	ideo  	-75   835  -120     0,
                hani  	ideo   	-75   835  -120     0,
                kana  	ideo   	-75   835  -120     0,
                latn  	romn   	-75   835  -120     0,
                cyrl  	romn   	-75   835  -120     0,
                grek  	romn   	-75   835  -120     0;

  VertAxis.BaseTagList      icfb  icft  ideo  romn;
  VertAxis.BaseScriptList 
                DFLT  	ideo    	45   955     0   120,
                hani  	ideo    	45   955     0   120,
                kana  	ideo    	45   955     0   120,
                latn  	romn    	45   955     0   120,
                cyrl  	romn    	45   955     0   120,
                grek  	romn    	45   955     0   120;
} BASE;

The BASE tags above are:

  • icfb: Ideographic character face bottom edge (in HorizAxis) / left edge (in VertAxis)
  • icft: Ideographic character face top edge (in HorizAxis) / right edge (in VertAxis)
  • ideo: ideographic em-box bottom edge (in HorizAxis) / left edge (in VertAxis)
  • romn: Latin baseline. Usually 0 in HorizAxis; in VertAxis, the negation of the HorizAxis ideo value (120 in the sample above)

These BASE tags provide language-specific metrics data that may be used for typesetting purposes, such as to enable better cross-script alignment. However, in cases where a font does not include a BASE table and an application needs to define the ideographic-em-box for rendering purposes, there is specific logic laid out in the OT spec wherein the sTypo metrics are used as a fallback:

ideoEmboxLeft = 0
If HorizAxis.ideo is defined:
	ideoEmboxBottom = HorizAxis.ideo
	If HorizAxis.idtp is defined:
		ideoEmboxTop = HorizAxis.idtp
	Else:
		ideoEmboxTop = HorizAxis.ideo + head.unitsPerEm
	If VertAxis.idtp is defined:
		ideoEmboxRight = VertAxis.idtp
	Else:
		ideoEmboxRight = head.unitsPerEm
	If VertAxis.ideo is defined and is non-zero:
		Warning: Bad VertAxis.ideo value
Else If this is a CJK font:
	ideoEmboxBottom = OS/2.sTypoDescender
	ideoEmboxTop = OS/2.sTypoAscender
	ideoEmboxRight = head.unitsPerEm
Else:
	ideoEmbox cannot be determined for this font
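The fallback logic can be sketched in Python as follows. This is a hedged reading of the pseudocode, treating the "CJK font" and "cannot be determined" branches as alternatives to the first `HorizAxis.ideo` test. The function name and parameters are invented for the example; a real tool would read the values from the BASE, OS/2 and head tables.

```python
import warnings


def ideo_embox(units_per_em, s_typo_ascender, s_typo_descender, is_cjk,
               horiz_ideo=None, horiz_idtp=None, vert_idtp=None, vert_ideo=None):
    """Return (left, bottom, right, top) of the ideographic em-box, or None.

    BASE values that are absent from the font are passed as None.
    """
    left = 0
    if horiz_ideo is not None:
        bottom = horiz_ideo
        # Without an explicit top, assume the box is one em tall.
        top = horiz_idtp if horiz_idtp is not None else horiz_ideo + units_per_em
        right = vert_idtp if vert_idtp is not None else units_per_em
        if vert_ideo:  # defined and non-zero
            warnings.warn("Bad VertAxis.ideo value")
    elif is_cjk:
        # No BASE data: fall back to the sTypo metrics.
        bottom = s_typo_descender
        top = s_typo_ascender
        right = units_per_em
    else:
        return None  # em-box cannot be determined for this font
    return left, bottom, right, top


# BASE present: sTypo is ignored.
print(ideo_embox(1000, 942, -238, True, horiz_ideo=-148))  # (0, -148, 1000, 852)
# BASE absent: sTypo defines the box.
print(ideo_embox(1000, 850, -150, True))  # (0, -150, 1000, 850)
```

The second call shows why fonts that ship without a BASE table depend on sTypo matching the em-box.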

Because of this fallback logic, the OT Spec recommends that the sTypo and hhea metrics align with the BASE table to ensure consistency:

CJK fonts generally should have the same descender value recorded in hhea.descender, OS/2.sTypoDescender, and HorizAxis.ideo (if present) fields, and the same ascender value recorded in hhea.ascender, OS/2.sTypoAscender, and HorizAxis.idtp (if present) fields.
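A small checker for that recommendation might look like the sketch below. The function name and argument layout are assumptions for illustration; the inputs are plain numbers rather than values read from a font file.

```python
def alignment_problems(hhea, typo, base_ideo=None, base_idtp=None):
    """Compare (ascender, descender) pairs with each other and with BASE values.

    Returns a list of mismatch descriptions; an empty list means aligned.
    """
    problems = []
    hhea_asc, hhea_desc = hhea
    typo_asc, typo_desc = typo
    if hhea_asc != typo_asc:
        problems.append(f"ascender: hhea {hhea_asc} != sTypo {typo_asc}")
    if hhea_desc != typo_desc:
        problems.append(f"descender: hhea {hhea_desc} != sTypo {typo_desc}")
    if base_ideo is not None and base_ideo != typo_desc:
        problems.append(f"descender: BASE ideo {base_ideo} != sTypo {typo_desc}")
    if base_idtp is not None and base_idtp != typo_asc:
        problems.append(f"ascender: BASE idtp {base_idtp} != sTypo {typo_asc}")
    return problems


# The current GF values (hhea 1160/-288, sTypo 880/-120) do not satisfy it.
for line in alignment_problems((1160, -288), (880, -120), base_ideo=-120):
    print(line)
```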

A new direction forward for C/J vertical metrics
This appropriation of the sTypo metrics leaves us in a quandary. In order to ensure compatibility with “some” applications and legacy environments, it is required to keep the sTypo aligned with the ideographic em-box. However, doing so removes our capability to set the vertical metrics apart from the clipping metrics.

Interestingly, the OT spec seems to predict this predicament:

The OS/2.sTypoDescender and OS/2.sTypoAscender fields in a CJK font may specify metrics different from the HorizAxis.ideo and HorizAxis.idtp values in the BASE table.

To that end, I would like to propose a new approach for C/J fonts. For the sTypo, we use the ideographic em-box as a reference guide, and scale it up proportionally, similar to how Latin treats the ascender / descender values. hhea will follow the sTypo, and usWin will continue to align with yMin and yMax. Finally, an accurately-set BASE table will be required to enable applications that need ideographic information to position the glyphs correctly.

Taking LXGW WenKai TC as an example:

| Attrib | Value | Example using WenKai |
| --- | --- | --- |
| OS/2.sTypoAscender | ideoEmboxTop + (15–20% * emBox)/2 | 852 + (0.18 * 1000)/2 = 942 |
| OS/2.sTypoDescender | ideoEmboxBottom - (15–20% * emBox)/2 | -148 - (0.18 * 1000)/2 = -238 |
| OS/2.sTypoLineGap | 0 | 0 |
| hhea.ascender | OS/2.sTypoAscender | 942 |
| hhea.descender | OS/2.sTypoDescender | -238 |
| hhea.lineGap | 0 | 0 |
| OS/2.usWinAscent | abs(value) of yMax | 1102 |
| OS/2.usWinDescent | abs(value) of yMin | 285 |
| OS/2.fsSelection bit 7 (Use_Typo_Metrics) | Enabled | |
| OT BASE table | Required | |

(note – in this case, I used an 18% increase on the sTypo metrics to follow the suggestion of the original designer)
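The sTypo arithmetic in the table can be reproduced with a few lines of Python. The function name is hypothetical; the formula and the WenKai inputs (em-box from -148 to 852, 18% extra) are the ones given above.

```python
def proposed_cj_typo_metrics(embox_top, embox_bottom, extra=0.18):
    """Grow the ideographic em-box by `extra` of its height, split evenly."""
    em = embox_top - embox_bottom
    pad = round(extra * em / 2)  # half of the extra goes above, half below
    return embox_top + pad, embox_bottom - pad


ascender, descender = proposed_cj_typo_metrics(852, -148)
print(ascender, descender)  # 942 -238
```

Because the padding is symmetric, the ideographs stay vertically centered in the resulting line box.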

Image
table BASE {
  HorizAxis.BaseTagList     icfb  icft  ideo  romn;
  HorizAxis.BaseScriptList  
                DFLT 	ideo  	-96   800  -148     0,
                hani  	ideo   	-96   800  -148     0,
                kana  	ideo   	-96   800  -148     0,
                latn  	romn   	-96   800  -148     0,
                cyrl  	romn   	-96   800  -148     0,
                grek  	romn   	-96   800  -148     0;

  VertAxis.BaseTagList      icfb  icft  ideo  romn;
  VertAxis.BaseScriptList 
                DFLT  	ideo    	28   974     0   148,
                hani  	ideo    	28   974     0   148,
                kana  	ideo    	28   974     0   148,
                latn  	romn   	28   974     0   148,
                cyrl  	romn    	28   974     0   148,
                grek  	romn    	28   974     0   148;
} BASE;

The result of this approach, in MS Word for Mac:

Image

As you can see, this change has produced significantly less space between the lines as the overall line height (ascender+descender) has reduced from 1317 to 1180. It helps the text hold together better, and read more comfortably than previously.
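For reference, the line-height figures work out as sketched below. The helper name is made up; 1317 is the previous total quoted above and 942/-238 are the proposed values.

```python
def default_line_height(ascender, descender, line_gap=0):
    """Default line height in font units; descender is a negative number."""
    return ascender - descender + line_gap


new_height = default_line_height(942, -238)
old_height = 1317  # previous total as quoted in the text
saved = old_height - new_height
print(new_height, saved, round(100 * saved / old_height, 1))  # 1180 137 10.4
```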

Open Questions

  • It has been reported by NightFury that in cases where the vhea/vmtx tables are not present, the sTypo values may be used as a fallback when setting text vertically. It would be worth investigating whether this use is widespread; if so, the addition of these vertical-specific tables should be recommended for C/J fonts, and required for any that are intended for vertical use.
  • This document specifically discusses C/J fonts, but not Korean as Hangeul can / should be treated differently than ideographic-heavy scripts. It will be covered by a separate document.

Risks

  1. Applications that use the sTypo metrics for ideographic-based positioning will find that fonts employing the above method are no longer positioned as expected. I believe this is primarily a legacy issue, but it is worth noting.
  2. If the proposal is applied to all C/J fonts across the library, then there will be backwards compatibility problems. One mechanism to mitigate this risk is to only apply it moving forwards.

Priority
This is a high priority issue that needs resolution as it is currently preventing immediate onboarding for many Traditional Chinese fonts:

And will prevent onboarding of these upcoming Chinese projects as well

Additionally, we have two upcoming Japanese and three Korean projects which will also be impacted.

@davelab6
Member

@tiroj you might be interested in this :)

@celestialphineas

celestialphineas commented Jan 15, 2025

Regarding Risk 1, I would like to mention that:

The majority of SC/TC vendors never write a BASE table. Taking advantage of OS/2.sTypoDescender as the bottom line of the ideographic em-box is a very recent practice in the industry, and is now well-received by SC/TC vendors. I believe this risk is a real threat, and could break everything, impacting new solutions that handle typefaces from early-developed to more recent SC/TC products, as well as legacy code handling the proposed solution.

To my knowledge, the common practice today among foundries in mainland China is as follows:

  • No BASE table appears.
  • sTypo metrics are set for the ideographic em-box. Usually 850/-150, as this is the default value inferred by Adobe applications when a CJK typeface contains no BASE table. Fonts without a BASE table whose sTypo values are not set to 850/-150 will not fit the layout grid.
  • The Latin baseline does not necessarily lie on the y=0 baseline. Instead, Latin letters may float around the baseline to match the Han characters visually, rather than being positioned as technicians usually expect. In other words, the real Latin baseline information is usually lost.

@rutopio

rutopio commented Jan 15, 2025

Our usual approach:

  • No BASE table
  • Use sTypo and hhea.ascender/descender to define the metrics.
  • Use 880/120 ratio (not retroactively applied to older designs or fonts modified from other open-source fonts).
  • Set sTypoLinegap = hheaLinegap = 0.
  • Set usWinDescent = 150.
    • Based on some tests we conducted before, if usWinDescent exceeds 150, some applications on Windows platform (such as Office 365) will default to 200% line height.
    • Since Windows and Office are frequently updated, I’m not sure if this issue has been resolved. Perhaps more user feedback is needed.
  • The baseline of Latin letters does not necessarily align with y = 0. Instead, to match the visual balance of CJK and Latin characters, and ensure compatibility with other fonts from our foundry for mixed typesetting, it is usually determined by the designers.

@NightFurySL2001
Contributor

NightFurySL2001 commented Jan 15, 2025

  1. From @ButTaiwan, Japanese fonts use 880/-120 for the sTypo values across all fonts to ensure em-box position when changing fonts inline; the Latin baseline is not fixed, similar to what @celestialphineas mentioned for Chinese fonts. It can further be deduced that it is common industry practice in C/J to use the em-box for the sTypo values.
  2. Currently only Adobe seems to be using the BASE table at all. In the case where BASE is missing, it seems either the sTypo values or a fixed 850/-150 value (I received conflicting information) is used in InDesign instead for em-box alignment.
  3. Even if historic software is ignored, there are still issues with the proposed approach. The proposed setting will almost always be harder to use than regular values, and clipping issues might occur in normal text. More is explained below.

usWin metrics are not reflective of what is in the font. Like (1), the design of a given font can differ quite significantly, so by setting the usWin metrics to specific values instead of being responsive to the design, the metric might be too small to avoid clipping, or excessively large for the design.

The original intent for the usWin setting for this approach is actually mentioned in my proposal to update the metrics in googlefonts/googlefonts.github.io#135 and not what is listed in the current table, which is to take the highest point excluding extra-tall characters for vertical typesetting and harmonised across weight. The proposed setting is to just take the yMin/yMax for usWin, which for newer C/J fonts will almost always mean stuck at around 1300/-600 due to the presence of either U+3031 〱/ U+3032 〲/ vertical glyph for U+2E3A ——. These glyphs should not be considered when calculating the horizontal clipping metrics for usWin.
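One way to picture the exclusion described here is the sketch below, which derives usWin from glyph bounds while skipping vertical-only glyphs. The glyph names and all bounding-box numbers are hypothetical sample data, not measurements from any font.

```python
# Hypothetical glyph names for U+3031, U+3032 and the vertical form of U+2E3A.
VERTICAL_ONLY = {"uni3031", "uni3032", "uni2E3A.vert"}


def horizontal_win_metrics(bounds, excluded=VERTICAL_ONLY):
    """Return (usWinAscent, usWinDescent) from {glyph: (yMin, yMax)}.

    Glyphs named in `excluded` are ignored; both results are non-negative.
    """
    kept = [box for name, box in bounds.items() if name not in excluded]
    y_min = min(box[0] for box in kept)
    y_max = max(box[1] for box in kept)
    return max(y_max, 0), max(-y_min, 0)


sample = {
    "uni6C38": (-80, 830),          # an ordinary ideograph
    "Udieresisgrave": (-10, 1102),  # stacked diacritic
    "g": (-285, 500),
    "uni3031": (-600, 1300),        # extra-tall vertical-only glyph
}
print(horizontal_win_metrics(sample))              # (1102, 285)
print(horizontal_win_metrics(sample, excluded=()))  # (1300, 600)
```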

For the sTypo, we use the ideographic em-box as a reference guide, and scale it up proportionally, similar to how Latin treats the ascender / descender values.

Although this is a viable solution, the Latin vertical metrics actually consider the possibility of extra-tall glyphs in the font:

  1. typo/hheaAscender value should leave open room for stacked diacritics.

Following recent tests to confirm our vertical metrics policies, we now require the (typo/hhea)Ascender to be equal to Abreveacute U+1EAE. For families with multiple weights, you must use the tallest Abreveacute U+1EAE (e.g., in the Black master) as the reference point to guarantee uniform positioning across the entire font family.
Quoted from https://googlefonts.github.io/gf-guide/metrics.html

However, the new proposal for C/J fonts here did not take into account the possibility of stacked diacritics in C/J fonts. This was actually considered in the original metrics, quoted from my proposal to update the metrics:

hhea values should be set to look harmonized (excluding extreme vertical height symbols only used in vertical typesetting such as U+3031 and U+3032), including Latin parts (such as hanyu pinyin Ǚ).

It was just mistakenly reworded to "Set to look comfortable" in the current requirements. The extra-tall metrics shown in the example for Source Han Sans are actually due to another rule that is present in the Latin metric requirements too:

  1. Vertical metrics must be consistent across a family.

Each font in a family must share the same vertical metrics values.
Quoted from https://googlefonts.github.io/gf-guide/metrics.html

In Source Han Sans' case, that is mostly due to stacked diacritics for hanyu pinyin (Ǜ) and Vietnamese (Ẫ) in Heavy weight. This should probably be considered in C/J fonts too.

@NightFurySL2001
Contributor

NightFurySL2001 commented Jan 15, 2025

TL;DR: The current metrics seem to go against the current C/J font industry convention, which will cause issues in the long run, especially incompatibility with commercial C/J fonts. A better vertical metrics wording/clarification update for C/J (instead of a full reworking) is proposed in googlefonts/googlefonts.github.io#135 .

@aaronbell
Collaborator Author

@celestialphineas @rutopio Thank you for sharing the approaches that you are familiar with. Interestingly, neither aligns with the existing GF requirements, only further serving to underscore the need to update these requirements :).

Out of interest, are either of you familiar with any apps (aside from Adobe) that make use of the sTypo metrics to decide vertical positioning? Aside from "it is convention", no one has been able to point me to specific examples. This drives me to think that it is worth having a conversation regarding if the convention continues to make sense. 😄

Based on some tests we conducted before, if usWinDescent exceeds 150, some applications on Windows platform (such as Office 365) will default to 200% line height.

Interesting. I will reach out to my contacts over in Office and find out how this is set.

@NightFurySL2001

The proposed setting is to just take the yMin/yMax for usWin, which for newer C/J fonts will almost always mean stuck at around 1300/-600 due to the presence of either U+3031 〱/ U+3032 〲/ vertical glyph for U+2E3A ——. These glyphs should not be considered when calculating the horizontal clipping metrics for usWin.

We can certainly provide further clarification on this point. The Latin guidelines use a specific glyph (Ắ) and state that the sTypo values should be set at least above that point, if not higher, and usWin clipping is at yMax (since sTypo will be determining line heights, the specific usWin value just needs to cover everything).

I'm fine with implementing something similar here with an appropriate corresponding glyph which we can discuss. My thought was to use the em-box as a guideline since the ideographs are prioritized in these projects.

However, the new proposal for C/J fonts here did not take into account the possibility of stacked diacritics in C/J fonts.

The method I suggested was attempting to optimize more for keeping the ideographs centered on the body—so it is based on the existing em-square rather than following the guidance for Latin, which is optimizing for ensuring that there will not be any vertical collisions with stacked diacritics. In the case of WenKai, that means the sTypoAscender value will fall about 90 units below the top of the accent on the top of Ǘ, which given that we're trying to compromise between two very different scripts, seems like a viable option. One will always need to take priority.

In Source Han Sans' case, that is mostly due to stacked diacritics for hanyu pinyin (Ǜ) and Vietnamese (Ẫ) in Heavy weight. This should probably be considered in C/J fonts too.

All the more reason to use sTypo metrics for defining the typographic line heights than usWin, which is beholden to extra space required by heavier weights.

A better vertical metrics wording/clarification update for C/J (instead of a full reworking) is proposed in googlefonts/googlefonts.github.io#135 .

If we divorce the setting of line height metrics from clipping metrics, then we can enable complete rendering of glyphs such as Ǘ even if they extend above the sTypo value as they will be within the clipping metrics of usWin. By using clipping metrics as line height metrics, in almost all cases we're going to end up with too-loose typesetting by default. I believe that users should be able to have more comfortable typesetting by default, and if they choose to adjust from there, that's up to them.

In re-reading these guidelines, I find it sort of interesting that the 'conventional' approach here doesn't even align with the spec recommendations—the hhea metric is supposed to be aligned with the sTypo metric in order to ensure consistent results across platforms (when BASE is not provided), but we all know that doing so (without setting USE_TYPO_METRICS) will actually result in inconsistencies. This is why I am looking at the standard approaches for using vertical metrics as defined in web browsers, and on Windows / Mac, as the guiding principle for setting these values.

@celestialphineas

celestialphineas commented Jan 16, 2025

From @ButTaiwan, Japanese fonts use 880/-120 for the sTypo values across all fonts to ensure em-box position when changing fonts inline; the Latin baseline is not fixed, similar to what @celestialphineas mentioned for Chinese fonts. It can further be deduced that it is common industry practice in C/J to use the em-box for the sTypo values.

The BASE table is almost always present among Japanese vendors. 880/-120 is commonly used for sTypo metrics, but it's more like a design decision. And, again, the BASE table must be included; otherwise the font could not be used properly in Adobe applications. Below is a list of vertical metrics from three different font products belonging to three different vendors. Given that all these three are text faces, it's legit that they share identical Latin baseline in the designs.

| ID | sTypo | hhea | usWin | BASE | Latin baseline at y=0 |
| --- | --- | --- | --- | --- | --- |
| 1 | 880/-120/0 | 1128/-349/0 | 1128/349 | -78/838/-120/0 | ☑️ |
| 2 | 880/-120/1000 | 880/-120/1000 | 1279/296 | -78/838/-120/0 | ☑️ |
| 3 | 880/-120/1000 | 880/-120/1000 | 1201/299 | -83/843/-120/0 | ☑️ |

Out of interest, are either of you familiar with any apps (aside from Adobe) that make use of the sTypo metrics to decide vertical positioning? Aside from "it is convention", no one has been able to point me to specific examples. This drives me to think that it is worth having a conversation regarding if the convention continues to make sense. 😄

We haven't found an application to be directly affected, as there are few applications that need to know the body em-box of a CJK font, which unfortunately means that many CJK typography requirements cannot be met. However, it does affect many scripts when we need to determine the body em-box, whether in Glyphs or in Python with PIL, since a lot of SC/TC fonts do not even have their vmtx or VORG set correctly. In such cases, sTypoDescender becomes the last chance to assume a correct body em-box.

@aaronbell
Collaborator Author

@celestialphineas Thank you for the further information! Fascinating that font 2 and 3 have hhea aligned with the sTypo values, which would actually result in inconsistent cross-platform rendering. And that many fonts do not have proper vertical tables present. It just goes to show that there is a great diversity in approaches applied by different foundries.

While I realize it would likely be irritating to you to need to update your scripts away from sTypo, would you be happy to know that the BASE table is present, and could be relied upon, in every Chinese / Japanese font in Google's library to provide accurate data regarding the ideographic em-box? And also that vmtx is present, and accurate? Obviously you'd still need to account for any other random Open Source font out there, but if GF's library could be relied upon, I think that would be a win.

@celestialphineas

celestialphineas commented Jan 16, 2025

While I realize it would likely be irritating to you to need to update your scripts away from sTypo, would you be happy to know that the BASE table is present, and could be relied upon, in every Chinese / Japanese font in Google's library to provide accurate data regarding the ideographic em-box? And also that vmtx is present, and accurate? Obviously you'd still need to account for any other random Open Source font out there, but if GF's library could be relied upon, I think that would be a win.

I'd certainly be happy to follow BASE and vmtx! :D

The only trouble is that legacy SC/TC fonts are a total mess to deal with, and sTypo is a convenient convention that has worked so well so far. The Glyphs app (@schriftgestalt) also seems to follow the convention, and does not yet support writing a BASE table.

I also wonder how urgent the cross-platform rendering impact of aligning hhea and sTypo would be, as the cross-platform behaviors of MS Word always seem a mystery that people have already got used to, and as for Web development, the line height is almost always specified in CSS.

I personally (but also practically) care much more about the behavior in Adobe applications, and therefore our foundry usually follows the Adobe convention, not necessarily the convention of Source Han, but sometimes also Adobe Heiti Std as an example. Things become even easier to us when the Latin part of a font is aligned to ideo + 150, as we do not even write a BASE table for such cases, and it seems to work magically fine automatically.

Another factor worth noting is that the actual C/J typesetting requirements for line height/gap are considered and determined based on a fixed point size of the body em-box. A "default" line height defined by the font does not make much sense in C/J typesetting. We usually expect typeface style changes not to cause dramatic overall layout changes when switching fonts in professional typesetting scenarios (and that's also why MS Word is so confusing).

@aaronbell
Collaborator Author

The only trouble is that legacy SC/TC fonts are a total mess to deal with, and sTypo is a convenient convention that has worked so well so far.

Yeah, totally understand you on that. My hope is that once the fonts have been "processed" by the onboarder to ensure alignment with the GF standard, that a lot of the mess will have been resolved, so it'll be easier for everyone :).

I also wonder how urgent the cross-platform rendering impact of aligning hhea and sTypo would be, as the cross-platform behaviors of MS Word always seem a mystery that people have already got used to, and as for Web development, the line height is almost always specified in CSS.

Yeah, MS Word does make things complex. When USE_TYPO_METRICS is enabled, Word on Mac uses hhea, Word on PC uses sTypo. So if we are enabling USE_TYPO_METRICS, then we have to align hhea to sTypo to ensure things work as expected. This is the same reason that when USE_TYPO_METRICS is disabled, hhea must follow usWin.
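The selection rule described in this reply can be sketched as follows. This models only the behavior as stated in the thread (Word on Mac reads hhea; Word on PC reads sTypo when USE_TYPO_METRICS is set, otherwise usWin) and is not a description of Word's actual implementation. The function name and dictionary keys are invented for the example.

```python
USE_TYPO_METRICS = 1 << 7  # OS/2.fsSelection bit 7


def word_line_metrics(platform, fs_selection, m):
    """Return the (ascender, descender) pair used for line spacing."""
    if platform == "mac":
        return m["hheaAscender"], m["hheaDescender"]
    if fs_selection & USE_TYPO_METRICS:
        return m["sTypoAscender"], m["sTypoDescender"]
    # usWinDescent is stored as a positive number, so negate it.
    return m["usWinAscent"], -m["usWinDescent"]


proposed = {
    "sTypoAscender": 942, "sTypoDescender": -238,
    "hheaAscender": 942, "hheaDescender": -238,
    "usWinAscent": 1102, "usWinDescent": 285,
}
# Flag set and hhea == sTypo: both platforms agree.
print(word_line_metrics("mac", USE_TYPO_METRICS, proposed))  # (942, -238)
print(word_line_metrics("pc", USE_TYPO_METRICS, proposed))   # (942, -238)
# Flag cleared with the same values: the platforms diverge.
print(word_line_metrics("pc", 0, proposed))                  # (1102, -285)
```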

I personally (but also practically) care much more about the behavior in Adobe applications, and therefore our foundry usually follows the Adobe convention

That's helpful to know. I think for GF, we are looking to achieve improved rendering in a wider range of environments vs just Adobe, so we have to account for the more "default" approaches implemented in web browsers / text processors / etc.

Another factor worth noting is that the actual C/J typesetting requirements for line height/gap are considered and determined based on a fixed point size of the body em-box.

Thanks for the references. The note you pointed out, though, indicates that the lineGap is "commonly set" between 50% and 100% of the height of the character frame. That's a pretty wide range!

The line gap for the type area is commonly set to a value between 50% and 100% of the height of the character frame used for the type area. A shorter line gap can be chosen in cases where the line length is short or the character size of the type area is relatively small.
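To put that range next to a font-defined default, here is a tiny sketch. The function name is hypothetical, and the comparison assumes a 1000-unit character frame with the proposed WenKai total of 1180 units.

```python
def line_height_range(frame_height, low=0.5, high=1.0):
    """Line height (frame plus gap) for a gap of `low`..`high` of the frame."""
    return frame_height * (1 + low), frame_height * (1 + high)


shortest, tallest = line_height_range(1000)
print(shortest, tallest)  # 1500.0 2000.0
# The proposed font default (942 + 238 = 1180) is below that range.
print(1180 < shortest)    # True
```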

@rutopio

rutopio commented Jan 16, 2025

I think lineGap value should be set to 0, because not all typesetting software supports negative line height values.

If we set lineGap value greater than 0, it could confuse users who require more compact layouts.

The real line height should be determined by the users, not predefined in the font metrics.

@NightFurySL2001
Contributor

The method I suggested was attempting to optimize more for keeping the ideographs centered on the body—so it is based on the existing em-square rather than following the guidance for Latin, which is optimizing for ensuring that there will not be any vertical collisions with stacked diacritics. In the case of WenKai, that means the sTypoAscender value will fall about 90 units below the top of the accent on the top of Ǘ, which given that we're trying to compromise between two very different scripts, seems like a viable option. One will always need to take priority.

The current metrics can do the same thing too, albeit in a somewhat limiting way (quite loosened in googlefonts/googlefonts.github.io#135). Since, as you mentioned, it is only 90 units below, it would make more sense if the default line height metrics could just show the character at full height instead of requiring users to adjust it manually. Clipping should not happen at all in these cases, especially since it might involve displaying of names and loss of information.

The convention of setting sTypo as the em-box is at least an industry standard among C/J font foundries. It would be unwise to change such a convention abruptly just for web typesetting when it could break other typesetting programs (e.g. WPS / FounderType / whatever software C/J users are using).

By using clipping metrics as line height metrics, in almost all cases we're going to end up with too-loose typesetting by default. I believe that users should be able to have more comfortable typesetting by default, and if they choose to adjust from there, that's up to them.

Another issue that comes up is what is even considered "too-loose" or "too-tight". The usual convention for C/J typesetting is at least 1.5 to 2em of line height, but the proposed metrics introduce an arbitrary value of 15–20% that is up to the personal preference of font designers. The proposed metrics could very well be viewed as "too-tight" by some end users. Given that line height will almost certainly be changed by web designers in the end, it would be better to leave the CJK metrics as-is, instead of enforcing extra height in the source font files.

Also, if you look at the Latin metrics, would you say the current GF Latin metrics end up with too-loose typesetting? They are similar to what CJK fonts have right now. A comparison of LXGW WenKai TC set according to the simpler proposal (a loosened version of what GF has currently) at googlefonts/googlefonts.github.io#135 would be nice too; it is easier to calculate than the current proposal (just take the highest glyph or Ǘ, excluding vertical-specific glyphs).

@aaronbell
Collaborator Author

Clipping should not happen at all in these cases, especially since it might involve the display of names and a loss of information.

I think that there is some confusion. To make it clear, clipping will not occur at all. usWin is the clipping metric, and as long as it is set to yMin / yMax (or similar), the Ǘ will not be clipped, even if it extends beyond the sTypo metrics. The sTypo metrics are intended to be purely positional, not clipping-related.
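As a minimal sketch of that clipping rule: the check below only compares the glyph extremes against usWin and ignores sTypo entirely. The function name and the sample numbers are illustrative assumptions, not values taken from any shipped font.

```python
def clip_overshoot(y_min, y_max, win_ascent, win_descent):
    """Return (top, bottom) overshoot in font units beyond the usWin box.

    usWinDescent is stored as a positive distance below the baseline,
    so it is compared against -y_min. (0, 0) means nothing pokes out.
    """
    top = max(0, y_max - win_ascent)
    bottom = max(0, -y_min - win_descent)
    return top, bottom


# Illustrative values: usWin set exactly to the font's yMax / yMin.
print(clip_overshoot(y_min=-268, y_max=1032, win_ascent=1032, win_descent=268))
# (0, 0)
```

Under this reading, a glyph that exceeds sTypoAscender but stays inside usWinAscent reports no overshoot.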

The convention of setting sTypo to the em-box is, at the least, an industry standard among C/J font foundries.

I can understand that, but it is not clear to me what the actual benefit of doing so is at this point in time. For example, there is an established industry convention in Japan of putting the Yen symbol at the backslash code point. There is no reason to do that, and it goes against the Unicode standard itself, yet many foundries do it because it is "convention".

Adobe CC does appear to make use of sTypo, but Adobe is also happy to prefer a BASE table if present. sTypo might be a fallback for missing vertical metrics data, but as long as we ensure that those tables are present, that fallback should never be hit.

So it comes down to (a) web / word processors, which (to my knowledge) rely only on the standard metrics: usWin on PC (when USE_TYPO_METRICS is disabled) and hhea on Mac, and (b) other specialized applications, which I am not familiar with, that look only at sTypo.
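A rough sketch of that selection logic, assuming the behavior described here holds. Individual applications may differ, the platform labels are hypothetical, and the only spec fact relied on is that USE_TYPO_METRICS is bit 7 of OS/2.fsSelection.

```python
USE_TYPO_METRICS = 1 << 7  # bit 7 of OS/2.fsSelection


def line_metrics(platform, fs_selection, typo, win, hhea):
    """Pick the (ascender, descender) pair a layout engine would use.

    platform: "mac" or "windows" (hypothetical labels for this sketch).
    typo, hhea: (ascender, descender) with a negative descender.
    win: (usWinAscent, usWinDescent), both stored as positive values.
    """
    if platform == "mac":
        return hhea
    if fs_selection & USE_TYPO_METRICS:
        return typo
    # Flip usWinDescent so every result uses the same sign convention.
    return (win[0], -win[1])


# Example using the 1000 UPM values from the table at the top of this issue.
print(line_metrics("windows", USE_TYPO_METRICS, (880, -120), (1160, 288), (1160, -288)))
# (880, -120)
```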

To that end, rather than blindly follow industry convention, I think it is worth a conversation about practical use of the fonts, and how we can ensure optimal default performance for users.

Another issue that comes up is what even counts as "too-loose" or "too-tight". The usual convention for C/J typesetting is at least 1.5 to 2em of line height, but the proposed metrics introduce an arbitrary value of 15–20% that is up to the personal preference of font designers. The proposed metrics could very well be viewed as "too-tight" by some end users. Given that line height will almost certainly be changed by web designers in the end, it would be better to leave the CJK metrics as-is, instead of enforcing extra height in the source font files.

The 18% is completely arbitrary and is based on the hhea values originally implemented in LXGW WenKai TC. I am not suggesting it as a hard requirement, but offering it as an example of the approach. It should be set larger or smaller depending on the design of the project. If you look above, I've specifically cited the 15%–20% range mentioned in other documents:

OS/2.sTypoAscender | ideoEmboxTop + (15–20% * emBox)/2

OS/2.sTypoDescender | ideoEmboxBottom - (15–20% * emBox)/2

It made sense to me to determine the ideal C/J vertical metrics using the ideoEmbox (since that is the primary use for the fonts), but if there's a better method, I'm definitely open. That's the main reason for making this proposal public :).
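To make the two formulas above concrete, here is a small sketch that pads the ideographic em-box symmetrically. The function name is made up for illustration, and the 880 / -120 em-box is an assumption taken from the 1000 UPM example in the table at the top of this issue.

```python
def cj_typo_metrics(ideo_top, ideo_bottom, extra=0.18):
    """Return (sTypoAscender, sTypoDescender) padded around the ideo em-box.

    extra is the added line height as a fraction of the em-box (0.15 to 0.20
    in the proposal); half goes above the em-box and half below it.
    """
    em_box = ideo_top - ideo_bottom
    pad = extra * em_box / 2
    return round(ideo_top + pad), round(ideo_bottom - pad)


print(cj_typo_metrics(880, -120))        # (970, -210)
print(cj_typo_metrics(880, -120, 0.20))  # (980, -220)
```

Because the padding is split evenly, the type body stays centered on the ideographic em-box whatever percentage is chosen.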

Also, if you look at the Latin metrics part, would you say the current GF Latin metrics will end up with too-loose typesetting?

Yes. I would say that the current Latin metrics are too loose.

As a test, I put together a comparison, all at single line height in MS Word for Mac:

Image

The top is the current GF CJK requirement following Source Han Sans. The Latin (Pinyin) here is set too loose for comfortable reading. Google Fonts' current spec recommends a total vertical height of no more than 130% of the UPM, and following the current Source Han Sans recommendations results in nearly 145%, which is clearly too large. (Note: the shipped version of LXGW WenKai doesn't actually follow the GF requirements. 🤦‍♂️ I must have overlooked this in my haste to prep that family for release, so I've updated it here to reflect the requirements properly.)

The middle follows the standard GF vertical metrics requirement of a maximum vertical height of 130% UPM with USE_TYPO_METRICS enabled. The Chinese and Pinyin text looks better, but is perhaps still a little too loose for this design given the short descenders. It also doesn't have the ideographs centered in the type body, which may or may not be an issue depending on user expectations. This could be resolved either by increasing the sTypo descender depth or by lowering the ascender value (though the Ắ would then no longer be fully encapsulated).

The bottom follows the 118% of the ideographic embox approach that I've used as a sample above, which is based on the hhea metrics values used in LXGW WenKai TC, and where the type body is centered on the ideographic embox. My initial feeling is that it is a touch on the tight side, but it does work. Maybe 120% could be better?
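For reference, the percentages quoted in this comparison are just total vertical height divided by UPM. A quick sketch, assuming a 1000 UPM font, the hhea values from the table at the top of this issue (1160 / -288), and an 880 / -120 em-box padded by 18%:

```python
def vertical_extent_ratio(ascender, descender, upm):
    """Total line height (ascender minus negative descender) as a fraction of UPM."""
    return (ascender - descender) / upm


def within_gf_limit(ascender, descender, upm, limit=1.30):
    """True if the total height stays at or under the 130% guidance."""
    return vertical_extent_ratio(ascender, descender, upm) <= limit


print(f"{vertical_extent_ratio(1160, -288, 1000):.1%}")  # 144.8%
print(f"{vertical_extent_ratio(970, -210, 1000):.1%}")   # 118.0%
print(within_gf_limit(1160, -288, 1000))                 # False
```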

which is easier to calculate than the current proposal (just take the highest glyph or Ǘ, excluding vertical-specific glyphs)

I can understand your perspective that implementing a formula for calculating the sTypo metrics might be adding unnecessary complication. A couple points:

  • One thing is that not all fonts may include full Pinyin support, or even any significant Latin coverage—this is a challenge on the Latin side too. So for C/J I thought it would make more sense to take an ideoEmBox approach, which is universal to C/J fonts.
  • Without some sort of % increase, simply taking the highest glyph or Ǘ could result in unnecessarily tight line heights. Say you had a font with just TC support and no Latin, or very limited Latin coverage, the tallest glyph could be less than the ideoEmBoxTop value. So using the ideoEmBox, at a minimum, would be better than otherwise. And applying a % increase on top of it will produce better results. So even if the formula adds some unnecessary complication for experienced designers, I think it would be helpful for beginners who don't necessarily know how to set these values.

@ButTaiwan

In the last version of Iansui, USE_TYPO_METRICS was enabled. As a result, part of the descenders cannot be rendered correctly in some Android apps (though it renders correctly on iOS).
Some Windows/Android apps treat sTypo metrics not only as the logical line height value but also as the rendering range. This may be a problem. I am not sure if OS/2.sTypoDescender = ideoEmboxBottom - (15–20% * emBox)/2 is sufficient.

Image

@NightFurySL2001
Contributor

One thing is that not all fonts may include full Pinyin support, or even any significant Latin coverage—this is a challenge on the Latin side too. So for C/J I thought it would make more sense to take an ideoEmBox approach, which is universal to C/J fonts.

Note that pinyin is mandatory in GB 2312 (although only lowercase; one of the GF Fontbakery checks requires the uppercase equivalents to be in the font, so they should be filled in anyway) and is present in Adobe-Japan1-6.

I think that there is some confusion. To make it clear, clipping will not occur at all. usWin is the clipping metric, and as long as it is set to yMin / yMax (or similar), the Ǘ will not be clipped, even if it extends beyond the sTypo metrics. The sTypo metrics are intended to be purely positional, not clipping-related.

Given that there are still some programs (e.g. Microsoft Word) that treat sTypo as a clipping metric, it would be better if sTypo, when enabled, included Hanyu Pinyin at the very least.

@NightFurySL2001
Contributor

Also of note from Microsoft Docs in full:

The OS/2.sTypoDescender and OS/2.sTypoAscender fields in a CJK font may specify metrics different from the HorizAxis.ideo and HorizAxis.idtp values in the BASE table. However, CJK font developers should be aware that some applications might not read the BASE table at all but simply use the OS/2.sTypoDescender and OS/2.sTypoAscender fields to describe the bottom and top edges of the ideographic em-box. If developers want their fonts to work correctly with such applications, they should ensure that any ideographic em-box values in the BASE table of their CJK fonts describe the same bottom and top edges as the OS/2.sTypoDescender and OS/2.sTypoAscender fields.

which mentions that if sTypo is to be set away from the em-box, then the BASE table bottom/top edges should match sTypo too.
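A minimal sketch of the consistency check implied by that passage, written against plain numbers rather than a font-parsing library. The function name is hypothetical; the parameters mirror HorizAxis.ideo / HorizAxis.idtp and the sTypo fields.

```python
def embox_mismatches(base_ideo, base_idtp, typo_descender, typo_ascender):
    """List where the BASE em-box edges disagree with sTypo; empty if consistent."""
    problems = []
    if base_ideo != typo_descender:
        problems.append(
            f"bottom edge: BASE ideo={base_ideo}, sTypoDescender={typo_descender}"
        )
    if base_idtp != typo_ascender:
        problems.append(
            f"top edge: BASE idtp={base_idtp}, sTypoAscender={typo_ascender}"
        )
    return problems


# sTypo padded away from the em-box (970 / -210) while BASE stays at 880 / -120.
for line in embox_mismatches(-120, 880, -210, 970):
    print(line)
```

An application that reads only sTypo to find the em-box would see the padded edges in this example, which is the situation the documentation warns about.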

@aaronbell
Collaborator Author

In the last version of Iansui, USE_TYPO_METRICS was enabled. As a result, part of the descenders cannot be rendered correctly in some Android apps (though it renders correctly on iOS). Some Windows/Android apps treat sTypo metrics not only as the logical line height value but also as the rendering range. This may be a problem. I am not sure if OS/2.sTypoDescender = ideoEmboxBottom - (15–20% * emBox)/2 is sufficient.

I might be a bit of a stick in the mud, but that's a failing of the app developer's code rather than one of the font.

BTW, in the case of Iansui, 118% would actually cover the descender value fully, though the highest peak of the pinyin wouldn't be covered, which may not be ideal for all scenarios.
Image

Given the requirement for Pinyin, maybe it would just be simpler to follow the standard non-CJK guidance here then? The top of sTypo would be placed at the top of Ǘ, with the descender set so that the total targets roughly 130%. For Iansui, that'd end up being 1032 / -268 versus 942 / -238 (with my method). It'd be looser than one might prefer, but at least everything would be covered.
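The arithmetic behind the 1032 / -268 figure, as a small sketch. The function name is made up, and a 1000 UPM is assumed, which is consistent with the numbers quoted above.

```python
def latin_style_typo(tallest_top, upm, target=1.30):
    """Pin sTypoAscender to the tallest required glyph (e.g. the top of Ǘ),
    then push sTypoDescender down until the total reaches target * UPM."""
    total = round(target * upm)
    return tallest_top, tallest_top - total


print(latin_style_typo(1032, 1000))  # (1032, -268)
```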

Given that there are still some programs (e.g. Microsoft Word) that treat sTypo as a clipping metric, it would be better if sTypo, when enabled, included Hanyu Pinyin at the very least.

I don't think that's true, at least of MS Word. I tested in both Word for Mac and Win Word and did not see any clipping on Ǻ, which exceeds the sTypo metrics:

Mac Word
Image

Win Word

Image

which mentions that if sTypo is to be set away from the em-box, then the BASE table bottom/top edges should match sTypo too.

If you know of any such applications (that ignore the usWin metrics and just use sTypo in that way), I'd be interested to hear! It is also worth noting that the Microsoft documentation specifies that the hhea values should align with sTypo in describing the ideographic em-box, but that causes inconsistencies between Mac and PC (when USE_TYPO_METRICS is disabled).
