ICU4X objects that try_new with a provider should store and expose the resolved locale #3906
Comments
What is the "primary provider-backed payload"? It's not well defined in all cases. The whole concept of the "resolved locale" is fraught and it's not up to ICU4X to perpetuate it. #58 has suggestions for how to solve the 402 problem in user land. |
For the collator, it would be the payload for
My understanding of provider internals is insufficient for understanding what exactly #58 proposes for the ECMA-402 resolved locale info, especially in the case where the glue code opted to implement "best fit" by delegating to ICU4X's lookup mechanism. Should an application do what Boa does and try to load data payloads ahead of actual ICU4X object instantiation, on the assumption that the preflight will match the actual instantiation? Or should the application load whatever is considered the primary payload lazily after the fact, if the code calling ECMA-402 APIs requests the resolved options? Either way, which key should the glue code query for if the primary payload for a given ICU4X object isn't well defined in all cases? |
I think it would be fine if ICU4X suggested which key to use for the resolved locale in cases required by 402
Neither. The data provider should be instrumented to get the resolved locale out of the DataResponseMetadata. |
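One way to read "instrumenting the data provider" is a wrapper that forwards each load and records the locale reported in the response metadata. The sketch below is only an illustration of that idea: `Provider`, `Response`, `Metadata`, `FallbackProvider`, and `RecordingProvider` are hypothetical, simplified stand-ins and not the real ICU4X traits or types, and the toy fallback (truncating at the last `-`) is an assumption made to give the wrapper something to record.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

// Hypothetical, simplified stand-ins; these are NOT the real ICU4X types.
struct Metadata {
    /// Locale the data was actually loaded for, if the provider reports it.
    resolved_locale: Option<String>,
}

struct Response {
    metadata: Metadata,
    payload: String,
}

trait Provider {
    fn load(&self, key: &str, requested_locale: &str) -> Option<Response>;
}

/// Toy provider: falls back to parent locales by truncating at the last '-'.
struct FallbackProvider {
    data: HashMap<(String, String), String>,
}

impl Provider for FallbackProvider {
    fn load(&self, key: &str, requested_locale: &str) -> Option<Response> {
        let mut candidate = requested_locale.to_string();
        loop {
            if let Some(payload) = self.data.get(&(key.to_string(), candidate.clone())) {
                return Some(Response {
                    metadata: Metadata { resolved_locale: Some(candidate) },
                    payload: payload.clone(),
                });
            }
            match candidate.rfind('-') {
                Some(pos) => candidate.truncate(pos),
                None => return None,
            }
        }
    }
}

/// Wrapper that remembers, per key, the resolved locale of the last successful load.
struct RecordingProvider<P> {
    inner: P,
    resolved: RefCell<HashMap<String, String>>,
}

impl<P: Provider> RecordingProvider<P> {
    fn new(inner: P) -> Self {
        Self { inner, resolved: RefCell::new(HashMap::new()) }
    }

    fn resolved_locale_for(&self, key: &str) -> Option<String> {
        self.resolved.borrow().get(key).cloned()
    }
}

impl<P: Provider> Provider for RecordingProvider<P> {
    fn load(&self, key: &str, requested_locale: &str) -> Option<Response> {
        let response = self.inner.load(key, requested_locale)?;
        if let Some(locale) = &response.metadata.resolved_locale {
            self.resolved.borrow_mut().insert(key.to_string(), locale.clone());
        }
        Some(response)
    }
}
```

With this shape, the glue code would construct the object through the wrapper and afterwards ask the wrapper which locale was recorded for the key it treats as primary. The wrapper requires a provider argument, so it does not apply to constructors that take no provider.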
Does it make sense to merely suggest it as opposed to providing a concrete crate for it with documentation that the crate is only provided for 402 compat? Or providing a Cargo option to enable such code in each component directly? I wouldn't mind if the option was named to discourage use along the lines of
The example seems to preclude the use of the baked-mode constructors that don't take a provider argument, which is unfortunate. I'm not sure, but my initial reaction is that I'd rather do a duplicative lookup if the JS app looks at the resolved options than defeat the baked code path for object construction. |
Would it be incorrect to always return the requested locale as the resolved locale? |
That question can be understood in at least three senses: 1) what the caller wants to know if they care to actually inspect the resolved locale, 2) what's Web-compatible, 3) what fits within the spec's notion of implementation-defined. In sense 1, incorrect. (Most notably, if the requested locale has a non-language component and the resolved locale does not retain that component, this shows that the implementation's data does not explicitly alter the main flavor of the language in a way that the component would change. Is this actionable information for the caller? Perhaps not.) In sense 2, probably not Web-compatible, considering that things that deviate from what major browsers do tend not to be Web-compatible, but maybe Web-compatible in the sense that the information isn't really that actionable anyway. In sense 3, maybe not strictly incorrect if you read all implementation-defined behavior as not required to even make sense and the observer not getting an infinite number of observations. |
Returning the requested locale as the resolved locale gives the same result as preresolving locales at datagen time. If we don't have The bigger problem is falling back to I agree that a solution that is compatible with compiled data would be preferable. |
I didn't look at the source, but I think |
Do you know any concrete uses of this information, which would break if we deviate? I don't want to let Chrome dictate how standards should be interpreted. |
From a very quick look at GitHub search, I see one use case beyond test cases and debug logging: Determining the host locale by executing So perhaps just echoing back the requested locale could work and not break the Web. Of course, this only looks at the case where |
Yeah, echoing back the requested locale probably works. I think the most useful piece of information you can get is which one out of a list of locales you got. For example, if the locales requested were |
Should ECMA-402 change to require this? |
@zbraniecki Thoughts on the above? |
The web reality is that it will return the closest locale the engine had data for:
|
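The "closest locale the engine had data for" behavior, combined with the earlier point about knowing which entry of a requested list was matched, can be approximated with a lookup that truncates subtags from the right. This is a simplified sketch, not the exact ECMA-402 or RFC 4647 algorithm; `resolve_locale` and its return shape are hypothetical names chosen for illustration.

```rust
/// Returns the index of the requested locale that matched and the
/// available locale it resolved to, trying requests in priority order
/// and dropping trailing subtags until something matches.
fn resolve_locale(requested: &[&str], available: &[&str]) -> Option<(usize, String)> {
    for (index, request) in requested.iter().enumerate() {
        let mut candidate = request.to_string();
        loop {
            if available.contains(&candidate.as_str()) {
                return Some((index, candidate));
            }
            match candidate.rfind('-') {
                // Drop the last subtag and try the parent, e.g. "en-US" -> "en".
                Some(pos) => candidate.truncate(pos),
                // No subtags left; move on to the next requested locale.
                None => break,
            }
        }
    }
    None
}
```

Echoing back the requested locale would correspond to returning `requested[index]` instead of the truncated candidate, which still tells the caller which entry of the list was used.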
The key question is whether ICU4X should push the first implementation to ship ICU4X-backed ECMA-402 to the Web to bear the cost of finding out if deviating from the current Web reality is Web-compatible. Given that ICU4X is supposed to work as an ECMA-402 back end, it would be rather odd for ICU4X to resist being able to implement what ECMA-402 currently says in a similar way to how deployed implementations do it. Perhaps echoing back the requested locale would work, but is e.g. Chrome willing to try it out to see if it's Web-compatible? I suggest doing what I originally requested but behind a repulsively-named Cargo option. That is, if |
|
@jedel1043 , see above:
|
Personally, I think it's alright if Boa has to deviate from V8 in order to offer better locale results. |
Based on tc39/ecma402#830 (comment), there could be a way for datagen to store the list of locales for which it generated data and then some API to access that list. Needs design work. Discuss with: Optional: |
Can we merge the discussion into #58? It seems to be going the same direction |
In favor of merging. |
Not exactly the same set of questions. This thread is about ResolvedLocale and #58 is about SupportedLocales. The solutions may overlap. |
In #58 we have concluded that exposing a set of supported locales is not feasible, but determining the resolved locale is. You literally have a draft ResolvedLocalesAdapter for that issue, so I really struggle to understand what the difference is. |
I changed #4607 to be closing this issue. |
|
Also see comment from @mihnita in #2237 (comment) |
2.0 blocking question: does baked data do this? From preliminary tests there might be a non-trivial size impact; I'll get some better numbers. |
I still think
|
|
ECMA-402 requires various objects to be able to expose the resolved options. Common across different types is the resolved locale.
For ECMA-402 compat, we should make various ICU4X objects call `take_metadata_and_payload` instead of `take_payload` when loading their primary provider-backed payload and store the `DataLocale` from the metadata. We should then have a convention across ICU4X for retrieving that `Locale` from the ICU4X object.

The finer points of `DataLocale` vs. `Locale` are unclear to me, so I'm not sure if the convention should be `fn resolved_locale(&self) -> &DataLocale`, allowing the application to call `.into_locale()`, or `fn resolved_locale(&self) -> Locale`.
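The proposal in this issue can be sketched with toy types. Everything below is a hypothetical, simplified stand-in: `ResponseMetadata`, `DataResponse`, `load`, and `ToyCollator` are not the real ICU4X definitions, locales are plain strings rather than `DataLocale` or `Locale`, and only the method names `take_payload` and `take_metadata_and_payload` are borrowed from the text above. The point is just the shape: keep the metadata at construction time and expose the locale through an accessor.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins; the real ICU4X types and signatures differ.
#[derive(Default)]
struct ResponseMetadata {
    /// Locale the payload was actually loaded for (after fallback).
    locale: Option<String>,
}

struct DataResponse {
    metadata: ResponseMetadata,
    payload: Option<Vec<String>>,
}

impl DataResponse {
    /// Discards the metadata, so the resolved locale is lost.
    #[allow(dead_code)]
    fn take_payload(self) -> Option<Vec<String>> {
        self.payload
    }

    /// Keeps the metadata alongside the payload.
    fn take_metadata_and_payload(self) -> Option<(ResponseMetadata, Vec<String>)> {
        let payload = self.payload?;
        Some((self.metadata, payload))
    }
}

/// Toy loader with parent-locale fallback by truncating at the last '-'.
fn load(data: &HashMap<String, Vec<String>>, requested: &str) -> DataResponse {
    let mut candidate = requested.to_string();
    loop {
        if let Some(rules) = data.get(&candidate) {
            return DataResponse {
                metadata: ResponseMetadata { locale: Some(candidate) },
                payload: Some(rules.clone()),
            };
        }
        match candidate.rfind('-') {
            Some(pos) => candidate.truncate(pos),
            None => return DataResponse { metadata: ResponseMetadata::default(), payload: None },
        }
    }
}

struct ToyCollator {
    rules: Vec<String>,
    resolved_locale: Option<String>,
}

impl ToyCollator {
    fn try_new(data: &HashMap<String, Vec<String>>, requested: &str) -> Option<Self> {
        // Keep the metadata instead of dropping it, then store its locale.
        let (metadata, rules) = load(data, requested).take_metadata_and_payload()?;
        Some(Self { rules, resolved_locale: metadata.locale })
    }

    /// Accessor following the convention proposed in this issue.
    fn resolved_locale(&self) -> Option<&str> {
        self.resolved_locale.as_deref()
    }
}
```

In this sketch a collator requested for `sv-FI` against data that only contains `sv` reports `sv` as its resolved locale. Whether the real accessor should return a borrowed `DataLocale` or an owned `Locale` is the open question raised above and is not settled by this example.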