Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[3.0] Languages Overhaul #7853

Merged
merged 35 commits into from
Jan 30, 2024

Conversation

jdarwood007
Copy link
Member

@jdarwood007 jdarwood007 commented Nov 5, 2023

Move the languages directory form /Themes/default/languages to /Languages
Additionally, this also moves agreement.txt from / to /Languages

  • We introduce a new config variable for Settings.php, $languagesdir which defaults to __DIR__ . '/Languages'
  • We now use the locale instead of a custom name
  • Languages are now in a sub-directory. I.e. /Languages/en_US/
  • We no longer use a suffix on filenames (i.e. we no longer have index.english.php)
  • Rename index to General
  • This does not change that we can still have languages in /Themes/*/languages, we do deprecate the usage of this for override.
  • Upgrader will attempt to clean up the old languages directory, it's a best effort. I coded this in a way for us to add additional cleanup steps in the future.
  • We will rename Settings.{language} to {locale}/ThemeStrings to encourage themes to use this as an alternative to modifying base languages

@DiegoAndresCortes Can you review the theme's logic is sound.

Todo:

  • Upgrade's convert UTF-8 logic needs to handle the new locales
  • Test installing a mod that is using $themesdir/default/languages
  • TSources/PackageManager/SubsPackage.php still calls $languagedir (and $languages_dir) with the old paths. The issue is how we handle the logic for migration to the 'new' paths.
  • Test installing a mod that uses $languagedir

@Sesquipedalian
Copy link
Member

I've moved this to the 3.0 Alpha 2 milestone. For Alpha 1, the only goals are:

  • Merge OOP code
  • Find and fix bugs introduced via OOP refactoring. (Goal is to have all functionality just like in SMF 2.1.)
  • Basic installer and upgrader updates. (Create upgrade_3.0_*.sql files, etc.)

@Sesquipedalian
Copy link
Member

Sesquipedalian commented Nov 5, 2023

The syntax check failed, but that should be an easy fix. Looks like you're just missing some trailing commas.

In order to pass the build tools check, you'll need to update the BuildTools repository so that the scripts look in the correct place for the language files depending on the version, and then update composer.lock so that it uses the correct version of BuildTools for the checks.

Sources/Theme.php Outdated Show resolved Hide resolved
@jdarwood007
Copy link
Member Author

Antechinus suggested we do a per directory language. So we would do /Languages/english/index.english.php. We could drop the language out of the filename as well.

This may also be a good time to discuss. Should we also go locale for our naming instead of our custom method. So index.english.php becomes index.en-us.php. @Dzonny any opinion on that?

@DiegoAndresCortes
Copy link
Member

I like the idea and support it.
Could we still support subdirectories? E.g languages/english/LargeMod/

@jdarwood007
Copy link
Member Author

You have an idea of what you think it would be the best source of truth regarding how we associate a locale to a language?
Use the unicode data? https://www.unicode.org/cldr/charts/42/delta/en.html
Use Crowdin data? https://developer.crowdin.com/language-codes/

You can already do sub folders with languages. Just call it with the folder name. SimpleDesk already do this:

load_language('sd_language/SimpleDeskAdmin');

@DiegoAndresCortes
Copy link
Member

DiegoAndresCortes commented Nov 10, 2023

You can already do sub folders with languages. Just call it with the folder name. SimpleDesk already do this:

load_language('sd_language/SimpleDeskAdmin');

Yes I'm aware of that I do the same and Bugo does it too. I was referring to the folder per language format. For example, a MOD would now have this structure:

  • /languages/
    • /english/SimpleDesk/
    • /spanish/SimpleDesk/
      Is this correct?

You have an idea of what you think it would be the best source of truth regarding how we associate a locale to a language?

I know some of our language translations are mixed in this regard, so I'm not sure either which one would be preferred.
The spanish_latin translation is using Crowdin locale for some reason, es-419. I personally like unicode more.

@Sesquipedalian
Copy link
Member

The CLDR is standard. Using it to designate our locales and languages will greatly simplify the interactions between our language files and other PHP functions.

@jdarwood007
Copy link
Member Author

jdarwood007 commented Nov 10, 2023

@DiegoAndresCortes Correct, they would be that way. Although with the locale discussion it would be /en-us/Simpledesk and /es-mx/SimpleDesk.

@Sesquipedalian I agree, which is why I suggested it. We would also be able to build in support to read the users locale in their request, check it against our known language list of locales, and then change to that language. No funky workarounds needed. I don't think I need to target that with this PR, but moving us over to en-us and the folder structure seems to be a win all around that we can make for this major release.
I think I see you pointed out that we should follow the CLDR standards. I didn't know if Crowdin's differed enough for us to want to use theirs, or if we are happy with what Crowdin has and we can just go forward with using what they have.

@Sesquipedalian
Copy link
Member

Yeah, I would want us to follow the CLDR rather than any vendor-specific variant.

Reorganizing all the language files into subdirectories named after CLDR locale strings is an excellent plan.

I would prefer to see the same structure enforced for theme-specific language files as well. Stricter consistency is actually easier for everyone to understand and work with in the long run.

@jdarwood007 jdarwood007 marked this pull request as draft November 11, 2023 00:49
@jdarwood007
Copy link
Member Author

Moving this to draft. I still need to finish the upgrader work. The upgrader will have to handle updating the members table with a old => new.
I also need to test editing and copying functions that are handled under the Themes admin action.

This got things working for me.

@Sesquipedalian
Copy link
Member

One thing: It would be best to use the canonical forms of locale labels in all cases. The canonical forms use underscores rather than hyphens between label parts, and the country code part is always uppercase. So, en_US rather than en-us.

An easy way to get the canonical form of any locale label is by using Locale::Canonicalize.

(I also posted this comment on the related BuildTools pull request.)

@jdarwood007
Copy link
Member Author

I think we can make that work. I wanted it to be able to handle the browser information as sent by the browser. Appears to be able to work:

<?php
var_dump($_SERVER['HTTP_ACCEPT_LANGUAGE']);

$langs = array_map(function($l) {
	[$v,] = explode(';', $l);
	return $v;
}, explode(',', $_SERVER['HTTP_ACCEPT_LANGUAGE']));

var_dump($langs);

foreach ($langs as $locale)
	var_dump(Locale::canonicalize($locale));

Passing in en-US,en;q=0.9 returns en_US on the first result.

@jdarwood007
Copy link
Member Author

I'm going to need help from @Dzonny and the rest of our translators. To handle the upgrade logic, we need to convert our custom naming format to the proper locales.
What I have so far, which is most likely wrong:

	'english' => 'en_US',
	'afrikaans' => 'af_ZA',
	'albanian' => 'sq_AL',
	'arabic' => 'ar_SA',
	'bulgarian' => 'bg_BG',
	'catalan' => 'ca_ES',
	'chinese_simplified' => 'zh_CN',
	'chinese_traditional' => 'zh_TW',
	'croatian' => 'hr_HR',
	'czech_informal' => 'cs',
	'czech' => 'cs_CZ',
	'danish' => 'da_DK',
	'dutch' => 'nl_NL',
	'english_british' => 'en_GB',
	'esperanto' => 'eo_UY',
	'estonian' => 'et_EE',
	'finnish' => 'fi_FI',
	'french' => 'fr_FR',
	'galician' => 'gl_ES',
	'german_informal' => 'de',
	'german' => 'de_DE',
	'greek' => 'el_GR',
	'hebrew' => 'he_IL',
	'hungarian' => 'hu_HU',
	'indonesian' => 'id_ID',
	'italian' => 'it_IT',
	'japanese' => 'ja_JP',
	'kurdish_kurmanji' => 'kmr_TR',
	'lithuanian' => 'lt_LT',
	'macedonian' => 'mk_MK',
	'malay' => 'ms_MY',
	'norwegian' => 'no_NO',
	'persian' => 'fa_IR',
	'polish' => 'pl_PL',
	'portuguese_brazilian' => 'pt_BR',
	'portuguese_pt' => 'pt_PT',
	'romanian' => 'ro_RO',
	'russian' => 'ru_RU',
	'serbian_cyrillic' => 'sr_SP',
	'serbian_latin' => 'sr_CS',
	'slovak' => 'sk_SK',
	'slovenian' => 'sl_SI',
	'spanish_es' => 'es_ES',
	'spanish_latin' => 'es_MX',
	'swedish' => 'sv_SE',
	'thai' => 'th_TH',
	'turkish' => 'tr_TR',
	'ukrainian' => 'uk_UA',
	'urdu' => 'ur_PK',
	'vietnamese' => 'vi_VN',
	'welsh' => 'cy_GB',
	'armenian_east' => 'hy',
	'armenian_west' => 'hy',
	'basque' => 'eu_ES',
	'cambodian' => 'km_KH',
	'bangla' => 'bn',
	'bosnian' => 'bs_BA',
	'chinese-simplified' => 'zh_Hans',
	'chinese-traditional' => 'zh_Hant',
	'filipino_tagalog' => 'fil',
	'filipino_visayan' => 'fil',
	'georgian' => 'ka_GE',
	'hindi' => 'hi',
	'icelandic' => 'is',
	'korean' => 'ko_KR',
	'kurdish_sorani' => 'ckb_IR',
	'lao' => 'lo',
	'latvian' => 'lv_LV',
	'malayalam' => 'ml_IN',
	'mongolian' => 'mn',
	'sakha' => 'sah',
	'sinhala' => 'si',
	'spanish' => 'es',
	'telugu' => 'te',
	'turkmen' => 'tk',
	'uzbek_latin' => 'uz_UZ',
	'serbian' => 'sr',
	'filipino_taglog' => 'fil',
	'filipino' => 'fil',
	'lituanian' => 'lt',
	'catala' => 'ca',
	'norsk' => 'nn',
	'brazilian' => 'es_BR',
	'portuguese' => 'pt',
	'azerbaijani' => 'az_AZ',
	'belarusian' => 'be_BY',
	'english_pirate' => 'en_PT',
	'uyghur' => 'ug',

Going to start with the localization team and go from there.

@jdarwood007 jdarwood007 changed the title Language dir move Languages Overhaul Nov 12, 2023
@jdarwood007
Copy link
Member Author

I realize that list will have to be in the source code. To maintain compatibility with older mods, we will have to handle the old languages. They would also install into the root language folders, instead of in the sub directories. But we only really need to do this for non system files or mods that are not fully compatible.

@Sesquipedalian I think it would be best if I triggered something that only happened if we call from a combat situation. Seems like we could modify the loadLanguage legacy call to pass an additional param, which would then allow the fallback.

Copy link
Member

@DiegoAndresCortes DiegoAndresCortes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No testing yet from me atm, but this also needs to be removed:

// Set the following variable to true if this theme requires the optional theme strings file to be loaded.
Theme::$current->settings['require_theme_strings'] = false;

@jdarwood007 jdarwood007 changed the title Languages Overhaul [3.0] Languages Overhaul Jan 7, 2024
@jdarwood007 jdarwood007 force-pushed the languageDirMove branch 4 times, most recently from ba3ca35 to a93bf38 Compare January 22, 2024 02:07
…ion data

Fix issue with users that have a session language unable to affect the forum language when changing with ?language
Fix issue that locale detection affects logged in users who have defined a language
@jdarwood007 jdarwood007 marked this pull request as ready for review January 27, 2024 21:13
@jdarwood007
Copy link
Member Author

@Sesquipedalian I feel this is ready now.

@Sesquipedalian
Copy link
Member

Cool. I'll review it in the next day or two.

@Sesquipedalian
Copy link
Member

Sesquipedalian commented Jan 29, 2024

I found a few issues. It was easier to fix them than it was to document them all in review comments. How do my changes look to you, @jdarwood007?

@jdarwood007
Copy link
Member Author

Looks good. I thought I had tested the language editor, but I must have broken it since I last tested it. Do we need another set of eyes on this since you made changes?

@jdarwood007
Copy link
Member Author

Also note, that I haven't reviewed the changes we need to make to Crowdin. We haven't even discussed that yet on how we handle this. I will get a discussion going on the team boards as we also need to get the branch going in Crowdin.

@Sesquipedalian
Copy link
Member

Looks good. I thought I had tested the language editor, but I must have broken it since I last tested it. Do we need another set of eyes on this since you made changes?

If you can just take a moment to make sure that file select box populates correctly on your machine, that should be enough.

Also note, that I haven't reviewed the changes we need to make to Crowdin. We haven't even discussed that yet on how we handle this. I will get a discussion going on the team boards as we also need to get the branch going in Crowdin.

A discussion will be good. I had no involvement with setting up the CrowdIn integration, so I have no more idea than the next guy about what needs to be done there. As I recall, @sbulen was the primary developer involved with that at the time, so maybe start the discussion in the Development Contributors board so that he can participate in the discussion.

@jdarwood007
Copy link
Member Author

Sent up a small fix. The charset and locale didn't show correctly in the list

Screen Shot 2024-01-29 at 4 22 42 PM

Because $txt is local, not from Lang::$txt

@Sesquipedalian
Copy link
Member

Nice catch. I think we can go ahead a merge this now.

@Sesquipedalian Sesquipedalian merged commit 3519459 into SimpleMachines:release-3.0 Jan 30, 2024
3 checks passed
@live627 live627 added Localization Language & internationalization and removed Language labels Mar 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Localization Language & internationalization
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants