Incorrect Translations #12

Open
rposborne opened this issue Mar 12, 2015 · 10 comments

@rposborne

Thank you for this great gem. We recently swapped out the countries gem to leverage i18n_data, and we have found some issues that are rooted in bad translations. What is the best way to get these corrected? The Debian repos aren't exactly "approachable".

See countries/countries#243 (comment)

@grosser
Owner

grosser commented Mar 12, 2015

Adding modifications to the dump rake task should be a simple way of doing it. I don't think anyone is using the live data anyway ...
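
One way such a modification step might look is sketched below. This is only an illustration: `CORRECTIONS` and `apply_corrections` are hypothetical names, not part of i18n_data, and the corrected values are placeholders rather than vetted translations.

```ruby
# Hypothetical override table: locale => { alpha2 code => corrected name }.
# The values are placeholders for illustration, not vetted translations.
CORRECTIONS = {
  "XH" => { "BZ" => "Belize", "UG" => "Uganda" }
}.freeze

# Return a copy of `names` with any corrections for `locale` applied on top.
# Locales without an entry in the table are returned unchanged.
def apply_corrections(locale, names, corrections = CORRECTIONS)
  names.merge(corrections.fetch(locale, {}))
end

# Sample data shaped like the upstream dump (code => name).
upstream = { "BE" => "Belgium", "BZ" => "Belgium", "UG" => "Canada" }
puts apply_corrections("XH", upstream).inspect
```

Keeping the overrides in one table would make it easy to drop an entry once the upstream data is fixed.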

@rposborne
Author

Certainly. I did a quick run-through on our side, and it looks to be a bit more widespread than I thought. Mainly, the XH translation is riddled with duplicates.

Basically, any name that is used twice within a given translation is a smell, wouldn't you say?

@grosser
Owner

grosser commented Mar 12, 2015

yep, sounds good


@tilo

tilo commented Mar 13, 2015

Here are two of the issues; both are in the XH file!
Someone messed up and probably checked in half-translated files?

$ grep BZ cache/file_data_provider/countries* | grep Belg
cache/file_data_provider/countries-XH.txt:BZ;;Belgium

$ grep UG cache/file_data_provider/countries* | grep Cana
cache/file_data_provider/countries-XH.txt:UG;;Canada

let's see..

$ git blame cache/file_data_provider/countries-XH.txt | grep Cana
a63d006 (grosser 2012-06-20 09:31:15 -0700 40) CA;;Canada
f875daf (grosser 2009-01-18 12:20:41 +0100 184) RW;;Canada
a63d006 (grosser 2012-06-20 09:31:15 -0700 232) UG;;Canada

$ git blame cache/file_data_provider/countries-XH.txt | grep Belg
a63d006 (grosser 2012-06-20 09:31:15 -0700 22) BE;;Belgium
a63d006 (grosser 2012-06-20 09:31:15 -0700 23) BZ;;Belgium

So is XH an auto-generated file?

Would it be a safe heuristic/test to say that the same string for a country cannot map to more than one Alpha2 code, at least for some languages?
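
A minimal sketch of that heuristic, assuming each line of a cache file has the `CODE;;Name` shape seen in the grep output above. `duplicate_names` is a hypothetical helper, not something the gem provides, and the `DE;;Germany` line is made up to show a non-duplicate.

```ruby
# Group Alpha2 codes by translated name and keep only the names
# that are shared by more than one code.
# Assumes each line looks like "BZ;;Belgium".
def duplicate_names(lines)
  by_name = Hash.new { |hash, name| hash[name] = [] }
  lines.each do |line|
    code, name = line.strip.split(";;", 2)
    # Skip blank lines and entries without a name.
    next if code.nil? || name.nil? || name.empty?
    by_name[name] << code
  end
  by_name.select { |_name, codes| codes.size > 1 }
end

sample = ["BE;;Belgium", "BZ;;Belgium", "CA;;Canada", "RW;;Canada", "UG;;Canada", "DE;;Germany"]
duplicate_names(sample).each { |name, codes| puts "#{name}: #{codes.join(', ')}" }
```

In a test, the lines could come from something like `File.readlines` on a cache file, with the assertion that the returned hash is empty. Whether that holds for every language is an open question, as noted above.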

@grosser
Owner

grosser commented Mar 13, 2015

I don't know ... just hit regenerate and see if that fixes it?

On Thu, Mar 12, 2015 at 10:07 PM, Tilo [email protected] wrote:

What's XH anyhow? it's user assigned..



@rposborne
Author

So I can confirm that the bugs are upstream in pkg-isocodes. My favorite is that the US Virgin Islands are called "Finland".

Canada
http://anonscm.debian.org/cgit/pkg-isocodes/iso-codes.git/tree/iso_3166/xh.po#n1347

Finland
http://anonscm.debian.org/cgit/pkg-isocodes/iso-codes.git/tree/iso_3166/xh.po#n1783

The XH language appears to be South African (Xhosa), but there is no way this translation file is "quality". http://en.wikipedia.org/wiki/Xhosa_language

@grosser
Owner

grosser commented Mar 13, 2015

Maybe using CLDR as the data source would give better results :(


@rposborne
Author

Maybe so. Regardless, CLDR seems much more approachable than the Debian source does. It comes down in JSON, which is nice 😄
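
For illustration, reading country names out of CLDR JSON could look roughly like this. The nesting `main` → locale → `localeDisplayNames` → `territories` is an assumption about the CLDR JSON layout, the inline sample is hand-written rather than real CLDR data, and `territory_names` is a hypothetical helper.

```ruby
require "json"

# Extract two-letter territory names for `locale` from CLDR-style JSON text.
# Assumes the nesting main -> locale -> localeDisplayNames -> territories.
def territory_names(json_text, locale)
  data = JSON.parse(json_text)
  territories = data.dig("main", locale, "localeDisplayNames", "territories") || {}
  # Keep only two-letter codes; drop numeric region codes such as "001".
  territories.select { |code, _name| code.match?(/\A[A-Z]{2}\z/) }
end

# Hand-written sample in the assumed shape, not real CLDR data.
sample = <<~JSON
  { "main": { "en": { "localeDisplayNames": { "territories": {
    "001": "World", "BE": "Belgium", "BZ": "Belize", "CA": "Canada"
  } } } } }
JSON

puts territory_names(sample, "en").inspect
```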

@rposborne
Author

@grosser I would almost kill the XH translation altogether... it doesn't even look like anything is translated.
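
A rough way to put a number on "nothing is translated" is to count how many names in a locale are identical to the English ones. The sketch below assumes both tables are plain `code => name` hashes; `untranslated_ratio` is a hypothetical helper and the sample data is made up.

```ruby
# Fraction of shared country codes whose localized name is identical
# to the English name. 1.0 means nothing differs from English.
def untranslated_ratio(english, localized)
  shared = english.keys & localized.keys
  return 0.0 if shared.empty?
  same = shared.count { |code| english[code] == localized[code] }
  same.to_f / shared.size
end

# Made-up sample data for illustration only.
english   = { "BE" => "Belgium", "CA" => "Canada", "FI" => "Finland" }
localized = { "BE" => "Belgium", "CA" => "Canada", "FI" => "Finland" }
puts untranslated_ratio(english, localized)
```

A high ratio is only a hint, since some names are legitimately the same across languages, so a threshold for dropping a locale would need to be chosen with care.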

@grosser
Owner

grosser commented Mar 17, 2015

go crazy :)

