-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure all strings are valid and don't break builds #549
Comments
@brian-smith-tcril I looked up for tools to validate JSON translation files, but found none online. Does it make sense to convert json to po-files and validate them? I'm not sure if there's a two-way conversion between the two format. I know that Apache superset has a po2json utility, but I haven't tested it. |
@OmarIthawi it looks like translate-toolkit could work, it supports converting to and from json. I was hoping there would be a way to validate without converting out of json by just using the formatjs cli directly, but that doesn't seem to be possible. I'm not sure if there are any strange edge cases that will trip up the validation with the conversion back and forth, but I think trying out validation with translate-toolkit's json2po and po2json is a reasonable path forward. |
It seems that there has been a regression in validation. For example, see this translation where the slot/param name was translated: We have tooling in https://github.com/openedx/i18n-tools that's supposed to catch this sort of thing. It currently picks up about 300 errors in edx-platform alone. Is there some reason this is no longer in use? |
We sometimes see rendering errors in the error page itself, which then cause another attempt at rendering the error page. By catching all errors raised during error-page render and substituting in a hardcoded string, we can reduce server resources, avoid logging massive sequences of recursive stack traces, and still give the user *some* indication that yes, there was a problem. This should help address #35151 At least one of these rendering errors is known to be due to a translation error. There's a separate issue for restoring translation quality so that we avoid those issues in the future (openedx/openedx-translations#549) but in general we should catch all rendering errors, including unknown ones. Testing: - In `lms/envs/devstack.py` change `DEBUG` to `False` to ensure that the usual error page is displayed (rather than the debug error page). - Add line `1/0` to the top of the `student_dashboard` function in `common/djangoapps/student/views/dashboard.py` to make that view error. - In `lms/templates/static_templates/server-error.html` replace `static.get_platform_name()` with `None * 7` to make the error template itself produce an error. - Visit <http://localhost:18000/dashboard>. Without the fix, the response takes 10 seconds and produces a 6 MB, 85k line set of stack traces and the page displays "A server error occurred. Please contact the administrator." Without the fix, the response takes less than a second and produces three stack traces (one of which contains the error page's rendering error).
We sometimes see rendering errors in the error page itself, which then cause another attempt at rendering the error page. I'm not sure _exactly_ how the loop is occurring, but it looks something like this: 1. An error is raised in a view or middleware and is not caught by application code 2. Django catches the error and calls the registered uncaught error handler 3. Our handler tries to render an error page 4. The rendering code raises an error 5. GOTO 2 (until some sort of server limit is reached) By catching all errors raised during error-page render and substituting in a hardcoded string, we can reduce server resources, avoid logging massive sequences of recursive stack traces, and still give the user *some* indication that yes, there was a problem. This should help address #35151 At least one of these rendering errors is known to be due to a translation error. There's a separate issue for restoring translation quality so that we avoid those issues in the future (openedx/openedx-translations#549) but in general we should catch all rendering errors, including unknown ones. Testing: - In `lms/envs/devstack.py` change `DEBUG` to `False` to ensure that the usual error page is displayed (rather than the debug error page). - Add line `1/0` to the top of the `student_dashboard` function in `common/djangoapps/student/views/dashboard.py` to make that view error. - In `lms/templates/static_templates/server-error.html` replace `static.get_platform_name()` with `None * 7` to make the error template itself produce an error. - Visit <http://localhost:18000/dashboard>. Without the fix, the response takes 10 seconds and produces a 6 MB, 85k line set of stack traces and the page displays "A server error occurred. Please contact the administrator." Without the fix, the response takes less than a second and produces three stack traces (one of which contains the error page's rendering error).
We sometimes see rendering errors in the error page itself, which then cause another attempt at rendering the error page. I'm not sure _exactly_ how the loop is occurring, but it looks something like this: 1. An error is raised in a view or middleware and is not caught by application code 2. Django catches the error and calls the registered uncaught error handler 3. Our handler tries to render an error page 4. The rendering code raises an error 5. GOTO 2 (until some sort of server limit is reached) By catching all errors raised during error-page render and substituting in a hardcoded string, we can reduce server resources, avoid logging massive sequences of recursive stack traces, and still give the user *some* indication that yes, there was a problem. This should help address #35151 At least one of these rendering errors is known to be due to a translation error. There's a separate issue for restoring translation quality so that we avoid those issues in the future (openedx/openedx-translations#549) but in general we should catch all rendering errors, including unknown ones. Testing: - In `lms/envs/devstack.py` change `DEBUG` to `False` to ensure that the usual error page is displayed (rather than the debug error page). - Add line `1/0` to the top of the `student_dashboard` function in `common/djangoapps/student/views/dashboard.py` to make that view error. - In `lms/templates/static_templates/server-error.html` replace `static.get_platform_name()` with `None * 7` to make the error template itself produce an error. - Visit <http://localhost:18000/dashboard>. Without the fix, the response takes 10 seconds and produces a 6 MB, 85k line set of stack traces and the page displays "A server error occurred. Please contact the administrator." With the fix, the response takes less than a second and produces three stack traces (one of which contains the error page's rendering error).
@timmc-edx We have tests in place that is supposed to match openedx-translations/scripts/validate_translation_files.py Lines 32 to 36 in e9a7fe1
I couldn't make i18n-tools work with non-standard file structure such as What do you suggest? |
I don't know exactly what msgfmt checks, but I think it's just basic file structure. i18n_tool checks a variety of other things, including whether there are the same parameter slots in the inputs and outputs. You can see all the things it checks here: https://github.com/openedx/i18n-tools/blob/master/i18n/validate.py I think it would be appropriate to modify that tool to allow different directory structures (or even just expect a different one). |
Yes, it seems like I'll keep this issue open for now. Unless someone works on it, I plan to fit within my Open Source contributions late August / early September. |
For reference, here are the strings that |
Thanks @timmc-edx. That's mostly machine translation issues. I recall the Arabic strings to be of high quality and haven't had such issues in the past. The solution I have in mind is to:
Fixing them by hand all at once might be prohibitively costly. What do you think? |
I think we could do it by hand -- I bet there's a way to bulk find-and-unapprove translations in Transifex, and we could use search strings like But yeah, I think the first step would be adding i18n_tool to the CI step so that translation updates are blocked until they get cleaned up. |
@timmc-edx there's a way to bulk review/unreview/delete transaltions. But it'll be one language at a time. I have capacity issues at the moment regarding my contributions time i.e. I've done much more than planned this month and I need to focus on other projects at hand.
Yes. If help is there, I'd really appreciate it. Otherwise I'd have to clear my plate a bit. |
We sometimes see rendering errors in the error page itself, which then cause another attempt at rendering the error page. I'm not sure _exactly_ how the loop is occurring, but it looks something like this: 1. An error is raised in a view or middleware and is not caught by application code 2. Django catches the error and calls the registered uncaught error handler 3. Our handler tries to render an error page 4. The rendering code raises an error 5. GOTO 2 (until some sort of server limit is reached) By catching all errors raised during error-page render and substituting in a hardcoded string, we can reduce server resources, avoid logging massive sequences of recursive stack traces, and still give the user *some* indication that yes, there was a problem. This should help address openedx#35151 At least one of these rendering errors is known to be due to a translation error. There's a separate issue for restoring translation quality so that we avoid those issues in the future (openedx/openedx-translations#549) but in general we should catch all rendering errors, including unknown ones. Testing: - In `lms/envs/devstack.py` change `DEBUG` to `False` to ensure that the usual error page is displayed (rather than the debug error page). - Add line `1/0` to the top of the `student_dashboard` function in `common/djangoapps/student/views/dashboard.py` to make that view error. - In `lms/templates/static_templates/server-error.html` replace `static.get_platform_name()` with `None * 7` to make the error template itself produce an error. - Visit <http://localhost:18000/dashboard>. Without the fix, the response takes 10 seconds and produces a 6 MB, 85k line set of stack traces and the page displays "A server error occurred. Please contact the administrator." With the fix, the response takes less than a second and produces three stack traces (one of which contains the error page's rendering error).
@OmarIthawi mentioned an example where invalid strings could break a build here.
When the strings were checked into the repos directly, this would fail the individual repo CI. With the switch to
openedx-translations
/openedx-atlas
, strings updated in transifex and synced toopenedx-translations
will not trigger a CI build on any given repo.We'll want to ensure we don't have strings that break builds checked into
openedx-translations
, so we should add validation somewhere.In the same comment, @OmarIthawi mentioned
That seems like a good plan to me
Tasks
The text was updated successfully, but these errors were encountered: