Take into account encoding of source file for syntax error #124188
Labels
3.12 (bugs and security fixes)
3.13 (bugs and security fixes)
3.14 (new features, bugs and security fixes)
interpreter-core (Objects, Python, Grammar, and Parser dirs)
topic-C-API
Comments
serhiy-storchaka added the interpreter-core (Objects, Python, Grammar, and Parser dirs), topic-C-API, 3.12 (bugs and security fixes), 3.13 (bugs and security fixes), and 3.14 (new features, bugs and security fixes) labels on Sep 17, 2024
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue on Sep 17, 2024
* Detect source file encoding.
* Use the "replace" error handler even for UTF-8 (default) encoding.
* Remove the BOM.
* Fix detection of too long lines if they contain NUL.
* Return the head rather than the tail for truncated long lines.
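The steps listed in this commit message can be sketched in Python. This is only an illustration of the idea, not the C implementation from the commit: the function name `error_line` and the `MAX_CHARS` limit are hypothetical, and the standard-library `tokenize.detect_encoding()` stands in for the encoding detection. One simplification to note: `detect_encoding()` itself raises `SyntaxError` if the line it inspects for a coding cookie is not valid UTF-8, which this sketch does not handle.

```python
import io
import tokenize

MAX_CHARS = 80  # illustrative limit only, not the value CPython uses


def error_line(raw: bytes, lineno: int, max_chars: int = MAX_CHARS) -> str:
    """Return the text of line `lineno` (1-based) for use in an error message."""
    # Detect the encoding from a BOM or a PEP 263 coding cookie.
    # The result is "utf-8" by default and "utf-8-sig" when a BOM is present.
    encoding, _ = tokenize.detect_encoding(io.BytesIO(raw).readline)

    # Split on "\n" only, so a NUL byte inside a line does not cut it short.
    line = raw.split(b"\n")[lineno - 1].rstrip(b"\r")

    # Decode with the "replace" handler, even for UTF-8, so invalid bytes
    # become U+FFFD instead of raising. The "utf-8-sig" codec also drops
    # the BOM at the start of the first line.
    text = line.decode(encoding, errors="replace")

    # For overly long lines, keep the head rather than the tail.
    return text[:max_chars]
```

For example, `error_line(b"\xef\xbb\xbfx = 1\n", 1)` yields the line without the BOM, so column offsets computed by the parser line up with the returned text.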
serhiy-storchaka added a commit that referenced this issue on Sep 24, 2024
* Detect source file encoding.
* Use the "replace" error handler even for UTF-8 (default) encoding.
* Remove the BOM.
* Fix detection of too long lines if they contain NUL.
* Return the head rather than the tail for truncated long lines.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue on Sep 24, 2024
* Detect source file encoding.
* Use the "replace" error handler even for UTF-8 (default) encoding.
* Remove the BOM.
* Fix detection of too long lines if they contain NUL.
* Return the head rather than the tail for truncated long lines.
(cherry picked from commit e2f7107)
Co-authored-by: Serhiy Storchaka
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue on Sep 24, 2024
* Detect source file encoding.
* Use the "replace" error handler even for UTF-8 (default) encoding.
* Remove the BOM.
* Fix detection of too long lines if they contain NUL.
* Return the head rather than the tail for truncated long lines.
(cherry picked from commit e2f7107)
Co-authored-by: Serhiy Storchaka
serhiy-storchaka added a commit that referenced this issue on Sep 24, 2024
* Detect source file encoding.
* Use the "replace" error handler even for UTF-8 (default) encoding.
* Remove the BOM.
* Fix detection of too long lines if they contain NUL.
* Return the head rather than the tail for truncated long lines.
(cherry picked from commit e2f7107)
serhiy-storchaka added a commit that referenced this issue on Oct 7, 2024
* Detect source file encoding.
* Use the "replace" error handler even for UTF-8 (default) encoding.
* Remove the BOM.
* Fix detection of too long lines if they contain NUL.
* Return the head rather than the tail for truncated long lines.
(cherry picked from commit e2f7107)
Co-authored-by: Serhiy Storchaka
Currently, most syntax errors raised in the compiler (except those raised in the parser) use PyErr_ProgramTextObject() to get the line of code. It does not know the encoding of the source file and interprets the file as UTF-8 (failing if it contains non-UTF-8 sequences). The parser uses _PyErr_ProgramDecodedTextObject().

There are two ways to solve this issue:

[...] PyErr_ProgramTextObject(). Since the latter is in the public C API, this can also affect third-party code.

There are other issues with PyErr_ProgramTextObject():

This all applies to PyErr_ProgramText() as well.

Linked PRs
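To make the failure mode described in this issue concrete, here is a small pure-Python illustration. It does not call the C functions named above; it only shows why reading a source line as UTF-8 breaks when the file declares another encoding. The sample file contents and both helper names are hypothetical.

```python
import io
import tokenize

# Hypothetical source file saved as Latin-1, with a PEP 263 coding cookie.
SOURCE = "# -*- coding: latin-1 -*-\nname = 'caf\u00e9'\n".encode("latin-1")


def decode_line_assuming_utf8(raw: bytes, lineno: int) -> str:
    """Read line `lineno` (1-based) as UTF-8, ignoring the declared encoding."""
    return raw.splitlines()[lineno - 1].decode("utf-8")


def decode_line_with_declared_encoding(raw: bytes, lineno: int) -> str:
    """Read line `lineno` (1-based) using the encoding the file declares."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(raw).readline)
    return raw.splitlines()[lineno - 1].decode(encoding)


if __name__ == "__main__":
    try:
        decode_line_assuming_utf8(SOURCE, 2)
    except UnicodeDecodeError as exc:
        # The Latin-1 byte 0xE9 is not a valid UTF-8 sequence here.
        print("UTF-8 assumption failed:", exc.reason)
    print(decode_line_with_declared_encoding(SOURCE, 2))
```

The first helper mirrors the UTF-8 assumption and raises `UnicodeDecodeError` on the second line; the second helper honors the coding cookie and recovers the line text.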