-
Notifications
You must be signed in to change notification settings - Fork 630
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix utf8 handling in C++ with the default Std IO streams #1965
base: main
Are you sure you want to change the base?
Fix utf8 handling in C++ with the default Std IO streams #1965
Conversation
src/lib/OpenEXR/ImfStdIO.cpp
Outdated
@@ -27,15 +27,25 @@ namespace | |||
inline ifstream* | |||
make_ifstream (const char* filename) | |||
{ | |||
return new ifstream (std::filesystem::path (filename), | |||
#if __cplusplus >= 202002L | |||
return new ifstream (std::filesystem::path (std::u8string ((const char8_t*)filename)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this is a safe cast, the sign of char is platform specific. See e.g. this discussion, it would seem there is still some clarifications to be done in the standard re UTF-8; it seems somehow broken/incomplete in C++20: https://stackoverflow.com/questions/57603013/how-to-safely-convert-const-char-to-const-char8-t-in-c20
It seems to say it's best to just keep using the C++17 u8path...
Re C++20 and future OpenEXR API, they seem to suggest adding a UTF-8 specific make_ifstream (const char8_t* filename)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am quite disappointed what the standards committee has done in this regard. While I don't think the signedness is a particular concern, the strict aliasing rules violation probably does have some semantic effect for the optimizers. As a result, until there is more clarity, I'll just remove the c++20 path for now, as there is no VFX reference platform specifying it yet anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, just switched it to a view w/ transform, which is still disappointing, but at least avoids the aliasing issue and actually an extra string copy from what was there.
eddf7a6
to
fcd1678
Compare
the std::filesystem::path mechanism is reasonable, but needs to be told it to look for a utf-8 string, and is not automatically that, and the standards committee must have realized there were issues and change the api in c++20 Signed-off-by: Kimball Thurston <[email protected]>
fcd1678
to
3f19ca4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as far as I can tell the CI failures are meaningless with respect to this PR. lgtm.
// class IStream | ||
// class IStre | ||
// am |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert?
|
||
void testUpdateMeta (const std::string& tempdir); | ||
|
||
void testWriteScans (const std::string& tempdir); | ||
void testWriteTiles (const std::string& tempdir); | ||
void testWriteMultiPart (const std::string& tempdir); | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert unnecessary change?
The use of the new std::filesystem::path to cleanup messy filebuf manipulations to create a stream was incorrectly not using u8path under windows. This also adds future support for C++20 where they change the API.
Adds tests to test the utf-8 support which was previously incomplete