Add support for BYTEA/BLOB #511
\x
\x5c7831315c7830315c7830325c7830335c7830345c7830355c7830365c7830375c7830385c7830395c7830415c7830425c7830435c7830445c7830455c783046
\x
\x5c783030
@JelteF currently the output from select * from blob_tbl is an escaped version of the inserted rows. Is this fine, or do we need to change that?
Okay, this needs to change. All the non-empty bytea results are wrong. When using Postgres execution, it instead gives the following rows:
\x
\x110102030405060708090a0b0c0d0e0f
\x
\x00
\x07
It seems like you're hex-encoding the string an additional time somewhere, because 5c783030 is \x00 in ASCII, i.e. including the \ and x characters. So I think that when converting from the DuckDB type to the PG type you're encoding the string representation of the value instead of the raw bytes.
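To make the double encoding concrete, here is a minimal standalone sketch. It does not use DuckDB or PostgreSQL code; HexEncode, ToByteaHex and DemonstrateDoubleEncoding are hypothetical helper names chosen for illustration. It only shows that hex-encoding the four-character text \x00 yields the 5c783030 seen in the wrong output, while hex-encoding the single raw zero byte yields the expected 00.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: hex-encode every byte (lowercase, two digits per byte).
std::string HexEncode(const std::string &bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(bytes.size() * 2);
    for (unsigned char c : bytes) {
        out.push_back(digits[c >> 4]);
        out.push_back(digits[c & 0x0f]);
    }
    return out;
}

// Hypothetical helper: render bytes like the hex bytea text output,
// i.e. a literal backslash and 'x' followed by the hex digits.
std::string ToByteaHex(const std::string &bytes) {
    return "\\x" + HexEncode(bytes);
}

// Compares encoding the raw byte with encoding its text representation.
void DemonstrateDoubleEncoding() {
    const std::string raw(1, '\0');    // one raw byte: 0x00
    const std::string text = "\\x00";  // four characters: '\', 'x', '0', '0'

    assert(ToByteaHex(raw) == "\\x00");         // expected row
    assert(ToByteaHex(text) == "\\x5c783030");  // row from the buggy output
}
```

Calling DemonstrateDoubleEncoding() passes both assertions, which matches the diagnosis above: the wrong rows are what you get when the already escaped text, rather than the underlying bytes, is fed to the encoder.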
Found it: converting the value to a string was converting those bytes to their string representation.
Changed GetValue to GetValueUnsafe for getting raw bytes in string_t
\x07

(6 rows)
Can you add a comparison test too? Something like:
SELECT * FROM blob_tbl WHERE a = '\x00';
auto str = value.GetValueUnsafe<duckdb::string_t>();
auto blob_len = str.GetSize();
auto blob = str.GetDataUnsafe();
bytea* result = (bytea *)palloc0(blob_len + VARHDRSZ);
nit: make format will re-format this line as
bytea *result = (bytea *)palloc0(blob_len + VARHDRSZ);
Thanks for the report. Fixed by: #518
In #496 we started using the newest clang-format to format our files, but CI was still installing the old version. This meant that we were not catching some unformatted files correctly. An example of this being: #511 (comment). This starts using the correct clang-format version in CI too and formats any incorrectly formatted files.
Fixes #464