[Enhancement] Support some compress functions #47307
base: master
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Also, remember to format your file.
Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
                    uint32_t result, size_t input_rows_count) const override {
    // LOG(INFO) << "Executing FunctionCompress with " << input_rows_count
Remove these commented-out lines.
col_data[idx] = '0', col_data[idx + 1] = 'x';
for (int i = 0; i < 4; i++) {
    unsigned char byte = (value >> (i * 8)) & 0xFF;
    col_data[idx + 2 + i * 2] = "0123456789ABCDEF"[byte >> 4]; // 高4位 (high 4 bits)
Don't use Chinese in comments.
Also, give the magic values names instead of hard-coding them.
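One possible way to act on the magic-values comment, as a rough sketch only. Every constant and function name below is hypothetical and does not appear in the PR. The low-nibble write is also an assumption, since the diff excerpt shows only the high-nibble line.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical named constants replacing the literals 2, 4, 8, 0xFF, and 10.
constexpr char kHexDigits[] = "0123456789ABCDEF";
constexpr std::size_t kPrefixLen = 2;    // the leading "0x"
constexpr std::size_t kHeaderBytes = 4;  // bytes in the 32-bit value
constexpr std::size_t kHexPerByte = 2;   // hex digits written per byte
constexpr std::size_t kHeaderLen = kPrefixLen + kHeaderBytes * kHexPerByte;
constexpr unsigned kBitsPerByte = 8;
constexpr unsigned kNibbleBits = 4;
constexpr unsigned kByteMask = 0xFF;
constexpr unsigned kNibbleMask = 0x0F;

static_assert(kHeaderLen == 10, "\"0x\" plus eight hex digits");

// Writes "0x" followed by the four bytes of `value`, least significant byte
// first, as two uppercase hex digits per byte (same order as the loop in the diff).
std::string encode_header(std::uint32_t value) {
    std::string out(kHeaderLen, '0');
    out[1] = 'x';
    for (std::size_t i = 0; i < kHeaderBytes; ++i) {
        const unsigned byte = (value >> (i * kBitsPerByte)) & kByteMask;
        const std::size_t pos = kPrefixLen + i * kHexPerByte;
        out[pos] = kHexDigits[byte >> kNibbleBits];      // high 4 bits
        out[pos + 1] = kHexDigits[byte & kNibbleMask];   // low 4 bits (assumed)
    }
    return out;
}
```

With this layout, `encode_header(0x12345678)` yields `0x78563412`, because the least significant byte is written first.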
auto st = compression_codec->compress(data, &compressed_str);

if (!st.ok()) {
Add a comment explaining when this can fail.
Add test cases like regression-test/suites/query_p0/sql_functions/test_template_one_arg.groovy
Done.
We don't need to modify this file anymore.
std::string func_name = "compress";
InputTypeSet input_types = {TypeIndex::String};

// 压缩多个不同的字符串 (compress several different strings)
Don't use Chinese comments.
std::string uncompressed;
Slice data;
Slice uncompressed_slice;
for (int row = 0; row < input_rows_count; row++) {
Use size_t, not int.
    illegal = 1;
} else {
    if (data[0] != '0' || data[1] != 'x') {
        LOG(INFO) << "illegal: "
Don't log at INFO level here.
    if (x >= 'A' && x <= 'F') return true;
    return false;
};
auto trans = [](char x) {
Just use std::from_chars and std::to_chars instead of your hand-written lambdas.
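A minimal sketch of what that suggestion could look like, assuming C++17 `<charconv>` is available. The helper names `append_hex_byte` and `parse_hex_byte` are hypothetical and are not taken from the PR. Note that `std::to_chars` emits lowercase digits without a leading zero, whereas the code under review writes uppercase, so the padding below is manual and the casing differs.

```cpp
#include <charconv>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Appends `byte` to `out` as exactly two lowercase hex digits.
void append_hex_byte(std::string& out, unsigned char byte) {
    char buf[2];
    // A value <= 0xFF always fits in two hex digits, so this cannot overflow buf.
    const auto res = std::to_chars(buf, buf + 2, static_cast<unsigned>(byte), 16);
    const auto len = static_cast<std::size_t>(res.ptr - buf);
    if (len == 1) out.push_back('0');  // to_chars does not zero-pad
    out.append(buf, len);
}

// Parses exactly two hex digits (either case) into a byte.
// Returns std::nullopt for any other input, including "0x".
std::optional<unsigned char> parse_hex_byte(std::string_view two) {
    if (two.size() != 2) return std::nullopt;
    unsigned value = 0;
    const char* first = two.data();
    const char* last = first + two.size();
    const auto res = std::from_chars(first, last, value, 16);
    // Require that both characters were consumed as hex digits.
    if (res.ec != std::errc{} || res.ptr != last) return std::nullopt;
    return static_cast<unsigned char>(value);
}
```

`std::from_chars` does not accept a `0x` prefix, leading whitespace, or a plus sign, so the `"0x"` header would still need to be checked separately before decoding the digits.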
// Print the compressed string (after compression)
// LOG(INFO) << "Compressed string at row " << row << ": "
//           << std::string(reinterpret_cast<const char*>(col_data.data()));
col_offset[row] = col_offset[row - 1] + 10 + compressed_str.size() * 2;
What's this value for?
The first ten characters of the compressed value are "0x" followed by eight hex digits. After that, each byte of the compressed data is written as two hexadecimal digits, which is where `10 + compressed_str.size() * 2` comes from.
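To make the arithmetic in that reply concrete, here is a small sketch of the length and offset calculation. The names `encoded_size` and `build_offsets` are hypothetical, and the sketch uses plain `std::vector` rather than the column types in the PR. It carries a running total instead of reading `col_offset[row - 1]`.

```cpp
#include <cstddef>
#include <vector>

constexpr std::size_t kHeaderLen = 10;  // "0x" plus eight hex digits
constexpr std::size_t kHexPerByte = 2;  // each compressed byte becomes two hex digits

// Length of one encoded row, given the size of its compressed payload in bytes.
constexpr std::size_t encoded_size(std::size_t compressed_bytes) {
    return kHeaderLen + compressed_bytes * kHexPerByte;
}

// Cumulative end offsets for a sequence of rows, one entry per row.
std::vector<std::size_t> build_offsets(const std::vector<std::size_t>& compressed_sizes) {
    std::vector<std::size_t> offsets;
    offsets.reserve(compressed_sizes.size());
    std::size_t end = 0;
    for (std::size_t n : compressed_sizes) {
        end += encoded_size(n);
        offsets.push_back(end);
    }
    return offsets;
}
```

For compressed payloads of 0, 3, and 1 bytes, the rows occupy 10, 16, and 12 characters, so the end offsets are 10, 26, and 38.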
What problem does this PR solve?
Adds the compress and uncompress functions, similar to MySQL's COMPRESS and UNCOMPRESS.
Issue Number: close #45530
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)