-
Notifications
You must be signed in to change notification settings - Fork 1.8k
out_s3: Add ZSTD Compression Support #10794
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
WalkthroughAdds Zstandard (zstd) as a compression option across AWS integrations: defines Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Client as Records
participant AWSComp as flb_aws_compress
participant ZSTD as flb_zstd
participant S3 as out_s3 plugin
participant S3API as Amazon S3
Note over S3: config compression = zstd
Client->>S3: send(records)
S3->>AWSComp: compress(records, type=ZSTD)
AWSComp->>ZSTD: flb_zstd_compress(data)
ZSTD-->>AWSComp: compressed_data
AWSComp-->>S3: compressed_data
S3->>S3: get_content_encoding_header(ZSTD) -> "zstd"
S3->>S3API: PutObject(body=compressed_data, Content-Encoding: zstd)
S3API-->>S3: 200 OK
rect rgba(230,245,255,0.5)
Note over AWSComp,S3: New/changed flow: zstd compression + Content-Encoding selection
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
This patch adds zstd based compression support to out_s3 plugin. This references the new aws compression support for zstd using the flb_zstd compression mechanism. It also adds a new content header for zstd compression Signed-off-by: Ujjwal Sharma <[email protected]> Signed-off-by: usharma <[email protected]>
zstd compression type added for the aws compression Signed-off-by: Ujjwal Sharma <[email protected]> Signed-off-by: usharma <[email protected]>
This patch adds zstd compression to existing compression options for aws. It references the compression login in flb_zstd Signed-off-by: Ujjwal Sharma <[email protected]> Signed-off-by: usharma <[email protected]>
zstd basic test cases added for compression detection and decoding validation Signed-off-by: Ujjwal Sharma <[email protected]> Signed-off-by: usharma <[email protected]>
Signed-off-by: usharma <[email protected]>
Implemented a struct switch to select encoding header based on compression method. This can be extended in the future. Currently gzip was hardcoded. Signed-off-by: usharma <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/out_s3/s3.c (1)
1384-1402
: Fix: header/body mismatch when compression fails in put_all_chunksOn compression failure, you upload uncompressed data but still set
Content-Encoding
based on ctx->compression, causing clients to misinterpret the object. Temporarily suppress the header when the upload is uncompressed.@@ - void *payload_buf = NULL; + void *payload_buf = NULL; size_t payload_size = 0; @@ - if (ret == -1) { + if (ret == -1) { flb_plg_error(ctx->ins, "Failed to compress data, uploading uncompressed data instead to prevent data loss"); } else { flb_plg_info(ctx->ins, "Pre-compression chunk size is %zu, After compression, chunk is %zu bytes", buffer_size, payload_size); flb_free(buffer); buffer = (void *) payload_buf; buffer_size = payload_size; } } @@ - ret = s3_put_object(ctx, (const char *) - fsf->meta_buf, - chunk->create_time, buffer, buffer_size); + /* Ensure headers reflect what is actually sent */ + { + int saved_compression = ctx->compression; + if (saved_compression != FLB_AWS_COMPRESS_NONE && payload_buf == NULL) { + /* compression failed above, sending uncompressed */ + ctx->compression = FLB_AWS_COMPRESS_NONE; + } + ret = s3_put_object(ctx, (const char *) + fsf->meta_buf, + chunk->create_time, buffer, buffer_size); + ctx->compression = saved_compression; + }Note: Using a local override avoids a larger signature change to
create_headers
/s3_put_object
while keeping behavior correct.Also applies to: 1398-1402
🧹 Nitpick comments (3)
tests/internal/aws_compress.c (1)
60-74
: Make zstd compression test conditional; consider brittleness of golden output
- Guard the whole test with
FLB_HAVE_ZSTD
.- Golden compressed bytes can differ across zstd versions. If flakes arise, switch to “compress then decompress equals input” instead of comparing a base64 golden.
-void test_compression_zstd() +#ifdef FLB_HAVE_ZSTD +void test_compression_zstd() { struct flb_aws_test_case cases[] = { { "zstd", "hello hello hello hello hello hello", "KLUv/SAjZQAAMGhlbGxvIAEAuUsR", 0 }, { 0 } }; flb_aws_compress_test_cases(cases); } +#endifIf you want me to provide a non-golden variant of this test to avoid version drift, say the word and I'll draft it.
plugins/out_s3/s3.c (2)
88-113
: Content-Encoding helper is solid; small improvementNice consolidation. Consider making the headers
static const
to place them in .rodata and prevent unintended writes.-static struct flb_aws_header gzip_header = { +static const struct flb_aws_header gzip_header = { ... -static struct flb_aws_header zstd_header = { +static const struct flb_aws_header zstd_header = {
4028-4033
: Clarify build-time availability of zstd in config helpUsers on builds without Zstd will see “unknown compression: zstd”. Add a note that zstd requires Zstandard support at build time; also suggest using a .zst suffix in s3_key_format.
- "Compression type for S3 objects. 'gzip', 'arrow', 'parquet' and 'zstd' are the supported values. " + "Compression type for S3 objects. Supported: 'gzip', 'arrow', 'parquet', 'zstd'. " + "Note: 'zstd' is available only if Fluent Bit was built with Zstandard support. " @@ - "If 'zstd' is selected, the Content-Encoding HTTP Header will be set to 'zstd'." + "If 'zstd' is selected, the Content-Encoding HTTP Header will be set to 'zstd'. " + "When using 'zstd', consider adding a '.zst' suffix to s3_key_format for clarity."
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
include/fluent-bit/aws/flb_aws_compress.h
(1 hunks)plugins/out_s3/s3.c
(6 hunks)src/aws/flb_aws_compress.c
(2 hunks)tests/internal/aws_compress.c
(6 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/aws/flb_aws_compress.c (1)
src/flb_zstd.c (1)
flb_zstd_compress
(30-54)
tests/internal/aws_compress.c (1)
src/flb_zstd.c (1)
flb_zstd_uncompress
(130-163)
🔇 Additional comments (1)
include/fluent-bit/aws/flb_aws_compress.h (1)
28-28
: ZSTD compression type addition looks goodNo ABI/API concerns from adding
FLB_AWS_COMPRESS_ZSTD 4
.
We need to free the s3 headers from memory in case compression not present Signed-off-by: usharma <[email protected]>
Signed-off-by: usharma <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
plugins/out_s3/s3.c (1)
215-225
: Early-return leak fix acknowledgedFreeing s3_headers on NULL encoding_header resolves the leak noted earlier.
🧹 Nitpick comments (3)
plugins/out_s3/s3.c (3)
87-111
: Return a const header pointerThese headers are immutable; make the function return type const to reflect that and tighten the API.
-static struct flb_aws_header *get_content_encoding_header(int compression_type) +static const struct flb_aws_header *get_content_encoding_header(int compression_type)
181-181
: Make local pointer const to match new function signatureMinor const-correctness.
- struct flb_aws_header *encoding_header = NULL; + const struct flb_aws_header *encoding_header = NULL;
1205-1210
: Rename preCompress_size for consistencyUse snake_case to match the file’s predominant style.
- size_t preCompress_size = 0; + size_t pre_compress_size = 0; @@ - preCompress_size = body_size; + pre_compress_size = body_size; @@ - flb_plg_info(ctx->ins, "Pre-compression upload_chunk_size= %zu, After compression, chunk is only %zu bytes, " - "the chunk was too small, using PutObject to upload", preCompress_size, body_size); + flb_plg_info(ctx->ins, "Pre-compression upload_chunk_size= %zu, After compression, chunk is only %zu bytes, " + "the chunk was too small, using PutObject to upload", pre_compress_size, body_size);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
plugins/out_s3/s3.c
(5 hunks)
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.222Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.222Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: tests/internal/aws_compress.c:39-42
Timestamp: 2025-08-29T06:24:26.125Z
Learning: In Fluent Bit, ZSTD compression support is enabled by default and does not require conditional compilation guards (like #ifdef FLB_HAVE_ZSTD) around ZSTD-related code declarations and implementations.
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.819Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: tests/internal/aws_compress.c:7-7
Timestamp: 2025-08-29T06:25:02.540Z
Learning: In Fluent Bit, ZSTD (zstandard) compression library is bundled directly in the source tree at `lib/zstd-1.5.7` and is built unconditionally as a static library. Unlike optional external dependencies, ZSTD does not use conditional compilation guards like `FLB_HAVE_ZSTD` and is always available. Headers like `<fluent-bit/flb_zstd.h>` can be included directly without guards.
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: src/aws/flb_aws_compress.c:26-26
Timestamp: 2025-08-29T06:24:44.771Z
Learning: In Fluent Bit, ZSTD support is always available and enabled by default. The build system automatically detects and uses either the system libzstd library or builds the bundled ZSTD version. Unlike other optional dependencies like Arrow which use conditional compilation guards (e.g., FLB_HAVE_ARROW), ZSTD does not require conditional includes or build flags.
📚 Learning: 2025-08-29T06:25:27.222Z
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.222Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.
Applied to files:
plugins/out_s3/s3.c
📚 Learning: 2025-08-29T06:25:02.540Z
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: tests/internal/aws_compress.c:7-7
Timestamp: 2025-08-29T06:25:02.540Z
Learning: In Fluent Bit, ZSTD (zstandard) compression library is bundled directly in the source tree at `lib/zstd-1.5.7` and is built unconditionally as a static library. Unlike optional external dependencies, ZSTD does not use conditional compilation guards like `FLB_HAVE_ZSTD` and is always available. Headers like `<fluent-bit/flb_zstd.h>` can be included directly without guards.
Applied to files:
plugins/out_s3/s3.c
📚 Learning: 2025-08-29T06:24:26.125Z
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: tests/internal/aws_compress.c:39-42
Timestamp: 2025-08-29T06:24:26.125Z
Learning: In Fluent Bit, ZSTD compression support is enabled by default and does not require conditional compilation guards (like #ifdef FLB_HAVE_ZSTD) around ZSTD-related code declarations and implementations.
Applied to files:
plugins/out_s3/s3.c
📚 Learning: 2025-08-29T06:24:55.819Z
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.819Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.
Applied to files:
plugins/out_s3/s3.c
📚 Learning: 2025-08-29T06:25:27.222Z
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.222Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.
Applied to files:
plugins/out_s3/s3.c
🧬 Code graph analysis (1)
plugins/out_s3/s3.c (1)
include/fluent-bit/flb_mem.h (1)
flb_free
(126-128)
🔇 Additional comments (3)
plugins/out_s3/s3.c (3)
185-187
: Header count condition updated for zstd — looks good
4028-4033
: Config help text correctly documents zstd addition — looks good
87-111
: Verify CreateMultipartUpload sets Content-Encoding for gzip/zstdEnsure the multipart upload path (create_multipart_upload → create_headers(..., FLB_TRUE)) includes the same Content-Encoding header logic used in PutObject when ctx->compression is FLB_AWS_COMPRESS_GZIP or FLB_AWS_COMPRESS_ZSTD.
Signed-off-by: usharma <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I didn't notice macOS CI tasks are not suceeded yet. So, we need to resolve before merging.
This PR adds ZSTD compression Support to out_s3 plugin and aws_compression to be used by other plugins. As of now only gzip compression is supported
Enter
[N/A]
in the box, if an item is not applicable to your change.Testing
Before we can approve your change; please submit the following in a comment:
flb_log.txt
valgrind-report.txt
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test
label to test for all targets (requires maintainer to do).Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
Additional Note on some Load Testing
I am seeing some pretty great results with ZSTD as compared to GZIP. CPU drop by 60% and Memory by 80%.
Attaching some screenshots and Test details
Test Scenario
1 Pod, 5000 Logs per second for Tail Plugin with Each log line 2000 chars ie. load is 10 MBPS on 1 Fluentbit Pod.