
@shadowshot-x shadowshot-x commented Aug 28, 2025

This PR adds ZSTD compression support to the out_s3 plugin and to aws_compression, so that other plugins can use it. As of now, only gzip compression is supported.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
[SERVICE]
    flush        1
    daemon       Off
    log_level    debug
    http_server  Off
    http_listen  0.0.0.0
    http_port    2020

[INPUT]
        Name dummy
        Tag dummy.data
        Dummy {"this is":"dummy data"}
        Rate 1

[OUTPUT]
        Name                         s3
        Match                        dummy.data
        bucket                       log-router-dev-us-west-2
        region                       us-west-2
        total_file_size              60M
        retry_limit                  10
        upload_timeout               20s
        log_level                    debug
        net.keepalive_idle_timeout   60
        tls.verify                   False
        compression                  zstd
        use_put_object               True
        store_dir_limit_size         5G
        role_arn                     arn:aws:iam::654907767108:role/oil-publisher-role-irsa
        s3_key_format                /test/zstd3/%H%M%S-$UUID.zst
        workers                      1
  • Attached Valgrind output that shows no leaks or memory corruption was found

valgrind-report.txt

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • [N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features
    • Added Zstandard (zstd) compression option for S3 uploads; uploads set Content-Encoding: zstd when selected.
  • Refactor
    • Unified and hardened Content-Encoding handling for compressed uploads, ensuring consistent behavior for gzip and zstd and improved error handling.
  • Documentation
    • Updated configuration help to list zstd as a supported compression type.
  • Tests
    • Added tests covering zstd compression and truncated base64 decode paths.

Additional Note on some Load Testing

I am seeing some pretty great results with ZSTD as compared to GZIP: CPU usage drops by 60% and memory usage by 80%.

Attaching some screenshots and Test details

Test Scenario
1 pod, 5000 logs per second for the tail plugin, each log line 2000 characters long, i.e. a load of 10 MB/s on one Fluent Bit pod.

[Screenshots: two captures dated 2025-08-29, 5:26:14 PM and 5:26:20 PM]

coderabbitai bot commented Aug 28, 2025

Walkthrough

Adds Zstandard (zstd) as a compression option across AWS integrations: defines FLB_AWS_COMPRESS_ZSTD in the public header, registers zstd in the compression dispatcher, updates the S3 output to set Content-Encoding and pre-compression behavior for zstd, and extends tests to exercise zstd compression and truncated base64 decode paths.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Public API: Compression Types (include/fluent-bit/aws/flb_aws_compress.h) | Adds macro #define FLB_AWS_COMPRESS_ZSTD 4 to compression type definitions. |
| AWS Compression Core (src/aws/flb_aws_compress.c) | Includes flb_zstd.h and registers "zstd" as FLB_AWS_COMPRESS_ZSTD with handler flb_zstd_compress in the compression options table. |
| S3 Output Plugin (plugins/out_s3/s3.c) | Includes flb_zstd.h; replaces static Content-Encoding header with get_content_encoding_header(int); create_headers allocates Content-Encoding for gzip/zstd with NULL checks; pre-compression branch now treats zstd like gzip; updates config_map docs to include zstd. |
| Tests: AWS Compression (tests/internal/aws_compress.c) | Includes flb_zstd.h; adds test_compression_zstd() and test_b64_truncated_zstd() and a zstd decode wrapper; registers new tests to exercise zstd compress/uncompress and truncated base64 decode. |
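
The registration step summarized above can be pictured as one more row in a keyword-to-handler lookup table. The following is a minimal sketch under stated assumptions, not the code in src/aws/flb_aws_compress.c: every `example_` name, the struct layout, the handler signature, and the stub handler are hypothetical. Only the "zstd" keyword and the value 4 for the zstd type come from the walkthrough; the other type ids and the -1 return for unknown keywords are assumed.

```c
#include <stddef.h>
#include <string.h>

/* Type ids: 4 for zstd comes from the walkthrough; the others are assumed. */
#define EXAMPLE_COMPRESS_NONE 0
#define EXAMPLE_COMPRESS_GZIP 1
#define EXAMPLE_COMPRESS_ZSTD 4

/* Hypothetical handler signature: returns 0 on success, -1 on failure. */
typedef int (*example_compress_fn)(const void *in, size_t in_len,
                                   void **out, size_t *out_len);

/* Stub standing in for a real compressor such as flb_zstd_compress. */
int example_stub_compress(const void *in, size_t in_len,
                          void **out, size_t *out_len)
{
    (void) in;
    (void) in_len;
    *out = NULL;
    *out_len = 0;
    return -1; /* stub: performs no compression */
}

struct example_compress_option {
    int type;
    const char *keyword;
    example_compress_fn compress;
};

/* Adding a codec means adding one row; the NULL-keyword row ends the table. */
static const struct example_compress_option example_options[] = {
    { EXAMPLE_COMPRESS_GZIP, "gzip", example_stub_compress },
    { EXAMPLE_COMPRESS_ZSTD, "zstd", example_stub_compress },
    { 0, NULL, NULL }
};

/* Map a configured keyword to its type id, or -1 if it is not registered. */
int example_compress_get_type(const char *keyword)
{
    const struct example_compress_option *o;

    if (keyword == NULL) {
        return -1;
    }
    for (o = example_options; o->keyword != NULL; o++) {
        if (strcmp(o->keyword, keyword) == 0) {
            return o->type;
        }
    }
    return -1;
}
```

With a table like this, a plugin that reads `compression zstd` from its configuration resolves the keyword once at startup and does not need a codec-specific branch of its own.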

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client as Records
  participant AWSComp as flb_aws_compress
  participant ZSTD as flb_zstd
  participant S3 as out_s3 plugin
  participant S3API as Amazon S3

  Note over S3: config compression = zstd

  Client->>S3: send(records)
  S3->>AWSComp: compress(records, type=ZSTD)
  AWSComp->>ZSTD: flb_zstd_compress(data)
  ZSTD-->>AWSComp: compressed_data
  AWSComp-->>S3: compressed_data

  S3->>S3: get_content_encoding_header(ZSTD) -> "zstd"
  S3->>S3API: PutObject(body=compressed_data, Content-Encoding: zstd)
  S3API-->>S3: 200 OK

  rect rgba(230,245,255,0.5)
  Note over AWSComp,S3: New/changed flow: zstd compression + Content-Encoding selection
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • edsiper
  • koleini
  • fujimotos
  • PettitWesley
  • niedbalski

Poem

I nibbled code beneath the moonlit sod,
Zstd bits tucked cozy like a pod.
Headers hum "zstd" as uploads run,
Tests twitch whiskers — CI says "well done."
(\_/) — a happy rabbit hops, job done.


usharma added 4 commits August 29, 2025 10:42
This patch adds zstd based compression support to out_s3 plugin. This references the new aws compression support for zstd using the flb_zstd compression mechanism. It also adds a new content header for zstd compression

Signed-off-by: Ujjwal Sharma <[email protected]>
Signed-off-by: usharma <[email protected]>
zstd compression type added for the aws compression

Signed-off-by: Ujjwal Sharma <[email protected]>
Signed-off-by: usharma <[email protected]>
This patch adds zstd compression to the existing compression options for aws. It references the compression logic in flb_zstd

Signed-off-by: Ujjwal Sharma <[email protected]>
Signed-off-by: usharma <[email protected]>
zstd basic test cases added for compression detection and decoding validation

Signed-off-by: Ujjwal Sharma <[email protected]>
Signed-off-by: usharma <[email protected]>
usharma added 2 commits August 29, 2025 11:33
Implemented a struct switch to select the encoding header based on the compression method. This can be extended in the future. Previously, gzip was hardcoded.

Signed-off-by: usharma <[email protected]>
@shadowshot-x shadowshot-x marked this pull request as ready for review August 29, 2025 06:09
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/out_s3/s3.c (1)

1384-1402: Fix: header/body mismatch when compression fails in put_all_chunks

On compression failure, you upload uncompressed data but still set Content-Encoding based on ctx->compression, causing clients to misinterpret the object. Temporarily suppress the header when the upload is uncompressed.

@@
-    void *payload_buf = NULL;
+    void *payload_buf = NULL;
     size_t payload_size = 0;
@@
-            if (ret == -1) {
+            if (ret == -1) {
                 flb_plg_error(ctx->ins, "Failed to compress data, uploading uncompressed data instead to prevent data loss");
             } else {
                 flb_plg_info(ctx->ins, "Pre-compression chunk size is %zu, After compression, chunk is %zu bytes", buffer_size, payload_size);
                 flb_free(buffer);
 
                 buffer = (void *) payload_buf;
                 buffer_size = payload_size;
             }
         }
@@
-            ret = s3_put_object(ctx, (const char *)
-                                fsf->meta_buf,
-                                chunk->create_time, buffer, buffer_size);
+            /* Ensure headers reflect what is actually sent */
+            {
+                int saved_compression = ctx->compression;
+                if (saved_compression != FLB_AWS_COMPRESS_NONE && payload_buf == NULL) {
+                    /* compression failed above, sending uncompressed */
+                    ctx->compression = FLB_AWS_COMPRESS_NONE;
+                }
+                ret = s3_put_object(ctx, (const char *)
+                                    fsf->meta_buf,
+                                    chunk->create_time, buffer, buffer_size);
+                ctx->compression = saved_compression;
+            }

Note: Using a local override avoids a larger signature change to create_headers/s3_put_object while keeping behavior correct.
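
The save-and-restore override in the suggestion above amounts to deriving the header from what was actually sent rather than from what was configured. A minimal sketch of that decision as two small functions follows. The function names and the `EXAMPLE_` constants are hypothetical, and the convention that a compression attempt returns -1 on failure is taken from the `ret == -1` check in the diff; this is an illustration of the idea, not a proposed replacement for the patch.

```c
#include <stddef.h>

/* Type ids: 4 for zstd comes from the PR; the others are assumed. */
#define EXAMPLE_COMPRESS_NONE 0
#define EXAMPLE_COMPRESS_GZIP 1
#define EXAMPLE_COMPRESS_ZSTD 4

/*
 * Return the compression type that describes the bytes actually uploaded.
 * compress_ret is the result of the compression attempt; -1 means it failed
 * and the uncompressed buffer is uploaded instead.
 */
int example_effective_compression(int configured, int compress_ret)
{
    if (configured == EXAMPLE_COMPRESS_NONE || compress_ret == -1) {
        return EXAMPLE_COMPRESS_NONE;
    }
    return configured;
}

/* Content-Encoding value for a payload, or NULL when no header is sent. */
const char *example_content_encoding(int effective)
{
    switch (effective) {
    case EXAMPLE_COMPRESS_GZIP:
        return "gzip";
    case EXAMPLE_COMPRESS_ZSTD:
        return "zstd";
    default:
        return NULL;
    }
}
```

Passing the effective type down explicitly would avoid mutating shared context, at the cost of the larger signature change that the note above tries to avoid.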

Also applies to: 1398-1402

🧹 Nitpick comments (3)
tests/internal/aws_compress.c (1)

60-74: Make zstd compression test conditional; consider brittleness of golden output

  • Guard the whole test with FLB_HAVE_ZSTD.
  • Golden compressed bytes can differ across zstd versions. If flakes arise, switch to “compress then decompress equals input” instead of comparing a base64 golden.
-void test_compression_zstd()
+#ifdef FLB_HAVE_ZSTD
+void test_compression_zstd()
 {
     struct flb_aws_test_case cases[] =
     {
         {
             "zstd",
             "hello hello hello hello hello hello",
             "KLUv/SAjZQAAMGhlbGxvIAEAuUsR",
             0
         },
         { 0 }
     };
 
     flb_aws_compress_test_cases(cases);
 }
+#endif

If you want me to provide a non-golden variant of this test to avoid version drift, say the word and I'll draft it.
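
For illustration, a non-golden check asserts that decompressing the compressed bytes yields the original input, so it does not depend on the exact output of a given zstd version. The sketch below shows only the shape of such a test. So that it runs without libzstd, a toy run-length codec (the hypothetical `example_rle_*` functions) stands in for flb_zstd_compress and flb_zstd_uncompress, whose real signatures are not reproduced here.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy run-length encoder: emits (count, byte) pairs. Returns 0 or -1. */
int example_rle_compress(const unsigned char *in, size_t in_len,
                         unsigned char **out, size_t *out_len)
{
    size_t i = 0;
    size_t o = 0;
    unsigned char *buf = malloc(in_len * 2 + 1); /* worst case: no runs */

    if (buf == NULL) {
        return -1;
    }
    while (i < in_len) {
        size_t run = 1;
        while (i + run < in_len && in[i + run] == in[i] && run < 255) {
            run++;
        }
        buf[o++] = (unsigned char) run;
        buf[o++] = in[i];
        i += run;
    }
    *out = buf;
    *out_len = o;
    return 0;
}

/* Inverse of example_rle_compress. Returns 0 or -1 on malformed input. */
int example_rle_uncompress(const unsigned char *in, size_t in_len,
                           unsigned char **out, size_t *out_len)
{
    size_t i;
    size_t total = 0;
    size_t o = 0;
    unsigned char *buf;

    if (in_len % 2 != 0) {
        return -1; /* truncated input: pairs must be complete */
    }
    for (i = 0; i < in_len; i += 2) {
        total += in[i];
    }
    buf = malloc(total + 1);
    if (buf == NULL) {
        return -1;
    }
    for (i = 0; i < in_len; i += 2) {
        memset(buf + o, in[i + 1], in[i]);
        o += in[i];
    }
    *out = buf;
    *out_len = o;
    return 0;
}

/* Round-trip check: 1 if uncompress(compress(data)) equals data, else 0. */
int example_round_trip_ok(const char *data)
{
    unsigned char *packed = NULL;
    unsigned char *plain = NULL;
    size_t packed_len = 0;
    size_t plain_len = 0;
    size_t len = strlen(data);
    int ok = 0;

    if (example_rle_compress((const unsigned char *) data, len,
                             &packed, &packed_len) == 0) {
        if (example_rle_uncompress(packed, packed_len,
                                   &plain, &plain_len) == 0) {
            ok = (plain_len == len && memcmp(plain, data, len) == 0);
            free(plain);
        }
        free(packed);
    }
    return ok;
}
```

The trade-off is that a round-trip test no longer detects an unintended change in the compressed format itself, which the golden base64 comparison does.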

plugins/out_s3/s3.c (2)

88-113: Content-Encoding helper is solid; small improvement

Nice consolidation. Consider making the headers static const to place them in .rodata and prevent unintended writes.

-static struct flb_aws_header gzip_header = {
+static const struct flb_aws_header gzip_header = {
...
-static struct flb_aws_header zstd_header = {
+static const struct flb_aws_header zstd_header = {

4028-4033: Clarify build-time availability of zstd in config help

Users on builds without Zstd will see “unknown compression: zstd”. Add a note that zstd requires Zstandard support at build time; also suggest using a .zst suffix in s3_key_format.

-    "Compression type for S3 objects. 'gzip', 'arrow', 'parquet' and 'zstd' are the supported values. "
+    "Compression type for S3 objects. Supported: 'gzip', 'arrow', 'parquet', 'zstd'. "
+    "Note: 'zstd' is available only if Fluent Bit was built with Zstandard support. "
@@
-    "If 'zstd' is selected, the Content-Encoding HTTP Header will be set to 'zstd'."
+    "If 'zstd' is selected, the Content-Encoding HTTP Header will be set to 'zstd'. "
+    "When using 'zstd', consider adding a '.zst' suffix to s3_key_format for clarity."
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 7445e38 and 1ced9cf.

📒 Files selected for processing (4)
  • include/fluent-bit/aws/flb_aws_compress.h (1 hunks)
  • plugins/out_s3/s3.c (6 hunks)
  • src/aws/flb_aws_compress.c (2 hunks)
  • tests/internal/aws_compress.c (6 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/aws/flb_aws_compress.c (1)
src/flb_zstd.c (1)
  • flb_zstd_compress (30-54)
tests/internal/aws_compress.c (1)
src/flb_zstd.c (1)
  • flb_zstd_uncompress (130-163)
🔇 Additional comments (1)
include/fluent-bit/aws/flb_aws_compress.h (1)

28-28: ZSTD compression type addition looks good

No ABI/API concerns from adding FLB_AWS_COMPRESS_ZSTD 4.

usharma added 2 commits August 29, 2025 11:57
We need to free the s3 headers from memory in case compression is not present

Signed-off-by: usharma <[email protected]>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
plugins/out_s3/s3.c (1)

215-225: Early-return leak fix acknowledged

Freeing s3_headers on NULL encoding_header resolves the leak noted earlier.

🧹 Nitpick comments (3)
plugins/out_s3/s3.c (3)

87-111: Return a const header pointer

These headers are immutable; make the function return type const to reflect that and tighten the API.

-static struct flb_aws_header *get_content_encoding_header(int compression_type)
+static const struct flb_aws_header *get_content_encoding_header(int compression_type)

181-181: Make local pointer const to match new function signature

Minor const-correctness.

-    struct flb_aws_header *encoding_header = NULL;
+    const struct flb_aws_header *encoding_header = NULL;
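
Taken together, the two const-correctness nitpicks above describe read-only header objects that are handed out through a const pointer. The sketch below shows that pattern in isolation. `struct example_header` is a hypothetical stand-in (the real struct flb_aws_header may be laid out differently), and the type ids other than 4 for zstd are assumed.

```c
#include <stddef.h>

/* Type ids: 4 for zstd comes from the PR; the others are assumed. */
#define EXAMPLE_COMPRESS_NONE 0
#define EXAMPLE_COMPRESS_GZIP 1
#define EXAMPLE_COMPRESS_ZSTD 4

/* Hypothetical header type; the real struct flb_aws_header may differ. */
struct example_header {
    const char *key;
    size_t key_len;
    const char *val;
    size_t val_len;
};

/* static const objects are read-only, so no caller can modify them. */
static const struct example_header example_gzip_header = {
    "Content-Encoding", sizeof("Content-Encoding") - 1,
    "gzip", sizeof("gzip") - 1
};

static const struct example_header example_zstd_header = {
    "Content-Encoding", sizeof("Content-Encoding") - 1,
    "zstd", sizeof("zstd") - 1
};

/* Return the matching header, or NULL when no Content-Encoding applies. */
const struct example_header *example_get_encoding_header(int compression_type)
{
    switch (compression_type) {
    case EXAMPLE_COMPRESS_GZIP:
        return &example_gzip_header;
    case EXAMPLE_COMPRESS_ZSTD:
        return &example_zstd_header;
    default:
        return NULL;
    }
}
```

A caller would hold the result in a `const struct example_header *` local and copy its fields into its own header array. An attempt to write through that pointer then fails at compile time.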

1205-1210: Rename preCompress_size for consistency

Use snake_case to match the file’s predominant style.

-    size_t preCompress_size = 0;
+    size_t pre_compress_size = 0;
@@
-            preCompress_size = body_size;
+            pre_compress_size = body_size;
@@
-                flb_plg_info(ctx->ins, "Pre-compression upload_chunk_size= %zu, After compression, chunk is only %zu bytes, "
-                                       "the chunk was too small, using PutObject to upload", preCompress_size, body_size);
+                flb_plg_info(ctx->ins, "Pre-compression upload_chunk_size= %zu, After compression, chunk is only %zu bytes, "
+                                       "the chunk was too small, using PutObject to upload", pre_compress_size, body_size);

📥 Commits

Reviewing files that changed from the base of the PR and between 9024ddd and 8461581.

📒 Files selected for processing (1)
  • plugins/out_s3/s3.c (5 hunks)
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.222Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components such as ARROW/PARQUET (which use `#ifdef FLB_HAVE_ARROW` guards), ZSTD support is always available and doesn't need build-time conditionals. ZSTD headers are included directly without guards across multiple plugins and core components.
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: tests/internal/aws_compress.c:93-107
Timestamp: 2025-08-29T06:25:27.222Z
Learning: In Fluent Bit, ZSTD compression is enabled by default and is treated as a core dependency, not requiring conditional compilation guards like `#ifdef FLB_HAVE_ZSTD`. Unlike some other optional components, ZSTD support is always available and doesn't need build-time conditionals.
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: tests/internal/aws_compress.c:39-42
Timestamp: 2025-08-29T06:24:26.125Z
Learning: In Fluent Bit, ZSTD compression support is enabled by default and does not require conditional compilation guards (like #ifdef FLB_HAVE_ZSTD) around ZSTD-related code declarations and implementations.
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: src/aws/flb_aws_compress.c:52-56
Timestamp: 2025-08-29T06:24:55.819Z
Learning: ZSTD compression is always available in Fluent Bit and does not require conditional compilation guards. Unlike Arrow/Parquet which use #ifdef FLB_HAVE_ARROW guards, ZSTD is built unconditionally with flb_zstd.c included directly in src/CMakeLists.txt and a bundled ZSTD library at lib/zstd-1.5.7/.
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: tests/internal/aws_compress.c:7-7
Timestamp: 2025-08-29T06:25:02.540Z
Learning: In Fluent Bit, ZSTD (zstandard) compression library is bundled directly in the source tree at `lib/zstd-1.5.7` and is built unconditionally as a static library. Unlike optional external dependencies, ZSTD does not use conditional compilation guards like `FLB_HAVE_ZSTD` and is always available. Headers like `<fluent-bit/flb_zstd.h>` can be included directly without guards.
Learnt from: shadowshot-x
PR: fluent/fluent-bit#10794
File: src/aws/flb_aws_compress.c:26-26
Timestamp: 2025-08-29T06:24:44.771Z
Learning: In Fluent Bit, ZSTD support is always available and enabled by default. The build system automatically detects and uses either the system libzstd library or builds the bundled ZSTD version. Unlike other optional dependencies like Arrow which use conditional compilation guards (e.g., FLB_HAVE_ARROW), ZSTD does not require conditional includes or build flags.
📚 Applied learnings
Five of the learnings listed above (all except the one recorded for src/aws/flb_aws_compress.c:26-26) were applied to files:

  • plugins/out_s3/s3.c
🧬 Code graph analysis (1)
plugins/out_s3/s3.c (1)
include/fluent-bit/flb_mem.h (1)
  • flb_free (126-128)
🔇 Additional comments (3)
plugins/out_s3/s3.c (3)

185-187: Header count condition updated for zstd — looks good


4028-4033: Config help text correctly documents zstd addition — looks good


87-111: Verify CreateMultipartUpload sets Content-Encoding for gzip/zstd

Ensure the multipart upload path (create_multipart_upload → create_headers(..., FLB_TRUE)) includes the same Content-Encoding header logic used in PutObject when ctx->compression is FLB_AWS_COMPRESS_GZIP or FLB_AWS_COMPRESS_ZSTD.

Contributor

@cosmo0920 cosmo0920 left a comment

Oh, I didn't notice that the macOS CI tasks have not succeeded yet. We need to resolve that before merging.
