config: multiline: in_tail: filter_multiline: Add configurable buffer limit for multiline interface #10653
base: master
578f46a to 3a8abd9
Walkthrough
Adds a configurable multiline buffer limit and binary-size parser; propagates a FLB_MULTILINE_TRUNCATED status through multiline processing, enforces per-group buffer limits on append, records truncation via metrics/logs (filter and tail), extends multiline parser creation with a params API, and adds tests for truncation and binary-size parsing.
Sequence Diagram(s)
sequenceDiagram
participant Config
participant Tail as TailInput
participant ML as MultilineCore
participant Group as StreamGroup
participant Filter
participant Metrics
Config->>ML: init buffer_limit (string -> bytes)
Tail->>ML: flb_ml_append_text/append_object(data)
ML->>Group: flb_ml_group_cat(data,len)
alt appended fully
Group-->>ML: FLB_MULTILINE_OK
ML->>Filter: emit/process event
Filter->>Metrics: inc emitted metric
else truncated or partially appended
Group-->>ML: FLB_MULTILINE_TRUNCATED
ML->>Tail: log warning
ML->>Metrics: inc truncated metric
ML->>Group: mark truncated flag
end
ML->>Group: flush -> include "multiline_truncated": true if set
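The append path in the diagram can be sketched as a bounded concatenation. This is a minimal illustration under stated assumptions, not the code from this PR: `struct group_buf`, `group_init`, `group_cat`, the fixed `GROUP_CAPACITY`, and the `ML_*` constants are hypothetical names, and copying the part that still fits before reporting truncation is an assumed policy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes mirroring the OK/TRUNCATED split in the diagram. */
#define ML_OK        0
#define ML_TRUNCATED 2

/* Hypothetical fixed storage size for the sketch. */
#define GROUP_CAPACITY 256

/* Hypothetical per-group buffer with a configurable limit. */
struct group_buf {
    char   data[GROUP_CAPACITY];
    size_t len;        /* bytes currently stored */
    size_t limit;      /* maximum bytes allowed in this group */
    int    truncated;  /* set once any append had to drop bytes */
};

/* A limit of 0, or one above the storage size, falls back to the capacity. */
static void group_init(struct group_buf *g, size_t limit)
{
    assert(g != NULL);
    memset(g, 0, sizeof(*g));
    g->limit = (limit == 0 || limit > GROUP_CAPACITY) ? GROUP_CAPACITY : limit;
}

/* Append up to the limit; report truncation when bytes were dropped. */
static int group_cat(struct group_buf *g, const char *src, size_t len)
{
    size_t room;
    size_t n;

    assert(g != NULL && src != NULL);

    room = g->limit - g->len;
    n = (len < room) ? len : room;

    memcpy(g->data + g->len, src, n);
    g->len += n;

    if (n < len) {
        g->truncated = 1;
        return ML_TRUNCATED;
    }
    return ML_OK;
}
```

A caller would check the return value of each append and, on `ML_TRUNCATED`, log a warning and bump a counter, which is the role the diagram gives to the tail input and the metrics.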
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
3a8abd9 to dec20f1
/* Return codes */
#define FLB_MULTILINE_OK        0
#define FLB_MULTILINE_PROCESSED 1 /* Reserved */
#define FLB_MULTILINE_TRUNCATED 2
Status code 2 is needed here because status code 1 would collide with the FLB_TRUE status.
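A small sketch of the collision described above. It assumes FLB_TRUE and FLB_FALSE are defined as 1 and 0; the helper `collides_with_flb_true` is hypothetical and exists only for illustration.

```c
#include <assert.h>

/* Assumed values of the boolean macros (1 and 0). */
#define FLB_FALSE 0
#define FLB_TRUE  1

/* Return codes as quoted in the diff above. */
#define FLB_MULTILINE_OK        0
#define FLB_MULTILINE_PROCESSED 1 /* Reserved */
#define FLB_MULTILINE_TRUNCATED 2

/*
 * Compile-time guard: a caller that still compares the return value
 * against FLB_TRUE must not mistake a truncation for plain success.
 */
static_assert(FLB_MULTILINE_TRUNCATED != FLB_TRUE,
              "truncated status must be distinguishable from FLB_TRUE");

/* Hypothetical helper: is this status ambiguous with boolean true? */
static int collides_with_flb_true(int status)
{
    return status == FLB_TRUE;
}
```

With these values, a truncation status of 1 would be indistinguishable from FLB_TRUE for any caller that treats the return value as a boolean, while 2 remains distinct.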
Signed-off-by: Hiroshi Hatake <[email protected]>
Signed-off-by: Hiroshi Hatake <[email protected]>
Actionable comments posted: 2
♻️ Duplicate comments (1)
src/multiline/flb_ml_parser.c (1)
124-128: Deferred registry linkage is correct. Addresses earlier feedback about adding to the list only after successful init.
🧹 Nitpick comments (2)
tests/internal/multiline.c (1)
114-115: Updated container_mix expectations: add a brief note for future readers. "bbccdd-out\n" spans stdout-only concatenation across multiple records; a short comment here will avoid confusion about why dd-out is appended to the earlier stdout chunk while stderr pieces are separated.
src/multiline/flb_ml_parser.c (1)
31-45: Sane defaults helper LGTM. Small nit: consider making the name member const char * in the params struct to avoid the cast.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- src/multiline/flb_ml_parser.c (3 hunks)
- tests/internal/multiline.c (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/internal/multiline.c (6)
- src/flb_config.c (2): flb_config_init (216-421), flb_config_exit (423-594)
- src/flb_parser.c (1): flb_parser_get (1022-1042)
- src/multiline/flb_ml.c (3): flb_ml_create (868-920), flb_ml_append_text (664-754), flb_ml_destroy (981-1006)
- src/multiline/flb_ml_parser.c (4): flb_ml_parser_params_default (32-44), flb_ml_parser_create_params (47-128), flb_ml_parser_init (130-140), flb_ml_parser_instance_create (260-311)
- src/multiline/flb_ml_rule.c (1): flb_ml_rule_create (48-115)
- src/multiline/flb_ml_stream.c (1): flb_ml_stream_create (223-276)
src/multiline/flb_ml_parser.c (2)
- include/fluent-bit/flb_mem.h (2): flb_calloc (84-96), flb_free (126-128)
- src/flb_sds.c (2): flb_sds_create (78-90), flb_sds_destroy (389-399)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
- GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
- GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-centos-7
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
- GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
- GitHub Check: PR - fuzzing test
🔇 Additional comments (9)
tests/internal/multiline.c (7)
395-398: Defensive early-return in flush callback is correct. Prevents NULL deref when tests pass no expected-result context.
1481-1484: OK to set buffer limit via string literal. This exercises the binary-size parser path; 80 bytes is a good minimal boundary.
1493-1503: Params-based initializer usage looks good. Using key_content="log" with docker parser context matches the JSON-extraction intent.
1517-1519: Passing NULL to flush callback now safe after earlier guard. Matches the intent to test return codes only.
1523-1531: Return-code assertions cover both OK and TRUNCATED paths. Nice, this verifies the new truncation code path end-to-end.
1548-1548: Good addition to TEST_LIST. Keeps the new path exercised in CI.
1533-1535: Verify that flb_config_exit() invokes flb_ml_exit(config) to clean up the parser registry. Run:
rg -nP 'flb_ml_exit' -C3 src/flb_config.c
src/multiline/flb_ml_parser.c (2)
23-30: Includes are appropriate for the new params API.
198-223: Thin wrapper to params keeps legacy API intact. Straightforward pass-through without behavior changes.
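The 1481-1484 comment above mentions that the test sets the buffer limit from a string literal, which goes through a binary-size parser. The following is a rough sketch of what such a string-to-bytes conversion can look like. `parse_binary_size` is a hypothetical function, not the parser added in this PR, and the accepted suffixes and the 1024-based multipliers are assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert strings such as "80", "2K", "4kb" or
 * "1MiB" into a byte count. Returns -1 on malformed input or overflow.
 * Assumption: K/M/G are powers of 1024.
 */
static int64_t parse_binary_size(const char *s)
{
    char *end;
    long long value;
    int64_t mult;

    assert(s != NULL);

    errno = 0;
    value = strtoll(s, &end, 10);
    if (end == s || errno == ERANGE || value < 0) {
        return -1;
    }

    switch (tolower((unsigned char) *end)) {
    case '\0':
        return (int64_t) value;   /* plain byte count, no suffix */
    case 'k':
        mult = 1024LL;
        break;
    case 'm':
        mult = 1024LL * 1024;
        break;
    case 'g':
        mult = 1024LL * 1024 * 1024;
        break;
    default:
        return -1;                /* unknown unit */
    }
    end++;

    /* Accept an optional "i" and an optional "b" after the unit letter. */
    if (tolower((unsigned char) *end) == 'i') {
        end++;
    }
    if (tolower((unsigned char) *end) == 'b') {
        end++;
    }
    if (*end != '\0') {
        return -1;                /* trailing garbage */
    }

    if (value > INT64_MAX / mult) {
        return -1;                /* multiplication would overflow */
    }
    return (int64_t) value * mult;
}
```

Under these assumptions, a limit written as "80" resolves to 80 bytes, which corresponds to the boundary the test relies on.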
Signed-off-by: Hiroshi Hatake <[email protected]>
…n parser Signed-off-by: Hiroshi Hatake <[email protected]>
5278995 to 791d9e4
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/internal/multiline.c (2)
395-398: Move the NULL-guard to the top of flush_callback to avoid noisy prints. Short-circuit before printing/decoding when tests pass NULL to focus only on status. This reduces log noise and cycles.
Apply this diff within the changed hunk to remove the late guard:
    -    if (!res) {
    -        return 0;
    -    }
And add the guard right after the local is set (outside the changed hunk), e.g.:
    static int flush_callback(struct flb_ml_parser *parser,
                              struct flb_ml_stream *mst,
                              void *data, char *buf_data, size_t buf_size)
    {
        struct expected_result *res = data;

        if (!res) {
            return 0;
        }
        /* ... existing prints and validation ... */
    }
1463-1532: Solid truncation status test; tighten a couple of details
- Remove unused variable to keep warnings clean.
- Optional: either adjust the comment to reflect that this test concatenates raw text (JSON strings), or switch to object ingestion if you want to assert that the limit applies specifically to key_content="log".
- Optional: assert the resolved buffer limit to catch config parsing regressions.
Minimal diffs:
- Drop the unused declaration.
    -    struct flb_parser *p;
- Keep the comment accurate (if staying with text ingestion):
    -    /*
    -     * A realistic Docker log where the content of the "log" field will be
    -     * concatenated, and that concatenated buffer is what should be truncated.
    -     */
    +    /*
    +     * We append JSON-formatted text lines; the test exercises truncation on the
    +     * concatenated text buffer (not parsing the "log" field here).
    +     */
- Assert the parsed limit (right after flb_ml_create):
         ml = flb_ml_create(config, "limit-test");
         TEST_CHECK(ml != NULL);
    +    TEST_CHECK(ml->buffer_limit == 80);
Option B (if you prefer key_content-aware validation): pack line1/line2 into msgpack maps {"log": ..., "stream": ...} and use flb_ml_append_object() instead of flb_ml_append_text(). I can provide a compact patch if you want to go that route.
📒 Files selected for processing (1)
- tests/internal/multiline.c (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/internal/multiline.c (5)
- src/flb_config.c (2): flb_config_init (216-421), flb_config_exit (423-594)
- src/multiline/flb_ml.c (3): flb_ml_create (868-920), flb_ml_append_text (664-754), flb_ml_destroy (981-1006)
- src/multiline/flb_ml_parser.c (4): flb_ml_parser_params_default (32-44), flb_ml_parser_create_params (47-129), flb_ml_parser_init (131-141), flb_ml_parser_instance_create (261-312)
- src/multiline/flb_ml_rule.c (1): flb_ml_rule_create (48-115)
- src/multiline/flb_ml_stream.c (1): flb_ml_stream_create (223-276)
🔇 Additional comments (2)
tests/internal/multiline.c (2)
114-115: Updated expectations for interleaved container streams look correct. The combined stdout record now accumulating "bbcc" with the later "dd-out" and the stderr record as "dd-err" aligns with per-stream multiline state surviving interleaved CRI entries. No action needed.
1545-1545: Test registration LGTM. New test is properly added to TEST_LIST.
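This PR adds an interface for a configurable buffer limit for multiline processing.
It also makes multiline concatenation more robust: when the limit is reached, the record is truncated and the truncation is reported instead of the buffer growing without bound.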
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
New Features
Bug Fixes
Tests