Description
Describe the issue
On a default installation of the fluent-operator with Fluent Bit enabled, the kubernetes ClusterFilter CR for fluentbit.fluent.io contains the following default filter rules to remove log fields:
- modify:
    rules:
    - remove: stream
    - remove: kubernetes_pod_id
    - remove: kubernetes_host
    - remove: kubernetes_container_hash
All was well, but I decided that I also wanted to hide the kubernetes_pod_ip and kubernetes_docker_id fields from OpenSearch indices, so I applied a JSON patch to the CR:
- modify:
    rules:
    - remove: stream
    - remove: kubernetes_host
    - remove: kubernetes_pod_ip
    - remove: kubernetes_pod_id
    - remove: kubernetes_docker_id
    - remove: kubernetes_container_hash
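For anyone trying to reproduce this, a JSON patch producing the rule list above might look roughly like the sketch below. It is not the exact patch I applied: the path `/spec/filters/2/modify/rules` is a hypothetical location for the modify filter inside the ClusterFilter spec, and replacing the whole list (rather than adding individual entries) is just one way to do it.

```python
import json

# Fields to remove, in the order shown in the patched CR above.
REMOVE_FIELDS = [
    "stream",
    "kubernetes_host",
    "kubernetes_pod_ip",
    "kubernetes_pod_id",
    "kubernetes_docker_id",
    "kubernetes_container_hash",
]

# Hypothetical location of the modify filter inside the ClusterFilter spec;
# check the index against the actual CR before using it.
RULES_PATH = "/spec/filters/2/modify/rules"


def build_json_patch(fields, path=RULES_PATH):
    """Return an RFC 6902 patch that replaces the whole rules list."""
    rules = [{"remove": name} for name in fields]
    return [{"op": "replace", "path": path, "value": rules}]


patch = build_json_patch(REMOVE_FIELDS)
print(json.dumps(patch, indent=2))
```

The printed document is the kind of payload that `kubectl patch --type=json` accepts, assuming the path matches the CR in the cluster.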
Applying the patch triggered all the fluent-bit pods to reload their config, as designed:
[ info] [input:tail:tail.1] inotify_fs_remove(): inode=1326224 watch_fd=68
[ info] [input:tail:tail.1] inotify_fs_remove(): inode=1622065 watch_fd=69
[ info] [reload] start everything
[ info] [fluent bit] version=3.2.5, commit=69ab1c11a1, pid=12
[ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[ info] [simd ] disabled
[ info] [cmetrics] version=0.9.9
[ info] [ctraces ] version=0.5.7
[ info] [input:systemd:systemd.0] initializing
[ info] [input:systemd:systemd.0] storage_strategy='memory' (memory only)
[ warn] [input:systemd:systemd.0] seek_cursor failed
[ info] [input:tail:tail.1] initializing
[ info] [input:tail:tail.1] storage_strategy='memory' (memory only)
[ info] [input:tail:tail.1] db: delete unmonitored stale inodes from the database: count=0
[ info] [filter:kubernetes:kubernetes.1] https=1 host=kubernetes.default.svc port=443
[ info] [filter:kubernetes:kubernetes.1] token updated
[ info] [filter:kubernetes:kubernetes.1] local POD info OK
[ info] [filter:kubernetes:kubernetes.1] testing connectivity with API server...
[ info] [filter:kubernetes:kubernetes.1] connectivity OK
[ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
[ info] [sp] stream processor started
Almost immediately after that, searches in OpenSearch yielded records mostly devoid of those two extra fields.
I waited for 5 minutes or so, and with many services in many namespaces generating logs, the only logs that still contained kubernetes_pod_ip and kubernetes_docker_id came from the kube-system namespace. For example:
kube-system rke2-canal-spzf5 calico-node 192.168.0.171 0f5ca94a837bf57a5b4f1432efaa0f7e7ab6b36997520f7395dd9c1c2ee38c12
kube-system kube-controller-manager-k8s1 kube-controller-manager 192.168.0.171 321fa45d4a639ee41acac49cdf411217759b81e9d3611ed28ba7206c8e4320c0
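To check which namespaces are still leaking the fields, something like the following sketch could be run over records exported from OpenSearch. The key `kubernetes_namespace_name` and the second sample record are assumptions for illustration; adjust them to whatever the index actually contains.

```python
from collections import Counter

# The two fields that should no longer appear after the reload.
UNWANTED = ("kubernetes_pod_ip", "kubernetes_docker_id")


def leftover_by_namespace(records, unwanted=UNWANTED,
                          ns_key="kubernetes_namespace_name"):
    """Count records per namespace that still carry any unwanted field."""
    counts = Counter()
    for record in records:
        if any(field in record for field in unwanted):
            counts[record.get(ns_key, "<unknown>")] += 1
    return counts


# Sample records: the first mirrors the kube-system example above,
# the second is a hypothetical record that was filtered correctly.
sample = [
    {
        "kubernetes_namespace_name": "kube-system",
        "kubernetes_pod_name": "rke2-canal-spzf5",
        "kubernetes_pod_ip": "192.168.0.171",
    },
    {
        "kubernetes_namespace_name": "default",
        "kubernetes_pod_name": "demo-app",
    },
]

print(leftover_by_namespace(sample))
```

If the soft reload had applied everywhere, the resulting counter would be empty.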
I waited another few minutes, and still only those select logs contained the unwanted fields. Then I killed all the fluent-bit DaemonSet pods to force a true restart. Within a few seconds, no logs contained those two fields, which is what I had expected the reload to achieve on its own.
Is there some kind of configuration caching going on that prevents a soft reload of Fluent Bit from applying the desired configuration changes?
To Reproduce
The described behavior is reproducible. I deleted the FluentBit CR, ran helm uninstall on the fluent-operator chart, then repeated the install and the JSON patch to the ClusterFilter CR as described above.
Expected behavior
All filter rule changes are applied atomically, give or take a few seconds.
Your Environment
- Fluent Operator version: 3.3.0
- Container Runtime: containerd
- Operating system: Ubuntu 24.04
- Kernel version: 6.8.0-55
How did you install fluent operator?
Helm install.
Additional context
No response