Add support for Kinesis Compression #17062
Looking forward to the PR. This will be a very useful capability.
funguy-tech pushed a commit to funguy-tech/druid that referenced this issue (Sep 17, 2024)
Description
Placeholder feature request for an upcoming PR.
This proposal is to bring support for compression formats already implemented in Druid's code base (zstd, gzip, etc.) to Kinesis streams.
Compression would be exposed via an optional configuration parameter in the Kinesis ioConfig, `compressionFormat`, which, when set, causes records to be decompressed at the point of record collection.
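As a sketch of how this could look in a Kinesis supervisor spec (the `compressionFormat` property and its accepted values are a proposal, not existing Druid configuration):

```json
{
  "type": "kinesis",
  "ioConfig": {
    "stream": "my-compressed-stream",
    "inputFormat": { "type": "json" },
    "compressionFormat": "zstd"
  }
}
```

When the parameter is omitted, behavior would be unchanged, so existing ingestion specs continue to work as-is.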
Motivation
Unlike Kafka, Kinesis offers little compression support out of the box. Because of this, it is a common pattern for Kinesis customers to compress and decompress their own data across the wire.
Given that Druid already implements several popular compression formats internally (zstd, gzip, etc.), it would be useful for high-throughput customers to be able to send compressed data over the wire and have Druid decompress it on ingestion.
Our team, which operates a fleet of enterprise Druid clusters at petabyte scale, has seen Kinesis cost reductions of 50-80% by running a custom build of Druid with Kinesis decompression, with little to no discernible impact on ingestion overhead.
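The decompression step itself is cheap for formats the JDK already ships. As a minimal sketch (class and method names are hypothetical, not part of Druid), this is roughly what decompressing a gzip-compressed Kinesis record payload at collection time could look like:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class KinesisDecompressExample
{
  // Hypothetical helper: decompress a gzip-compressed record payload.
  // A real implementation would dispatch on the configured compressionFormat.
  static byte[] gunzip(byte[] compressed) throws IOException
  {
    try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed));
         ByteArrayOutputStream out = new ByteArrayOutputStream()) {
      byte[] buf = new byte[4096];
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
      return out.toByteArray();
    }
  }

  public static void main(String[] args) throws IOException
  {
    // Simulate a producer compressing a record before putting it on the stream.
    byte[] original = "{\"ts\":\"2024-01-01\",\"metric\":1}".getBytes(StandardCharsets.UTF_8);
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
      gz.write(original);
    }

    // Consumer side: decompress at the point of record collection.
    byte[] roundTripped = gunzip(bos.toByteArray());
    System.out.println(new String(roundTripped, StandardCharsets.UTF_8));
    // prints the original JSON record back
  }
}
```

A zstd variant would look the same but use a library such as zstd-jni, which Druid already depends on for segment compression.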
PR forthcoming in a few days, but I wanted to open this feature request for community discussion.