Ignore Disk Capacity when disk is not in Kafka's LogDir #2235

Open
imans777 opened this issue Jan 4, 2025 · 0 comments
imans777 commented Jan 4, 2025

Hi,
Currently, when DiskCapacity is defined and we call /rebalance?rebalance_disk=true, Cruise Control (CC) takes all of the configured disks into account.
But it's always possible that a disk gets corrupted over time and is excluded from Kafka's log dirs.
In clusters where all the configs are managed as infrastructure as code, we usually handle such exclusions in the environment rather than in code, so the code has no direct visibility into which disks are excluded.
So we have a cluster where the code defines DiskCapacity for all disks, but in reality some disks are excluded. When we try to rebalance disks, CC returns an error saying that, e.g., disk1 exists in DiskCapacity but not in Kafka's log dirs, and so we can't rebalance the disks.
There are workarounds for this, such as writing our own disk provisioner, or a script that detects the change in the environment, removes that disk from CC's capacity config, and restarts CC.
But I believe this could be handled very easily by ignoring disks that are not in Kafka's log dirs (or maybe just issuing a warning). This way, we can define a brokerId=-1 capacity to reflect all brokers' capacity (as we do now), and when some disks in a cluster are excluded from the normal/default state, CC simply ignores them.
In other words, CC should take into account only those DiskCapacity entries that are in Kafka's log dirs and ignore the rest. Then we can define a default capacity and everything works as expected.
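To make the request concrete, here is a minimal sketch of the behavior I have in mind. It is not CC's actual code: the function name `effective_disk_capacity`, the example paths, and the capacity values are all made up for illustration, and the broker's live log dirs are assumed to be already known.

```python
import logging

logger = logging.getLogger("disk_capacity_filter")


def effective_disk_capacity(configured, live_logdirs):
    """Keep only the configured disks that the broker reports as log dirs.

    configured:   dict mapping log dir path -> configured capacity
    live_logdirs: iterable of log dir paths the broker currently uses
    (Both names are hypothetical, chosen for this illustration.)
    """
    live = set(live_logdirs)
    kept = {}
    for logdir, capacity in configured.items():
        if logdir in live:
            kept[logdir] = capacity
        else:
            # Proposed behavior: warn and skip instead of failing the rebalance.
            logger.warning(
                "Ignoring capacity for %s: not in the broker's log dirs", logdir
            )
    return kept


# Default (brokerId=-1 style) capacity covering every disk; values are made up.
default_capacity = {
    "/data/disk1": 100000.0,
    "/data/disk2": 100000.0,
    "/data/disk3": 100000.0,
}

# disk1 has been excluded from this broker's log dirs in the environment.
broker_logdirs = ["/data/disk2", "/data/disk3"]

print(effective_disk_capacity(default_capacity, broker_logdirs))
```

With a filter like this applied per broker, one default capacity definition would still work for brokers that have lost a disk, and the warning keeps the mismatch visible.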
Thanks.
