Hi,
Currently, when we define DiskCapacity and call /rebalance?rebalance_disk=true, Cruise Control (CC) takes all of the defined disks into account.
But it's always possible that a disk gets corrupted over time and is excluded from the Kafka log dirs.
In clusters where all the configs are managed as Infrastructure as Code, we usually handle these exclusions in the environment and not in code. So in code we don't have direct visibility into which disks are excluded.
So we have a cluster where DiskCapacity is defined in code for all disks, but in reality some disks are excluded. Now when we try to rebalance disks, CC returns an error saying that, e.g., disk1 exists in DiskCapacity but not in the Kafka logdirs, and thus we can't rebalance the disks.
There are workarounds for this, like writing our own disk provisioner, or a script that detects the change in the environment, removes that disk from the CC capacity configs, and restarts CC.
But I believe this can be handled very easily by ignoring disks that are not in the Kafka logdirs (or maybe just issuing a warning). This way, we can define a brokerId=-1 capacity to reflect all brokers' capacity (like we do), and when a cluster has some disks excluded from the normal/default state, CC ignores them as well.
In other words, CC should only take into account the DiskCapacities that are in Kafka's logdirs, and ignore any that are not. Then we define a default capacity and everything works as expected.
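To make the request concrete, here is a minimal sketch of the filtering I have in mind. It is written in Python only for illustration and is not CC's actual code; the function name `usable_disk_capacities`, the disk paths, and the capacity values are all hypothetical.

```python
import logging
from typing import Dict, Iterable, List, Tuple

logger = logging.getLogger("capacity_sketch")


def usable_disk_capacities(
    configured: Dict[str, float],
    live_log_dirs: Iterable[str],
) -> Tuple[Dict[str, float], List[str]]:
    """Keep only configured disks that the broker currently reports as logdirs.

    `configured` maps a logdir path to its configured capacity (hypothetical
    shape). Disks missing from `live_log_dirs` are skipped with a warning
    instead of failing the whole rebalance.
    """
    live = set(live_log_dirs)
    usable = {path: cap for path, cap in configured.items() if path in live}
    ignored = sorted(path for path in configured if path not in live)
    for path in ignored:
        logger.warning(
            "Disk %s is in the capacity config but not in the broker's "
            "logdirs; ignoring it.",
            path,
        )
    return usable, ignored


# Hypothetical default (brokerId=-1) capacity shared by all brokers.
default_capacity = {
    "/data/disk1": 500000.0,
    "/data/disk2": 500000.0,
    "/data/disk3": 500000.0,
}
# Hypothetical broker where disk1 was excluded after it got corrupted.
usable, ignored = usable_disk_capacities(
    default_capacity, ["/data/disk2", "/data/disk3"]
)
```

With this behavior, the same default capacity can be used for every broker, and a broker with an excluded disk would be rebalanced across its remaining disks only.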
Thanks.