Removing scheduling plan #2479
Conversation
Force-pushed from 003245a to 8c01d3d
Force-pushed from 10ad92e to 4d5c9ca
Thanks for removing the `SchedulingPlan` @muhamadazmy. The changes look good to me :-) I left a few minor comments. +1 for merging after resolving them.
crates/core/src/partitions.rs
  Some(cmd) = self.receiver.recv() => {
      match cmd {
          Command::SyncRoutingInformation => {
-             self.spawn_sync_routing_information_task();
+             self.spawn_sync_routing_information_task(*partition_table_watch.borrow());
          }
      }
  }
- _ = update_interval.tick() => {
+ _ = partition_table_watch.changed() => {
      trace!("Refreshing routing information...");
-     self.spawn_sync_routing_information_task();
+     self.spawn_sync_routing_information_task(*partition_table_watch.borrow());
If `partition_table_watch` tells us that a given partition table version is present, I think we can just call `Metadata::current(|m| m.partition_table_ref())` to get access to it. Therefore I think that spawning an additional task shouldn't be necessary.
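For illustration, the select arm could then read the table in place, roughly like this (a sketch only; `update_routing_information` is a hypothetical helper name, and the exact `Metadata` accessor may differ):

```rust
_ = partition_table_watch.changed() => {
    trace!("Refreshing routing information...");
    // The watch firing means the local metadata already holds the new version,
    // so it can be read directly instead of spawning a sync task.
    let partition_table = Metadata::current(|m| m.partition_table_ref());
    // Hypothetical helper standing in for the routing-information update.
    self.update_routing_information(&partition_table);
}
```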
I totally missed that. In my head I thought we still had to wait for the version to sync, not that the metadata is already up to date. Thank you for pointing this out.
  metadata_store_client: MetadataStoreClient,
  metadata_writer: MetadataWriter,
The `metadata_writer` should give you access to the `metadata_store_client`. So we could remove the `metadata_store_client`.
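A sketch of the shape this would leave behind, assuming `MetadataWriter` exposes the store client through some accessor (`Service` is a placeholder for the struct above, and the accessor name is a guess, not the actual API):

```rust
// Hypothetical sketch: hold only the MetadataWriter and reach the metadata
// store client through it when needed.
struct Service {
    metadata_writer: MetadataWriter,
}

impl Service {
    fn metadata_store_client(&self) -> &MetadataStoreClient {
        // Assumed accessor; the real method name and return type may differ.
        self.metadata_writer.metadata_store_client()
    }
}
```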
  },
  Ok(_) => {}
  Err(WriteError::FailedPrecondition(msg)) => {
      debug!("Partition table update failed due to: {msg}");
Should we try to sync the `PartitionTable` here in order to not wait for the regular update interval to happen? We do know that there must be a newer version available because of the `FailedPrecondition`.
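Sketched out, the error arm might then look something like this (`request_partition_table_sync` is a placeholder for whichever sync mechanism is available, not an existing method):

```rust
Err(WriteError::FailedPrecondition(msg)) => {
    debug!("Partition table update failed due to: {msg}");
    // The failed precondition implies a newer partition table already exists
    // in the metadata store, so trigger a sync right away instead of waiting
    // for the regular update interval.
    self.request_partition_table_sync().await?;
}
```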
  if target_state.leader.is_none() {
      target_state.leader = self.select_leader_from(target_state, preferred_leader);
      // check whether we modified the leader
This comment seems no longer accurate.
      target_state.leader = self.select_leader_from(target_state, preferred_leader);
      // check whether we modified the leader
  } else if preferred_leader.is_some_and(|preferred_leader| {
      Some(preferred_leader) != target_state.leader
I think that this check here, `Some(preferred_leader) != target_state.leader`, should no longer be needed because we don't return a value from this branch anymore.
I don't get what you mean. The `target_state.leader` can still be set to `None` by the `ensure_replication` function. This can happen, for example, if the leader node died: `ensure_replication` will make sure the node set is updated but will not choose a leader. The leader will then be set up here.
With this check we only assign the preferred leader if it is different from `target_state.leader`. Without it we would assign it unconditionally, regardless of this aspect (simplifying the if condition a bit).
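For illustration, the simplified condition would look roughly like this (mirroring the names from the diff above; the surrounding code is assumed):

```rust
if target_state.leader.is_none() {
    target_state.leader = self.select_leader_from(target_state, preferred_leader);
} else if preferred_leader.is_some() {
    // Re-assigning a value equal to target_state.leader is a no-op here,
    // since this branch no longer returns a value.
    target_state.leader = preferred_leader;
}
```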
- let target_scheduling_plan = metadata_store_client
-     .get::<SchedulingPlan>(SCHEDULING_PLAN_KEY.clone())
+ let target_partition_table = metadata_store_client
+     .get::<PartitionTable>(PARTITION_TABLE_KEY.clone())
      .await?
      .expect("the scheduler should have created a scheduling plan");
nit: expect message seems outdated.
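For example, something along these lines (the wording is only a suggestion):

```rust
let target_partition_table = metadata_store_client
    .get::<PartitionTable>(PARTITION_TABLE_KEY.clone())
    .await?
    .expect("a partition table should be present in the metadata store");
```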
Force-pushed from 1230067 to 8c91a18
LGTM. +1 for merging :-)
  // pub fn iter(&self) -> impl Iterator<Item = &PlainNodeId> {
  //     self.node_set.iter()
  // }
Can this be removed?
Yes, definitely. Sorry about that 👍🏼
Summary:
- Completely remove the scheduling plan.
- The scheduler now updates the replication group of each partition directly in the partition table.
- PartitionRouting uses the partition table's replication groups for routing decisions.
- The changeset does not change the behaviour of the scheduler or the scheduling algorithms.
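To make the description concrete, the rough picture is something like the following (types and field names are purely illustrative placeholders, not the actual definitions in the codebase):

```rust
use std::ops::RangeInclusive;

// Illustrative only: each partition entry in the partition table now carries
// its replication group (and leader) directly, and PartitionRouting derives
// its routing decisions from it instead of from a separate SchedulingPlan.
struct PartitionEntry {
    key_range: RangeInclusive<PartitionKey>,
    replica_group: Vec<PlainNodeId>,
    leader: Option<PlainNodeId>,
}
```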