
Removing scheduling plan #2479

Merged: 1 commit into restatedev:main on Jan 15, 2025

Conversation

@muhamadazmy (Contributor) commented on Jan 10, 2025

Removing scheduling plan

Summary:

  • Completely remove the scheduling plan
  • The scheduler now updates the replication group of each partition directly in the partition table
  • PartitionRouting uses the partition table's replication group for routing decisions (see the sketch below)
  • The changeset does not change the behaviour of the scheduler or the scheduling algorithms.
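
As a rough illustration of the routing change, here is a minimal sketch; the type and field names below are simplified, hypothetical stand-ins, not Restate's actual API.

use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the real types.
type PartitionId = u16;
type NodeId = u32;

/// The replication group of a partition, now stored directly in the partition table.
struct ReplicaSet {
    leader: Option<NodeId>,
    nodes: Vec<NodeId>,
}

/// Simplified partition table carrying a replication group per partition.
struct PartitionTable {
    partitions: HashMap<PartitionId, ReplicaSet>,
}

/// Routing reads the replication group straight from the partition table;
/// there is no separate scheduling plan to consult anymore.
struct PartitionRouting {
    partition_table: PartitionTable,
}

impl PartitionRouting {
    /// Route to the partition leader if known, otherwise to any replica.
    fn route(&self, partition: PartitionId) -> Option<NodeId> {
        let replicas = self.partition_table.partitions.get(&partition)?;
        replicas.leader.or_else(|| replicas.nodes.first().copied())
    }
}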

@muhamadazmy muhamadazmy force-pushed the pr2479 branch 3 times, most recently from 003245a to 8c01d3d Compare January 10, 2025 10:03
@muhamadazmy muhamadazmy marked this pull request as ready for review January 10, 2025 11:27
@muhamadazmy muhamadazmy force-pushed the pr2479 branch 2 times, most recently from 10ad92e to 4d5c9ca Compare January 13, 2025 14:23
@tillrohrmann (Contributor) left a comment:
Thanks for removing the SchedulingPlan @muhamadazmy. The changes look good to me :-) I left a few minor comments. +1 for merging after resolving them.

Comment on lines 164 to 173
  Some(cmd) = self.receiver.recv() => {
      match cmd {
          Command::SyncRoutingInformation => {
-             self.spawn_sync_routing_information_task();
+             self.spawn_sync_routing_information_task(*partition_table_watch.borrow());
          }
      }
  }
- _ = update_interval.tick() => {
+ _ = partition_table_watch.changed() => {
      trace!("Refreshing routing information...");
-     self.spawn_sync_routing_information_task();
+     self.spawn_sync_routing_information_task(*partition_table_watch.borrow());
Contributor:

If partition_table_watch tells us that a given partition table version is present, I think we can just call Metadata::current(|m| m.partition_table_ref()) to get access to it. Therefore I think that spawning an additional task shouldn't be necessary.
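
A minimal sketch of this suggestion, assuming the watch channel carries the partition table version and the table itself can be read from shared metadata once the watch fires; all names below are hypothetical stand-ins for the real Restate types.

use std::sync::{Arc, RwLock};
use tokio::sync::watch;

// Hypothetical stand-in for the real metadata type.
#[derive(Clone, Debug)]
struct PartitionTable {
    version: u32,
}

async fn routing_refresh_loop(
    mut version_watch: watch::Receiver<u32>,
    metadata: Arc<RwLock<PartitionTable>>,
) {
    // When the watch reports a new version, the metadata is already up to date,
    // so the partition table can be read directly instead of spawning a sync task.
    while version_watch.changed().await.is_ok() {
        let table = metadata.read().unwrap().clone();
        println!("refreshing routing information from partition table v{}", table.version);
        // ...recompute routing information from `table`...
    }
}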

Contributor Author:

I totally missed that. In my head I thought we still had to wait for the version to sync, not that the metadata would already be up to date. Thank you for pointing this out.

Comment on lines 77 to 78
metadata_store_client: MetadataStoreClient,
metadata_writer: MetadataWriter,
Contributor:

The metadata_writer should give you access to the metadata_store_client. So we could remove the metadata_store_client.
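
A tiny sketch of that simplification, with hypothetical stand-in types: the component holds only the writer and borrows the store client through it.

// Stand-in for the real client type.
struct MetadataStoreClient;

struct MetadataWriter {
    store_client: MetadataStoreClient,
}

impl MetadataWriter {
    /// Borrow the underlying metadata store client.
    fn metadata_store_client(&self) -> &MetadataStoreClient {
        &self.store_client
    }
}

struct Scheduler {
    metadata_writer: MetadataWriter, // the separate metadata_store_client field is removed
}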

},
Ok(_) => {}
Err(WriteError::FailedPrecondition(msg)) => {
    debug!("Partition table update failed due to: {msg}");
Contributor:

Should we try to sync the PartitionTable here in order to not wait for the regular update interval to happen? We do know that there must be a newer version available because of the FailedPrecondition.
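
A hedged sketch of that idea, using hypothetical stand-in types: since FailedPrecondition implies a newer partition table version already exists in the store, the handler syncs it right away rather than waiting for the next interval.

#[derive(Debug)]
enum WriteError {
    FailedPrecondition(String),
}

#[derive(Debug)]
struct PartitionTable {
    version: u32,
}

/// Placeholder for a component able to fetch the newest partition table from the metadata store.
struct MetadataSync;

impl MetadataSync {
    async fn sync_partition_table(&self) -> PartitionTable {
        // ...fetch the latest version from the metadata store...
        PartitionTable { version: 0 }
    }
}

async fn handle_update_result(result: Result<(), WriteError>, sync: &MetadataSync) {
    match result {
        Ok(()) => {}
        Err(WriteError::FailedPrecondition(msg)) => {
            // FailedPrecondition means a newer version is already present,
            // so trigger a sync immediately instead of waiting for the regular update interval.
            println!("partition table update failed due to: {msg}");
            let latest = sync.sync_partition_table().await;
            println!("synced partition table at version {}", latest.version);
        }
    }
}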


if target_state.leader.is_none() {
    target_state.leader = self.select_leader_from(target_state, preferred_leader);
    // check whether we modified the leader
Contributor:

This comment seems no longer accurate.

    target_state.leader = self.select_leader_from(target_state, preferred_leader);
    // check whether we modified the leader
} else if preferred_leader.is_some_and(|preferred_leader| {
    Some(preferred_leader) != target_state.leader
Contributor:

I think that this check here Some(preferred_leader) != target_state.leader should no longer be needed because we don't return a value from this branch anymore.

Contributor Author:

I don't get what you mean. target_state.leader can still be set to None by the ensure_replication function. This can happen, for example, if the leader node died: ensure_replication will make sure the node set is updated, but it will not choose a leader. The leader will then be set up here.

Contributor:

With this check we only assign the preferred leader if it differs from target_state.leader. Without it, we would assign it unconditionally, regardless of that, which simplifies the if condition a bit.
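
A simplified sketch of the two variants under discussion (hypothetical types; only the condition differs): since the branch no longer returns a value, assigning the preferred leader unconditionally ends in the same state.

type NodeId = u32;

struct TargetState {
    leader: Option<NodeId>,
}

// Variant under review: only assign if the preferred leader differs from the current one.
fn assign_with_check(target_state: &mut TargetState, preferred_leader: Option<NodeId>) {
    if target_state.leader.is_none() {
        // ...select a leader from the node set...
    } else if preferred_leader.is_some_and(|preferred| Some(preferred) != target_state.leader) {
        target_state.leader = preferred_leader;
    }
}

// Simplified variant: assigning unconditionally yields the same end state,
// because nothing is returned from this branch anymore.
fn assign_without_check(target_state: &mut TargetState, preferred_leader: Option<NodeId>) {
    if target_state.leader.is_none() {
        // ...select a leader from the node set...
    } else if preferred_leader.is_some() {
        target_state.leader = preferred_leader;
    }
}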

- let target_scheduling_plan = metadata_store_client
-     .get::<SchedulingPlan>(SCHEDULING_PLAN_KEY.clone())
+ let target_partition_table = metadata_store_client
+     .get::<PartitionTable>(PARTITION_TABLE_KEY.clone())
      .await?
      .expect("the scheduler should have created a scheduling plan");
Contributor:

nit: expect message seems outdated.

@muhamadazmy muhamadazmy force-pushed the pr2479 branch 7 times, most recently from 1230067 to 8c91a18 Compare January 14, 2025 16:02
@tillrohrmann (Contributor) left a comment:

LGTM. +1 for merging :-)

Comment on lines 523 to 525
// pub fn iter(&self) -> impl Iterator<Item = &PlainNodeId> {
// self.node_set.iter()
// }
Contributor:

Can this be removed?

Contributor Author:

Yes, definitely. Sorry about that 👍🏼

Summary:
- Completely remove the scheduling plan
- The scheduler now updates the replication group of each partition directly in the partition table
- PartitionRouting uses the partition table's replication group for routing decisions
- The changeset does not change the behaviour of the scheduler or the scheduling algorithms.
@muhamadazmy muhamadazmy merged commit 9189e85 into restatedev:main Jan 15, 2025
24 checks passed
@muhamadazmy muhamadazmy deleted the pr2479 branch January 15, 2025 08:33