-
Notifications
You must be signed in to change notification settings - Fork 1.8k
[ENH] Cut out the go log service. (#5369) #5411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Remove Go Log Service, Finalize Full Cutover to Rust Log Service This PR completes the deprecation and removal of the Go log service, switching all log-related responsibilities entirely to the Rust log service. All RPC handlers in the Go log server now return explicit errors indicating their deprecation, and the Rust log service is responsible for log initialization and all operations. Test files, configs, and supporting code paths connected to Go log logic are removed or updated, ensuring clean migration and preventing accidental use of the old Go code paths. Key Changes• All log Affected Areas• go/pkg/log/server This summary was automatically generated by @propel-code-bot |
Reviewer ChecklistPlease leverage this checklist to ensure your code review is thorough before approving Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
|
if let Err(wal3::Error::UninitializedLog) = res { | ||
return self | ||
.forward_update_collection_log_offset(Request::new(request)) | ||
.await; | ||
return Err(Status::not_found(format!( | ||
"collection {collection_id} not found" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
Error handling inconsistency: The function now returns Err(Status::not_found(...))
for uninitialized logs, but callers may expect different error semantics. This breaking change could cause issues if other parts of the system rely on the previous forwarding behavior.
Consider:
// Add clear documentation about the new behavior
// Or provide a migration path for existing callers
if let Err(wal3::Error::UninitializedLog) = res {
tracing::warn!("Collection {} not found in rust log service", collection_id);
return Err(Status::not_found(format!(
"collection {collection_id} not found"
)));
}
Context for Agents
[**BestPractice**]
**Error handling inconsistency**: The function now returns `Err(Status::not_found(...))` for uninitialized logs, but callers may expect different error semantics. This breaking change could cause issues if other parts of the system rely on the previous forwarding behavior.
Consider:
```rust
// Add clear documentation about the new behavior
// Or provide a migration path for existing callers
if let Err(wal3::Error::UninitializedLog) = res {
tracing::warn!("Collection {} not found in rust log service", collection_id);
return Err(Status::not_found(format!(
"collection {collection_id} not found"
)));
}
```
File: rust/log-service/src/lib.rs
Line: 617
Tiltfile
Outdated
@@ -259,7 +259,6 @@ k8s_resource('postgres', resource_deps=['k8s_setup'], labels=["infrastructure"], | |||
# Jobs are suffixed with the image tag to ensure they are unique. In this context, the image tag is defined in k8s/distributed-chroma/values.yaml. | |||
k8s_resource('sysdb-migration-latest', resource_deps=['postgres'], labels=["infrastructure"]) | |||
k8s_resource('logservice-migration-latest', resource_deps=['postgres'], labels=["infrastructure"]) | |||
k8s_resource('logservice', resource_deps=['sysdb-migration-latest'], labels=["chroma"], port_forwards='50052:50051') | |||
k8s_resource('rust-log-service', labels=["chroma"], port_forwards='50054:50051') | |||
k8s_resource('sysdb', resource_deps=['sysdb-migration-latest'], labels=["chroma"], port_forwards='50051:50051') | |||
k8s_resource('rust-frontend-service', resource_deps=['sysdb', 'logservice', 'rust-log-service'], labels=["chroma"], port_forwards='8000:8000') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[CriticalError]
The logservice
resource is removed in this PR, but it's still listed as a dependency for rust-frontend-service
. This will likely cause Tilt to fail. The dependency should be removed to prevent this.
k8s_resource('rust-frontend-service', resource_deps=['sysdb', 'logservice', 'rust-log-service'], labels=["chroma"], port_forwards='8000:8000') | |
k8s_resource('rust-frontend-service', resource_deps=['sysdb', 'rust-log-service'], labels=["chroma"], port_forwards='8000:8000') |
⚡ Committable suggestion
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
Context for Agents
[**CriticalError**]
The `logservice` resource is removed in this PR, but it's still listed as a dependency for `rust-frontend-service`. This will likely cause Tilt to fail. The dependency should be removed to prevent this.
```suggestion
k8s_resource('rust-frontend-service', resource_deps=['sysdb', 'rust-log-service'], labels=["chroma"], port_forwards='8000:8000')
```
⚡ **Committable suggestion**
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
File: Tiltfile
Line: 264
Err(wal3::Error::UninitializedLog) => { | ||
return self.forward_pull_logs(Request::new(pull_logs)).await; | ||
} | ||
Err(wal3::Error::UninitializedLog) => vec![], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
This change in error handling from forwarding to proxy to returning empty results could mask legitimate errors. Consider whether returning an empty vector for UninitializedLog
is the correct behavior, as it might make it difficult for clients to distinguish between "no records exist" and "collection doesn't exist".
// Consider returning a more explicit error or status
Err(wal3::Error::UninitializedLog) => vec![], // This might hide collection existence issues
Context for Agents
[**BestPractice**]
This change in error handling from forwarding to proxy to returning empty results could mask legitimate errors. Consider whether returning an empty vector for `UninitializedLog` is the correct behavior, as it might make it difficult for clients to distinguish between "no records exist" and "collection doesn't exist".
```rust
// Consider returning a more explicit error or status
Err(wal3::Error::UninitializedLog) => vec![], // This might hide collection existence issues
```
File: rust/log-service/src/lib.rs
Line: 1246
This PR will make the push_logs call initialize the log. Existing logs are unaffected. If a log doesn't exist, a not_found error will be returned or the request will be empty, as appropriate. This deletes the paths that push/pull logs in the go log service. CI This is part of the migration to the rust log service. We need to plan for how to roll it out such that the rust log service rolls before Go or the Frontends. Watch staging. N/A
@@ -248,7 +248,7 @@ async fn get_log_from_handle_with_mutex_held<'a>( | |||
_phantom: std::marker::PhantomData, | |||
}); | |||
} | |||
let opened = LogWriter::open( | |||
let opened = LogWriter::open_or_initialize( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[CriticalError]
Using LogWriter::open_or_initialize
instead of LogWriter::open
changes initialization behavior. If this replaces logs that were migrated from another service, this could create new empty logs instead of properly opening existing ones. Consider verifying this won't cause data loss for migrated collections.
// Verify the log exists before initializing a new one
let opened = match LogWriter::open(options.clone(), Arc::clone(storage), prefix, mark_dirty).await {
Ok(writer) => writer,
Err(wal3::Error::UninitializedLog) => {
return Err(Status::not_found(format!("collection {collection_id} not found")));
}
Err(err) => return Err(Status::unknown(err.to_string())),
};
Context for Agents
[**CriticalError**]
Using `LogWriter::open_or_initialize` instead of `LogWriter::open` changes initialization behavior. If this replaces logs that were migrated from another service, this could create new empty logs instead of properly opening existing ones. Consider verifying this won't cause data loss for migrated collections.
```rust
// Verify the log exists before initializing a new one
let opened = match LogWriter::open(options.clone(), Arc::clone(storage), prefix, mark_dirty).await {
Ok(writer) => writer,
Err(wal3::Error::UninitializedLog) => {
return Err(Status::not_found(format!("collection {collection_id} not found")));
}
Err(err) => return Err(Status::unknown(err.to_string())),
};
```
File: rust/log-service/src/lib.rs
Line: 251
@@ -1574,19 +1327,6 @@ impl LogServer { | |||
Arc::clone(&storage), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[CriticalError]
The ForkLogs
method removes the check for uninitialized logs and proceeds with copying. This could cause issues if the source collection doesn't exist. The copy operation might create an empty target log even when the source is invalid.
// Add validation for source log existence
if let Err(err) = log_reader.next_write_timestamp().await {
match err {
wal3::Error::UninitializedLog => {
return Err(Status::not_found(format!(
"source collection {source_collection_id} not found"
)));
}
_ => {
return Err(Status::new(
err.code().into(),
format!("Failed to load log: {}", err),
));
}
}
}
Context for Agents
[**CriticalError**]
The `ForkLogs` method removes the check for uninitialized logs and proceeds with copying. This could cause issues if the source collection doesn't exist. The copy operation might create an empty target log even when the source is invalid.
```rust
// Add validation for source log existence
if let Err(err) = log_reader.next_write_timestamp().await {
match err {
wal3::Error::UninitializedLog => {
return Err(Status::not_found(format!(
"source collection {source_collection_id} not found"
)));
}
_ => {
return Err(Status::new(
err.code().into(),
format!("Failed to load log: {}", err),
));
}
}
}
```
File: rust/log-service/src/lib.rs
Line: 1327
Err(wal3::Error::UninitializedLog) => { | ||
tracing::info!("forwarding because log uninitialized"); | ||
return self | ||
.forward_push_logs(collection_id, Request::new(push_logs)) | ||
.await; | ||
return Err(Status::not_found(format!( | ||
"collection {collection_id} not found" | ||
))); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
With the change from LogWriter::open
to LogWriter::open_or_initialize
(line 251), it seems that get_log_from_handle
will now create the log if it doesn't exist, and should no longer return wal3::Error::UninitializedLog
. If that's correct, this match arm might be dead code.
This also seems to contradict a part of the PR description which says for push_logs
"If a log doesn't exist, a not_found error will be returned". The code change to open_or_initialize
seems to implement "This PR will make the push_logs call initialize the log." instead, which feels more correct for a write operation.
Could you clarify if this error path is still reachable? If not, removing this arm would make the new behavior clearer.
Context for Agents
[**BestPractice**]
With the change from `LogWriter::open` to `LogWriter::open_or_initialize` (line 251), it seems that `get_log_from_handle` will now create the log if it doesn't exist, and should no longer return `wal3::Error::UninitializedLog`. If that's correct, this match arm might be dead code.
This also seems to contradict a part of the PR description which says for `push_logs` "If a log doesn't exist, a not_found error will be returned". The code change to `open_or_initialize` seems to implement "This PR will make the push_logs call initialize the log." instead, which feels more correct for a write operation.
Could you clarify if this error path is still reachable? If not, removing this arm would make the new behavior clearer.
File: rust/log-service/src/lib.rs
Line: 1083
## Description of changes This PR will make the push_logs call initialize the log. Existing logs are unaffected. If a log doesn't exist, a not_found error will be returned or the request will be empty, as appropriate. This deletes the paths that push/pull logs in the go log service. ## Test plan CI ## Migration plan This is part of the migration to the rust log service. We need to plan for how to roll it out such that the rust log service rolls before Go or the Frontends. ## Observability plan Watch staging. ## Documentation Changes N/A
This PR cherry-picks the commit 4de20e4 onto rc/2025-09-05. If there are unresolved conflicts, please resolve them manually. Co-authored-by: Robert Escriva <[email protected]>
Description of changes
This PR will make the push_logs call initialize the log. Existing logs
are unaffected. If a log doesn't exist, a not_found error will be
returned or the request will be empty, as appropriate.
This deletes the paths that push/pull logs in the go log service.
Test plan
CI
Migration plan
This is part of the migration to the rust log service.
We need to plan for how to roll it out such that the rust log service
rolls before Go or the Frontends.
Observability plan
Watch staging.
Documentation Changes
N/A