feat: add Clone support and HTTP client pooling for providers #100
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
```rust
pub fn client(mut self, client: reqwest::Client) -> Self {
    self.client = Some(client);
    self
}
```
Apply builder client to all backends or document limits
When a caller sets `.client(...)` and then builds any backend other than OpenAI/Anthropic, the custom `reqwest::Client` is silently ignored: `LLMBuilder::build` only consults `self.client` in those two branches, and every other backend still calls its `new`/`with_config` constructor, which creates a fresh client. As a result, the connection pooling, proxy, TLS, and timeout settings supplied via `.client()` never take effect for providers like Google/Ollama/DeepSeek, which contradicts the new method's documented behavior and can lead to unexpected networking behavior for those backends.
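One possible direction, sketched with a simplified stand-in for `LLMBuilder` (the types and method below are illustrative assumptions, not the crate's code):

```rust
use reqwest::Client;

// Simplified stand-in for the builder, to illustrate the suggestion:
// every backend branch should consult the injected client, not only the
// OpenAI/Anthropic ones.
struct Builder {
    client: Option<Client>,
}

struct Backend {
    client: Client,
}

impl Builder {
    fn build_backend(self) -> Backend {
        // Reuse the caller-supplied client (pool, proxy, TLS, timeouts),
        // falling back to a fresh one only when none was injected.
        let client = self.client.unwrap_or_else(Client::new);
        Backend { client }
    }
}

fn main() {
    let builder = Builder {
        client: Some(Client::new()),
    };
    let backend = builder.build_backend();
    let _ = backend.client;
}
```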
Hi, really appreciate the PR, but there is one significant problem. I see that you have added

I think if we delete the let people decide and wrap the structs in. Thoughts @nazq?
I always implement everything from scratch. This was my Claude implementation, adding an

```rust
use std::{borrow::Cow, sync::Arc};

use crate::claude::config::ClaudeConfig;
use crate::claude::types::{ClaudeMessage, ClaudeRequest, ClaudeResponse};
use reqwest::header::HeaderMap;

const API_BASE: &str = "https://api.anthropic.com/v1";

#[derive(Clone)]
pub struct ClaudeApiClient {
    client: reqwest::Client,
    config: Arc<ClaudeConfig>,
}

impl ClaudeApiClient {
    pub fn init(config: ClaudeConfig) -> anyhow::Result<Self> {
        let mut headers = HeaderMap::new();
        // Add authentication
        headers.insert("x-api-key", config.api_key.parse()?);
        headers.insert("content-type", "application/json".parse().unwrap());
        headers.insert("anthropic-version", "2023-06-01".parse().unwrap());

        // reqwest client
        let client = reqwest::Client::builder()
            .default_headers(headers)
            .build()?;

        // self
        Ok(Self {
            client,
            config: Arc::new(config),
        })
    }

    pub async fn send_message(
        &self,
        tools: &[serde_json::Value],
        messages: Vec<ClaudeMessage<'_>>,
        system_prompt: &str,
    ) -> Result<ClaudeResponse<'static>, tonic::Status> {
        // Create the request from the config
        let request = ClaudeRequest {
            model: Cow::Borrowed(self.config.model),
            max_tokens: self.config.max_tokens,
            messages,
            system: Some(Cow::Borrowed(system_prompt)),
            temperature: self.config.temperature,
            top_p: self.config.top_p,
            top_k: self.config.top_k,
            tools: Some(Cow::Borrowed(tools)),
        };

        // Send the request
        let res = self
            .client
            .post(format!("{}/messages", API_BASE))
            .json(&request)
            .send()
            .await
            .map_err(|e| {
                tracing::warn!("Failed to send request to Claude API: {}", e);
                tonic::Status::unavailable("Failed to connect to Claude API")
            })?;

        // Parse if success
        let status_code = res.status();
        if status_code.is_success() {
            let text = res.text().await.map_err(|e| {
                tracing::warn!("Failed to read response text: {}", e);
                tonic::Status::internal("Failed to read response text")
            })?;
            return serde_json::from_str::<ClaudeResponse<'static>>(&text)
                .inspect_err(|err| tracing::warn!("Failed to parse Claude api response {err}"))
                .map_err(|_| tonic::Status::internal("Failed to parse Claude api response"));
        }

        let error_text = res.text().await.unwrap_or_else(|e| {
            tracing::warn!("Failed to read error response: {}", e);
            "Unknown error".to_string()
        });

        // Log the full error details internally
        tracing::error!(
            status_code = %status_code,
            response_body = %error_text,
            model = %self.config.model,
            "Claude API request failed"
        );

        // Map to appropriate tonic::Status based on status code
        match status_code.as_u16() {
            429 => Err(tonic::Status::resource_exhausted(
                "Claude API Rate limit exceeded",
            )),
            500..=599 => Err(tonic::Status::unavailable("Claude API is unavailable")),
            _ => Err(tonic::Status::internal("Claude API error")),
        }
    }
}

#[derive(Debug, serde::Serialize, serde::Deserialize, Default)]
pub struct ClaudeConfig {
    pub api_key: Cow<'static, str>,
    pub model: &'static str,
    pub max_tokens: u32,
    pub temperature: Option<f32>,
    pub top_p: Option<f32>,
    pub top_k: Option<u32>,
}
```
I like the
I think this makes more sense than my approach. @graniet, at some point it may be worth thinking about a breaking-change release and brainstorming any API changes you may have in mind.
Summary
- `Clone` derive to all provider structs, enabling shared ownership
- `with_client()` constructors for HTTP client injection and connection pooling
- `.client()` builder method for `LLMBuilder`

Closes #91
Motivation
As noted in #91, creating multiple provider instances currently means each one allocates its own `reqwest::Client` and connection pool. This is not ideal when users want to share a single connection pool across providers, or reuse a client configured with custom settings such as proxies, TLS, or timeouts.

Approach

Non-breaking addition: Rather than modifying the existing `new()` constructors (which would be a breaking change), I added a parallel `with_client()` constructor to each backend. This preserves backward compatibility. If preferred, a future major version could consolidate these into `new()` with `Option<Client>`. Happy to make the change if agreed.

Changes
Clone Support
All provider structs now derive `Clone`:

- `Anthropic`, `OpenAI`, `Google`, `Ollama`, `DeepSeek`, `XAI`, `AzureOpenAI`, `Phind`, `ElevenLabs`
- `OpenAICompatibleProvider<T>` (and config structs: `MistralConfig`, `GroqConfig`, etc.)

This works because `reqwest::Client` is internally `Arc`-wrapped, making clones cheap.
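As a rough illustration of why the `Arc`-backed client makes this cheap (the `Provider` struct below is a stand-in, not one of the crate's backends):

```rust
use reqwest::Client;

// A struct holding a reqwest::Client can derive Clone cheaply: the client
// is an Arc-backed handle, so clones share one connection pool instead of
// allocating a new one.
#[derive(Clone)]
struct Provider {
    api_key: String,
    client: Client,
}

fn main() {
    let base = Provider {
        api_key: "sk-example".to_string(),
        client: Client::new(),
    };

    // Cheap clone: both handles reuse the same underlying pool.
    let worker_copy = base.clone();
    assert_eq!(base.api_key, worker_copy.api_key);
}
```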
Client Injection

New `with_client()` constructors on all backends:
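A hedged sketch of the general shape such a constructor takes; the struct, field list, and arguments here are assumptions for illustration, not the PR's actual code:

```rust
use reqwest::Client;

// Illustrative backend only; the real providers carry more configuration.
#[derive(Clone)]
pub struct ExampleBackend {
    api_key: String,
    client: Client,
}

impl ExampleBackend {
    /// Existing-style constructor: builds its own client, as before.
    pub fn new(api_key: impl Into<String>) -> Self {
        Self::with_client(api_key, Client::new())
    }

    /// Injection-style constructor: reuses a caller-supplied client, so
    /// several backends can share one connection pool and client settings.
    pub fn with_client(api_key: impl Into<String>, client: Client) -> Self {
        Self {
            api_key: api_key.into(),
            client,
        }
    }
}
```

With this shape, handing the same `Client` to several backends reuses one pool across all of them.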
Builder Support

New `.client()` method on `LLMBuilder`:
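A hedged usage sketch: only the `.client(...)` method appears in the diff excerpt above, so the surrounding builder calls are shown as placeholders rather than the crate's real API:

```rust
use std::time::Duration;

use reqwest::Client;

fn main() {
    // Configure one client up front; providers built from it share the same
    // connection pool, proxy, TLS, and timeout settings.
    let shared = Client::builder()
        .timeout(Duration::from_secs(30))
        .pool_max_idle_per_host(8)
        .build()
        .expect("failed to build reqwest client");

    // Hypothetical builder flow (method names other than `.client` are
    // placeholders for the crate's actual builder API):
    //
    // let llm = LLMBuilder::new()
    //     .client(shared.clone())
    //     .build();

    let _ = shared;
}
```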
Test Plan

- `cargo check` passes
- `cargo clippy` passes
- `cargo test`: 31/31 unit tests pass (6 new tests added)
- `cargo build --release` passes

New Tests Added
| Test | Location |
| --- | --- |
| `test_anthropic_clone` | `src/backends/anthropic.rs` |
| `test_anthropic_with_client` | `src/backends/anthropic.rs` |
| `test_openai_compatible_provider_clone` | `src/providers/openai_compatible.rs` |
| `test_openai_compatible_provider_with_client` | `src/providers/openai_compatible.rs` |
| `test_builder_client_method` | `src/builder.rs` |
| `test_builder_without_client` | `src/builder.rs` |