feat: Add support for OpenAI Responses API #4791
wzh4464 wants to merge 5 commits into Aider-AI:main
Conversation
This commit adds support for OpenAI's Responses API alongside the existing Chat Completions API, allowing aider to work with newer models like gpt-5.1-codex and gpt-5.2-codex that only support the Responses API.

Key changes:
- Added `wire_api` field to ModelSettings (default: "chat")
- Created ResponsesAPIWrapper to convert Responses API format to Chat Completions format
- Added StreamingResponsesAPIWrapper for streaming responses
- Modified send_completion() to route requests based on AIDER_WIRE_API env var
- Updated simple_send_with_retries() to handle both API formats
- Fixed grpcio version conflict (1.76.0 -> 1.67.0) for litellm compatibility

Usage:
export AIDER_WIRE_API="responses"  # For Codex models
aider --model gpt-5.1-codex

Fixes Aider-AI#4039 (Codex-mini temperature parameter issue)
Fixes Aider-AI#4707 (gpt-5.2-chat temperature parameter issue)
Related to Aider-AI#4591 (gpt-5-codex vision support)

The implementation uses litellm.responses(), which was added in recent litellm versions to support the new Responses API endpoint.
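For orientation, a minimal sketch of the routing described above. This is not the PR's actual code: parameter handling is simplified, and the wrapping step performed by the PR's ResponsesAPIWrapper / StreamingResponsesAPIWrapper is only noted in a comment.

```python
import os

import litellm


def send_completion_sketch(model_name, messages, stream=False, wire_api="chat"):
    # Sketch of the routing this PR describes (simplified, assumptions noted).
    # AIDER_WIRE_API overrides the model's wire_api setting; anything other
    # than "responses" falls through to the Chat Completions API.
    wire_api = os.environ.get("AIDER_WIRE_API", wire_api)

    if wire_api == "responses":
        # Responses API: system messages become "instructions", the rest go in "input".
        # In the PR, the result is then wrapped (ResponsesAPIWrapper or the streaming
        # variant) so callers keep seeing a chat-completions-style "choices" list.
        system = [m for m in messages if m.get("role") == "system"]
        other = [m for m in messages if m.get("role") != "system"]
        return litellm.responses(
            model=model_name,
            input=other,
            instructions="\n\n".join(m.get("content", "") for m in system),
            stream=stream,
        )

    return litellm.completion(model=model_name, messages=messages, stream=stream)
```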
Pull request overview
This PR adds support for OpenAI's Responses API (/v1/responses) to enable compatibility with newer models like gpt-5.1-codex and gpt-5.2-codex that only support this API format, addressing issues #4039 and #4707 where these models failed with parameter errors.
Changes:
- Introduced ResponsesAPIWrapper and StreamingResponsesAPIWrapper classes to convert Responses API format to Chat Completions format for compatibility
- Modified send_completion() to route requests to either litellm.responses() or litellm.completion() based on the AIDER_WIRE_API environment variable or the model's wire_api setting
- Updated simple_send_with_retries() to handle both API response formats
- Downgraded grpcio from 1.76.0 to 1.67.0 to resolve a compatibility issue with litellm 1.80.10
Reviewed changes
Copilot reviewed 1 out of 2 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| requirements.txt | Downgraded grpcio from 1.76.0 to 1.67.0 for litellm compatibility |
| aider/models.py | Added ResponsesAPIWrapper classes, modified send_completion() and simple_send_with_retries() to support dual API routing via wire_api setting |
aider/models.py
| # Handle both chat completions and responses API formats | ||
| wire_api = os.environ.get("AIDER_WIRE_API", getattr(self, "wire_api", "chat")) | ||
|
|
||
| if wire_api == "responses": | ||
| # Responses API format: has 'output' instead of 'choices' | ||
| if not hasattr(response, "output") or not response.output: | ||
| return None | ||
| # Extract text content from output items | ||
| for item in response.output: | ||
| if hasattr(item, "text") and item.text: | ||
| res = item.text | ||
| break | ||
| elif isinstance(item, dict) and "text" in item: | ||
| res = item["text"] | ||
| break | ||
| else: | ||
| return None | ||
| else: | ||
| # Chat completions API format: has 'choices' | ||
| if not hasattr(response, "choices") or not response.choices: | ||
| return None | ||
| res = response.choices[0].message.content | ||
|
|
The response handling logic in simple_send_with_retries is inconsistent with the wrapper design. In send_completion, the responses API result is wrapped with ResponsesAPIWrapper which converts it to have a choices attribute (lines 1133-1134). However, this function duplicates the conversion logic by checking wire_api == "responses" and directly accessing the raw output attribute (lines 1168-1181).
Since the wrapper already provides a uniform choices interface, this duplication is unnecessary and could lead to inconsistent behavior if the wrapper logic changes. The code should simply use response.choices[0].message.content for both API types, as the wrapper ensures compatibility.
| # Handle both chat completions and responses API formats | |
| wire_api = os.environ.get("AIDER_WIRE_API", getattr(self, "wire_api", "chat")) | |
| if wire_api == "responses": | |
| # Responses API format: has 'output' instead of 'choices' | |
| if not hasattr(response, "output") or not response.output: | |
| return None | |
| # Extract text content from output items | |
| for item in response.output: | |
| if hasattr(item, "text") and item.text: | |
| res = item.text | |
| break | |
| elif isinstance(item, dict) and "text" in item: | |
| res = item["text"] | |
| break | |
| else: | |
| return None | |
| else: | |
| # Chat completions API format: has 'choices' | |
| if not hasattr(response, "choices") or not response.choices: | |
| return None | |
| res = response.choices[0].message.content | |
| # Use the unified choices interface provided by send_completion / ResponsesAPIWrapper | |
| if not hasattr(response, "choices") or not response.choices: | |
| return None | |
| res = response.choices[0].message.content |
aider/models.py
| other_messages = [msg for msg in messages if msg.get("role") != "system"] | ||
|
|
||
| if system_messages: | ||
| kwargs["instructions"] = system_messages[0].get("content", "") |
When converting messages for the responses API, only the first system message is used (line 1123). If there are multiple system messages in the conversation, all subsequent ones are silently dropped. This could lead to loss of important context.
Consider either concatenating all system messages or documenting this limitation. Alternatively, raise a warning if multiple system messages are detected and being dropped.
| kwargs["instructions"] = system_messages[0].get("content", "") | |
| # Concatenate all system message contents to avoid dropping context | |
| instructions_parts = [ | |
| str(msg.get("content", "")) | |
| for msg in system_messages | |
| if msg.get("content") | |
| ] | |
| if instructions_parts: | |
| kwargs["instructions"] = "\n\n".join(instructions_parts) |
aider/models.py
| for item in self._responses.output: | ||
| if hasattr(item, "text") and item.text: | ||
| choice.message.content = item.text | ||
| choice.delta.content = item.text | ||
| break | ||
| elif isinstance(item, dict): | ||
| if "text" in item: | ||
| choice.message.content = item["text"] | ||
| choice.delta.content = item["text"] | ||
| break |
The wrapper only extracts the first text item from the response output (using break at line 77 or 82). If the responses API returns multiple output items, subsequent items are silently ignored. This could lead to incomplete responses if the API returns content in multiple parts.
Consider either concatenating all text items or documenting why only the first item is used.
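If concatenation is the preferred fix, a small sketch of what that could look like; it assumes the same item shapes the wrapper already handles (a `.text` attribute or a dict `"text"` key) and simply joins every text item instead of stopping at the first:

```python
def extract_all_text(output_items):
    # Sketch: gather text from every output item rather than only the first.
    parts = []
    for item in output_items:
        if hasattr(item, "text") and item.text:
            parts.append(item.text)
        elif isinstance(item, dict) and item.get("text"):
            parts.append(item["text"])
    # Return None when nothing was found, mirroring the current early-return behavior
    return "".join(parts) or None
```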
| if hasattr(item, "text"): | ||
| mock_chunk.choices[0].delta.content = item.text | ||
| break | ||
|
|
The StreamingResponsesAPIWrapper initializes finish_reason to None but never updates it based on the chunk data. The code in base_coder.py checks for finish_reason == "length" to detect when the response is truncated (lines 1907-1911 in base_coder.py). Without populating this field from the responses API chunks, this check will never work properly for responses API streaming.
Consider extracting and setting the finish_reason from the chunk if it's available.
| # Propagate finish_reason from the underlying chunk if available | |
| finish_reason = None | |
| # Check for finish_reason on delta first (most similar to chat completions) | |
| if hasattr(chunk, "delta") and hasattr(chunk.delta, "finish_reason"): | |
| finish_reason = chunk.delta.finish_reason | |
| # Fallback: finish_reason directly on the chunk | |
| elif hasattr(chunk, "finish_reason"): | |
| finish_reason = chunk.finish_reason | |
| # Fallback: look for finish_reason on output items | |
| elif hasattr(chunk, "output"): | |
| for item in chunk.output: | |
| if hasattr(item, "finish_reason"): | |
| finish_reason = item.finish_reason | |
| break | |
| if isinstance(item, dict) and "finish_reason" in item: | |
| finish_reason = item["finish_reason"] | |
| break | |
| mock_chunk.choices[0].finish_reason = finish_reason |
aider/models.py
| kwargs["instructions"] = system_messages[0].get("content", "") | ||
|
|
||
| # For responses API, input can be a list of messages | ||
| kwargs["input"] = other_messages if other_messages else messages |
When no system messages exist, the fallback kwargs["input"] = other_messages if other_messages else messages could result in passing the same messages list that might contain system messages. This happens because:
- If other_messages is empty (all messages were system), it falls back to messages, which includes the system messages
- This could cause the responses API to receive system messages in the input field even though they were filtered out
Consider using kwargs["input"] = other_messages unconditionally, or handle the empty case explicitly to avoid confusion.
| kwargs["input"] = other_messages if other_messages else messages | |
| kwargs["input"] = other_messages |
aider/models.py
| # Extract content from output items | ||
| for item in self._responses.output: | ||
| if hasattr(item, "text") and item.text: | ||
| choice.message.content = item.text | ||
| choice.delta.content = item.text | ||
| break | ||
| elif isinstance(item, dict): | ||
| if "text" in item: | ||
| choice.message.content = item["text"] | ||
| choice.delta.content = item["text"] | ||
| break |
The wrapper initializes tool_calls in the mock message object but never populates it from the responses API output (line 53). However, the code in send_completion can add function tools to the request (lines 1086-1089). If the responses API supports function calling and returns tool call data, it won't be extracted, potentially breaking function calling for responses API.
Verify whether the responses API supports function/tool calling, and if so, extract and populate the tool_calls field from the response output.
| # Extract content from output items | |
| for item in self._responses.output: | |
| if hasattr(item, "text") and item.text: | |
| choice.message.content = item.text | |
| choice.delta.content = item.text | |
| break | |
| elif isinstance(item, dict): | |
| if "text" in item: | |
| choice.message.content = item["text"] | |
| choice.delta.content = item["text"] | |
| break | |
| # Extract content and tool_calls from output items | |
| for item in self._responses.output: | |
| # Try to extract tool_calls if present on this item | |
| if getattr(choice.message, "tool_calls", None) is None: | |
| # Attribute-style tool_calls/tool_call | |
| if hasattr(item, "tool_calls") and getattr(item, "tool_calls"): | |
| choice.message.tool_calls = getattr(item, "tool_calls") | |
| elif hasattr(item, "tool_call") and getattr(item, "tool_call"): | |
| # Normalize single tool_call to a list for compatibility | |
| single_call = getattr(item, "tool_call") | |
| choice.message.tool_calls = ( | |
| single_call if isinstance(single_call, list) else [single_call] | |
| ) | |
| # Dict-style tool_calls/tool_call | |
| elif isinstance(item, dict): | |
| if "tool_calls" in item and item["tool_calls"]: | |
| choice.message.tool_calls = item["tool_calls"] | |
| elif "tool_call" in item and item["tool_call"]: | |
| single_call = item["tool_call"] | |
| choice.message.tool_calls = ( | |
| single_call if isinstance(single_call, list) else [single_call] | |
| ) | |
| # Extract text content, preserving existing behavior | |
| if hasattr(item, "text") and getattr(item, "text", None): | |
| choice.message.content = item.text | |
| choice.delta.content = item.text | |
| break | |
| elif isinstance(item, dict) and "text" in item and item["text"]: | |
| choice.message.content = item["text"] | |
| choice.delta.content = item["text"] | |
| break |
- Added fastapi==0.128.0 (required by litellm.responses())
- Added orjson==3.11.5 (required by litellm for fast JSON serialization)
- Maintained grpcio==1.67.0 for litellm 1.80.10 compatibility
- Updated related dependency constraints

These dependencies are necessary to avoid runtime import errors when using the Responses API functionality.
The OpenAI Responses API returns ResponseOutputMessage objects with a nested structure: output -> message -> content[0] -> text

Updated both ResponsesAPIWrapper and StreamingResponsesAPIWrapper to:
1. Check for message-type items in output
2. Extract text from nested content items
3. Include a fallback for simpler response formats

This fixes the "Empty response received from LLM" error when using the Responses API with Codex models.
The Responses API uses event-based streaming with OUTPUT_TEXT_DELTA events that have a 'delta' attribute containing the text, rather than nested output structures. Updated StreamingResponsesAPIWrapper to check for event.delta first.

This fixes the "Empty response received from LLM" error when using streaming mode with gpt-5.1-codex and gpt-5.2-codex models.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Pull request overview
Copilot reviewed 5 out of 6 changed files in this pull request and generated 13 comments.
aider/models.py
| # Handle both chat completions and responses API formats | ||
| wire_api = os.environ.get("AIDER_WIRE_API", getattr(self, "wire_api", "chat")) | ||
|
|
||
| if wire_api == "responses": | ||
| # Responses API format: has 'output' instead of 'choices' | ||
| if not hasattr(response, "output") or not response.output: | ||
| return None | ||
| # Extract text content from output items | ||
| for item in response.output: | ||
| if hasattr(item, "text") and item.text: | ||
| res = item.text | ||
| break | ||
| elif isinstance(item, dict) and "text" in item: | ||
| res = item["text"] | ||
| break | ||
| else: | ||
| return None | ||
| else: | ||
| # Chat completions API format: has 'choices' | ||
| if not hasattr(response, "choices") or not response.choices: | ||
| return None | ||
| res = response.choices[0].message.content | ||
|
|
The ResponsesAPIWrapper creates a hybrid object that has both a choices attribute (chat completions format) and falls back to the original response object for other attributes like output via getattr. This means wrapped responses support both interfaces. In simple_send_with_retries, when wire_api is "responses", the code checks response.output - which works via getattr but is confusing because send_completion already wrapped it with choices. This dual-interface approach makes the code harder to understand and maintain. Consider either: (1) using only the choices interface and having simple_send_with_retries always use response.choices regardless of wire_api, or (2) not wrapping at all and handling the format conversion at the point of consumption.
| # Handle both chat completions and responses API formats | |
| wire_api = os.environ.get("AIDER_WIRE_API", getattr(self, "wire_api", "chat")) | |
| if wire_api == "responses": | |
| # Responses API format: has 'output' instead of 'choices' | |
| if not hasattr(response, "output") or not response.output: | |
| return None | |
| # Extract text content from output items | |
| for item in response.output: | |
| if hasattr(item, "text") and item.text: | |
| res = item.text | |
| break | |
| elif isinstance(item, dict) and "text" in item: | |
| res = item["text"] | |
| break | |
| else: | |
| return None | |
| else: | |
| # Chat completions API format: has 'choices' | |
| if not hasattr(response, "choices") or not response.choices: | |
| return None | |
| res = response.choices[0].message.content | |
| # Use the normalized chat-completions-style interface | |
| if not hasattr(response, "choices") or not response.choices: | |
| return None | |
| res = response.choices[0].message.content |
aider/models.py
| def _convert_to_chat_format(self): | ||
| """Convert responses API output to chat completions choices format""" | ||
| if not hasattr(self._responses, "output"): | ||
| return |
If the response object doesn't have an "output" attribute, _convert_to_chat_format returns early without initializing self.choices. This means ResponsesAPIWrapper instances might not have a choices attribute, which will cause AttributeErrors when the code tries to access response.choices elsewhere (e.g., in base_coder.py line 1850). The wrapper should always initialize self.choices to at least an empty list to ensure a consistent interface.
| choice = MockChoice() | ||
|
|
||
| # Extract content from output items | ||
| for item in self._responses.output: | ||
| # Handle ResponseOutputMessage type | ||
| if hasattr(item, "type") and item.type == "message": | ||
| if hasattr(item, "content") and item.content: | ||
| # Extract text from content items | ||
| for content_item in item.content: | ||
| if hasattr(content_item, "text") and content_item.text: | ||
| choice.message.content = content_item.text | ||
| choice.delta.content = content_item.text | ||
| break | ||
| if choice.message.content: | ||
| break | ||
| # Fallback: direct text attribute | ||
| elif hasattr(item, "text") and item.text: | ||
| choice.message.content = item.text | ||
| choice.delta.content = item.text | ||
| break | ||
| # Fallback: dict format | ||
| elif isinstance(item, dict): | ||
| if "text" in item: | ||
| choice.message.content = item["text"] | ||
| choice.delta.content = item["text"] | ||
| break | ||
|
|
||
| self.choices = [choice] |
The wrapper creates mock objects with content initialized to None, but doesn't validate that content was successfully extracted before creating the choices list. If no valid content is found in any output item (all branches fail or items are empty), choice.message.content and choice.delta.content will remain None. While this might be acceptable for some edge cases, it could lead to unexpected None values propagating through the codebase. Consider initializing content to an empty string instead of None for better consistency with the chat completions API behavior.
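For illustration, a hedged sketch of the suggested default; the mock types from the wrapper are simplified here to SimpleNamespace objects:

```python
from types import SimpleNamespace


def make_mock_choice():
    # Sketch: default content to "" so downstream string handling never sees None
    message = SimpleNamespace(
        content="", tool_calls=None, reasoning_content=None, reasoning=None
    )
    delta = SimpleNamespace(
        content="", function_call=None, reasoning_content=None, reasoning=None
    )
    return SimpleNamespace(message=message, delta=delta, finish_reason=None)
```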
| self.github_copilot_token_to_open_ai_key(kwargs["extra_headers"]) | ||
|
|
||
| res = litellm.completion(**kwargs) | ||
| # Use responses API or chat completions API based on wire_api setting |
The code doesn't validate the wire_api value. If AIDER_WIRE_API or the model's wire_api setting contains an invalid value (e.g., "foo"), the code will silently default to using the chat completions API because the if condition on line 1141 only checks for "responses". This could lead to confusion if users mistype the value. Consider adding validation to raise a clear error for invalid wire_api values, or at least log a warning when an unrecognized value is encountered.
| # Use responses API or chat completions API based on wire_api setting | |
| # Use responses API or chat completions API based on wire_api setting | |
| # Validate wire_api to catch misconfigurations (e.g., typos) | |
| valid_wire_api_values = {None, "", "responses"} | |
| if wire_api not in valid_wire_api_values: | |
| sys.stderr.write( | |
| f"Warning: Unrecognized wire_api value '{wire_api}'. " | |
| "Falling back to chat completions API.\n" | |
| ) |
aider/models.py
| choice.delta.content = content_item.text | ||
| break | ||
| if choice.message.content: | ||
| break | ||
| # Fallback: direct text attribute | ||
| elif hasattr(item, "text") and item.text: | ||
| choice.message.content = item.text | ||
| choice.delta.content = item.text | ||
| break | ||
| # Fallback: dict format | ||
| elif isinstance(item, dict): | ||
| if "text" in item: | ||
| choice.message.content = item["text"] | ||
| choice.delta.content = item["text"] |
In ResponsesAPIWrapper, both message.content and delta.content are set to the same value (lines 80-81, 87-88, 93-94). The delta attribute is typically used for streaming responses, while message is for non-streaming. Setting both for a non-streaming response wrapper is unconventional and could cause confusion. For non-streaming responses, only message.content should be set, with delta.content remaining None or unset to match the standard chat completions API behavior.
| choice.delta.content = content_item.text | |
| break | |
| if choice.message.content: | |
| break | |
| # Fallback: direct text attribute | |
| elif hasattr(item, "text") and item.text: | |
| choice.message.content = item.text | |
| choice.delta.content = item.text | |
| break | |
| # Fallback: dict format | |
| elif isinstance(item, dict): | |
| if "text" in item: | |
| choice.message.content = item["text"] | |
| choice.delta.content = item["text"] | |
| break | |
| if choice.message.content: | |
| break | |
| # Fallback: direct text attribute | |
| elif hasattr(item, "text") and item.text: | |
| choice.message.content = item.text | |
| break | |
| # Fallback: dict format | |
| elif isinstance(item, dict): | |
| if "text" in item: | |
| choice.message.content = item["text"] |
| def __next__(self): | ||
| event = next(self._stream) | ||
|
|
||
| # Wrap each event to look like chat completions format | ||
| class MockChoice: | ||
| def __init__(self): | ||
| self.delta = type( | ||
| "obj", | ||
| (object,), | ||
| { | ||
| "content": None, | ||
| "function_call": None, | ||
| "reasoning_content": None, | ||
| "reasoning": None, | ||
| }, | ||
| )() | ||
| self.finish_reason = None | ||
|
|
||
| class MockChunk: | ||
| def __init__(self): | ||
| self.choices = [MockChoice()] | ||
|
|
||
| mock_chunk = MockChunk() | ||
|
|
||
| # Handle Responses API event stream format | ||
| # Check for OUTPUT_TEXT_DELTA events (have delta attribute with text) | ||
| if hasattr(event, "delta") and event.delta: | ||
| mock_chunk.choices[0].delta.content = event.delta | ||
| # Fallback for other formats | ||
| elif hasattr(event, "output"): | ||
| for item in event.output: | ||
| # Handle ResponseOutputMessage type | ||
| if hasattr(item, "type") and item.type == "message": | ||
| if hasattr(item, "content") and item.content: | ||
| for content_item in item.content: | ||
| if hasattr(content_item, "text") and content_item.text: | ||
| mock_chunk.choices[0].delta.content = content_item.text | ||
| break | ||
| if mock_chunk.choices[0].delta.content: | ||
| break | ||
| # Fallback: direct text attribute | ||
| elif hasattr(item, "text"): | ||
| mock_chunk.choices[0].delta.content = item.text | ||
| break | ||
|
|
||
| return mock_chunk |
If an event in the streaming response has neither a "delta" attribute nor an "output" attribute, mock_chunk.choices[0].delta.content will remain None. While this might be intentional for some event types, the code doesn't handle StopIteration when the underlying stream ends. The __next__ method should properly handle stream termination and raise StopIteration when appropriate to prevent infinite loops or unexpected behavior in consuming code.
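A hedged sketch of making the termination explicit. In plain Python the StopIteration raised by next(self._stream) already propagates to callers, so this mainly documents that behavior and gives a single place to add end-of-stream handling if the event stream needs it:

```python
class StreamTerminationSketch:
    """Sketch: explicit end-of-stream handling for a streaming wrapper."""

    def __init__(self, responses_stream):
        self._stream = iter(responses_stream)

    def __iter__(self):
        return self

    def __next__(self):
        try:
            event = next(self._stream)
        except StopIteration:
            # Underlying stream is exhausted; signal the same to consuming code
            raise
        # ... convert the event into a chat-completions-style chunk here ...
        return event
```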
aider/models.py
| kwargs["instructions"] = system_messages[0].get("content", "") | ||
|
|
||
| # For responses API, input can be a list of messages | ||
| kwargs["input"] = other_messages if other_messages else messages |
When there are no non-system messages, the code sets kwargs["input"] = messages (which includes system messages). However, if system messages were found, they're already extracted into kwargs["instructions"], resulting in system messages being sent twice - once in "instructions" and once in "input". The logic should ensure system messages are excluded from "input" when they've been extracted to "instructions".
| kwargs["input"] = other_messages if other_messages else messages | |
| # Avoid sending system messages twice: once as instructions and again as input. | |
| if system_messages: | |
| # When system messages are present, only send non-system messages as input | |
| kwargs["input"] = other_messages | |
| else: | |
| # When there are no system messages, preserve existing behavior | |
| kwargs["input"] = other_messages if other_messages else messages |
| class ResponsesAPIWrapper: | ||
| """Wrapper to convert Responses API format to Chat Completions format""" | ||
|
|
||
| def __init__(self, responses_obj): | ||
| self._responses = responses_obj | ||
| self._convert_to_chat_format() | ||
|
|
||
| def _convert_to_chat_format(self): | ||
| """Convert responses API output to chat completions choices format""" | ||
| if not hasattr(self._responses, "output"): | ||
| return | ||
|
|
||
| # Create a mock choices structure | ||
| class MockChoice: | ||
| def __init__(self): | ||
| self.message = type( | ||
| "obj", | ||
| (object,), | ||
| { | ||
| "content": None, | ||
| "tool_calls": None, | ||
| "reasoning_content": None, | ||
| "reasoning": None, | ||
| }, | ||
| )() | ||
| self.delta = type( | ||
| "obj", | ||
| (object,), | ||
| { | ||
| "content": None, | ||
| "function_call": None, | ||
| "reasoning_content": None, | ||
| "reasoning": None, | ||
| }, | ||
| )() | ||
| self.finish_reason = None | ||
|
|
||
| choice = MockChoice() | ||
|
|
||
| # Extract content from output items | ||
| for item in self._responses.output: | ||
| # Handle ResponseOutputMessage type | ||
| if hasattr(item, "type") and item.type == "message": | ||
| if hasattr(item, "content") and item.content: | ||
| # Extract text from content items | ||
| for content_item in item.content: | ||
| if hasattr(content_item, "text") and content_item.text: | ||
| choice.message.content = content_item.text | ||
| choice.delta.content = content_item.text | ||
| break | ||
| if choice.message.content: | ||
| break | ||
| # Fallback: direct text attribute | ||
| elif hasattr(item, "text") and item.text: | ||
| choice.message.content = item.text | ||
| choice.delta.content = item.text | ||
| break | ||
| # Fallback: dict format | ||
| elif isinstance(item, dict): | ||
| if "text" in item: | ||
| choice.message.content = item["text"] | ||
| choice.delta.content = item["text"] | ||
| break | ||
|
|
||
| self.choices = [choice] | ||
|
|
||
| # Copy other attributes | ||
| if hasattr(self._responses, "id"): | ||
| self.id = self._responses.id | ||
| if hasattr(self._responses, "usage"): | ||
| self.usage = self._responses.usage | ||
|
|
||
| def __getattr__(self, name): | ||
| """Fallback to original responses object for other attributes""" | ||
| return getattr(self._responses, name) | ||
|
|
||
|
|
||
| class StreamingResponsesAPIWrapper: | ||
| """Wrapper for streaming responses API to mimic chat completions stream""" | ||
|
|
||
| def __init__(self, responses_stream): | ||
| self._stream = responses_stream | ||
|
|
||
| def __iter__(self): | ||
| return self | ||
|
|
||
| def __next__(self): | ||
| event = next(self._stream) | ||
|
|
||
| # Wrap each event to look like chat completions format | ||
| class MockChoice: | ||
| def __init__(self): | ||
| self.delta = type( | ||
| "obj", | ||
| (object,), | ||
| { | ||
| "content": None, | ||
| "function_call": None, | ||
| "reasoning_content": None, | ||
| "reasoning": None, | ||
| }, | ||
| )() | ||
| self.finish_reason = None | ||
|
|
||
| class MockChunk: | ||
| def __init__(self): | ||
| self.choices = [MockChoice()] | ||
|
|
||
| mock_chunk = MockChunk() | ||
|
|
||
| # Handle Responses API event stream format | ||
| # Check for OUTPUT_TEXT_DELTA events (have delta attribute with text) | ||
| if hasattr(event, "delta") and event.delta: | ||
| mock_chunk.choices[0].delta.content = event.delta | ||
| # Fallback for other formats | ||
| elif hasattr(event, "output"): | ||
| for item in event.output: | ||
| # Handle ResponseOutputMessage type | ||
| if hasattr(item, "type") and item.type == "message": | ||
| if hasattr(item, "content") and item.content: | ||
| for content_item in item.content: | ||
| if hasattr(content_item, "text") and content_item.text: | ||
| mock_chunk.choices[0].delta.content = content_item.text | ||
| break | ||
| if mock_chunk.choices[0].delta.content: | ||
| break | ||
| # Fallback: direct text attribute | ||
| elif hasattr(item, "text"): | ||
| mock_chunk.choices[0].delta.content = item.text | ||
| break | ||
|
|
||
| return mock_chunk | ||
|
|
The new ResponsesAPIWrapper and StreamingResponsesAPIWrapper classes, as well as the responses API integration in send_completion and simple_send_with_retries, lack test coverage. Given that the repository has comprehensive test coverage for other model functionality (see tests/basic/test_models.py and tests/basic/test_sendchat.py), tests should be added to verify: (1) ResponsesAPIWrapper correctly converts responses API format to chat completions format, (2) StreamingResponsesAPIWrapper properly handles streaming events, (3) send_completion routes to the correct API based on wire_api setting, and (4) simple_send_with_retries correctly extracts responses from both API formats.
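A sketch of what one such test could look like. It assumes ResponsesAPIWrapper is importable from aider.models (the file this PR modifies) and uses SimpleNamespace stand-ins for the Responses API payload shape described in the commits (output -> message -> content[0] -> text):

```python
from types import SimpleNamespace

from aider.models import ResponsesAPIWrapper  # assumed import path from this PR


def test_responses_wrapper_exposes_choices():
    # Stand-in for a Responses API result with a nested message/content structure
    content_item = SimpleNamespace(text="hello from responses API")
    message_item = SimpleNamespace(type="message", content=[content_item])
    fake_response = SimpleNamespace(output=[message_item], id="resp_123")

    wrapped = ResponsesAPIWrapper(fake_response)

    # The wrapper should present a uniform chat-completions-style interface
    assert wrapped.choices
    assert wrapped.choices[0].message.content == "hello from responses API"
```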
| remove_reasoning: Optional[str] = None # Deprecated alias for reasoning_tag | ||
| system_prompt_prefix: Optional[str] = None | ||
| accepts_settings: Optional[list] = None | ||
| wire_api: str = "chat" # "chat" for Chat Completions API, "responses" for Responses API |
The new AIDER_WIRE_API environment variable and wire_api model setting are not documented. Users need to know: (1) the purpose of this setting, (2) valid values ("chat" or "responses"), (3) when to use each option, and (4) that the environment variable overrides the model setting. Consider adding documentation in the appropriate user-facing documentation files or adding docstring documentation to the ModelSettings class explaining the wire_api field.
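For example, the documentation could show the environment variable (`export AIDER_WIRE_API="responses"`, which overrides the per-model setting) alongside a per-model entry; the snippet below is illustrative, assuming aider's existing .aider.model.settings.yml file and the wire_api field added in this PR:

```yaml
# .aider.model.settings.yml -- illustrative entry using this PR's wire_api field
- name: gpt-5.1-codex
  wire_api: responses   # valid values: "chat" (default) or "responses"
```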
requirements/requirements.in
| pypandoc | ||
| litellm | ||
| fastapi # Required by litellm.responses() API | ||
| orjson # Required by litellm for fast JSON serialization |
The comment states that orjson is "Required by litellm for fast JSON serialization", but this is not entirely accurate. While orjson may be used by litellm if available, it's not clear if it's strictly required or just optional. The comment should clarify whether this is a hard requirement for litellm.responses() support or if it's optional. If it's optional, the comment should indicate what functionality is lost without it.
| orjson # Required by litellm for fast JSON serialization | |
| orjson # Optional: used by litellm for faster JSON serialization; without it, standard json is used and may be slower |
- Always initialize self.choices in ResponsesAPIWrapper for consistent interface
- Concatenate multiple system messages instead of dropping them
- Simplify simple_send_with_retries to use unified choices interface
- Add finish_reason extraction in StreamingResponsesAPIWrapper for truncation detection
- Add wire_api value validation with warning for invalid values
- Initialize content to empty string instead of None
- Remove delta.content from non-streaming wrapper (only for streaming)
- Properly handle StopIteration in streaming wrapper
- Improve dependency comments

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Force-pushed from 13d0fe8 to 0a4044d
Summary
This PR adds support for OpenAI's Responses API alongside the existing Chat Completions API, enabling aider to work with newer models like gpt-5.1-codex and gpt-5.2-codex that only support the Responses API.

Motivation
Several issues (#4039, #4707) report that Codex and GPT-5 models fail with parameter errors when using aider. The root cause is that these models only support the Responses API (/v1/responses), not the Chat Completions API (/v1/chat/completions).

Changes
Core Implementation
- Added wire_api field to ModelSettings (default: "chat")
- Created ResponsesAPIWrapper to convert Responses API format to Chat Completions format for compatibility
  - Handles ResponseOutputMessage objects with nested content arrays
  - Extracts text from content[].text fields
- Added StreamingResponsesAPIWrapper for streaming responses
  - Handles OUTPUT_TEXT_DELTA events with a delta attribute
- Modified send_completion() to route requests based on the AIDER_WIRE_API environment variable
- Updated simple_send_with_retries() to handle both API formats

Dependencies
- Fixed grpcio version conflict: 1.76.0 → 1.67.0
  - Required by litellm 1.80.10, which requires grpcio<1.68.0
  - Needed for litellm.responses() API support
- Added fastapi: Required by the litellm.responses() API
- Added orjson: Required by litellm for fast JSON serialization

Streaming Fix (commit 4ee8c9b)
The initial implementation incorrectly assumed streaming responses would have a simple chunk.output[].text structure. Testing revealed that the Responses API uses an event-based streaming format:
- Events of type OUTPUT_TEXT_DELTA contain a delta attribute with the text
- The wrapper now checks event.delta first, then falls back to the nested output structure

Usage
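(The usage commands, as given in the commit message at the top of this PR:)

```bash
export AIDER_WIRE_API="responses"  # For Codex models
aider --model gpt-5.1-codex
```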
Testing
Tested successfully with:
- gpt-5.1-codex (via https://api.chatanywhere.org/v1)

Fixes
- #4039
- #4707
Notes
- The implementation uses litellm.responses(), which was added in recent litellm versions
- The environment variable approach (AIDER_WIRE_API) allows flexible switching between API types
- wire_api can also be set per-model in YAML configuration

Known Issue (Not in Scope)
During testing in git repositories, we encountered a tree-sitter compatibility issue (AttributeError: 'tree_sitter.Query' object has no attribute 'captures' at repomap.py:289). This is a separate issue affecting the main branch with tree-sitter 0.25.x that occurs during initial repo scanning. It is already being addressed in PR #4369 and is not related to the Responses API changes in this PR.

CLA
I have read and agree to the Individual Contributor License Agreement.