
feat: Add support for OpenAI Responses API#4791

Open
wzh4464 wants to merge 5 commits into Aider-AI:main from wzh4464:add-responses-api-support

Conversation

wzh4464 commented Jan 23, 2026

Summary

This PR adds support for OpenAI's Responses API alongside the existing Chat Completions API, enabling aider to work with newer models like gpt-5.1-codex and gpt-5.2-codex that only support the Responses API.

Motivation

Several issues (#4039, #4707) report that Codex and GPT-5 models fail with parameter errors when using aider. The root cause is that these models only support the Responses API (/v1/responses), not the Chat Completions API (/v1/chat/completions).

Changes

Core Implementation

  • Added wire_api field to ModelSettings (default: "chat")
  • Created ResponsesAPIWrapper to convert Responses API format to Chat Completions format for compatibility
    • Handles ResponseOutputMessage objects with nested content arrays
    • Extracts text from content[].text fields
  • Created StreamingResponsesAPIWrapper for streaming responses
    • Fixed in commit 4ee8c9b: Updated to handle Responses API event-based streaming format
    • Correctly processes OUTPUT_TEXT_DELTA events with delta attribute
  • Modified send_completion() to route requests based on the AIDER_WIRE_API environment variable (see the sketch after this list)
  • Updated simple_send_with_retries() to handle both API formats
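
A minimal sketch of the wire_api routing described above. This is not the PR's exact code: the helper name and the message splitting are illustrative, the precedence (environment variable over per-model setting) is assumed, and the exact litellm.responses() parameters should be checked against the litellm version in use.

import os

import litellm


def send_completion_sketch(model_name, messages, stream=False, **kwargs):
    # Assumed precedence: the env var overrides any per-model wire_api setting.
    wire_api = os.environ.get("AIDER_WIRE_API", "chat")

    if wire_api == "responses":
        # The Responses API takes `input` and `instructions` rather than `messages`.
        system = [m for m in messages if m.get("role") == "system"]
        others = [m for m in messages if m.get("role") != "system"]
        if system:
            kwargs["instructions"] = "\n\n".join(
                str(m.get("content", "")) for m in system if m.get("content")
            )
        return litellm.responses(model=model_name, input=others, stream=stream, **kwargs)

    # Default: Chat Completions, unchanged from existing behavior.
    return litellm.completion(model=model_name, messages=messages, stream=stream, **kwargs)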

Dependencies

  • Fixed grpcio version conflict: 1.76.0 → 1.67.0
    • Resolves incompatibility with litellm 1.80.10 which requires grpcio<1.68.0
    • The newer litellm version is needed for litellm.responses() API support
  • Added fastapi: Required by litellm.responses() API
  • Added orjson: Required by litellm for fast JSON serialization

Streaming Fix (commit 4ee8c9b)

The initial implementation incorrectly assumed streaming responses would have a simple chunk.output[].text structure. Testing revealed that Responses API uses an event-based streaming format:

  • Events like OUTPUT_TEXT_DELTA contain a delta attribute with the text
  • The fix checks for event.delta first, then falls back to the nested output structure (sketched after this list)
  • This resolves "Empty response received from LLM" errors in streaming mode
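
A minimal sketch of the delta-first extraction described in this list. The delta and output attribute names follow the PR description; the function name and fallback details are illustrative.

def extract_text_from_event(event):
    # OUTPUT_TEXT_DELTA events carry the incremental text in `delta`.
    delta = getattr(event, "delta", None)
    if delta:
        return delta

    # Fallback: the nested output -> message -> content[].text shape.
    for item in getattr(event, "output", None) or []:
        for part in getattr(item, "content", None) or []:
            text = getattr(part, "text", None)
            if text:
                return text
        text = getattr(item, "text", None)
        if text:
            return text
    return None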

Usage

# Use Responses API for Codex models
export AIDER_WIRE_API="responses"
aider --model gpt-5.1-codex

# Or use Chat Completions API (default)
export AIDER_WIRE_API="chat"
aider --model gpt-4

Testing

Tested successfully with:

  • Model: gpt-5.1-codex
  • API: ChatAnywhere (https://api.chatanywhere.org/v1)
  • Test cases:
    • Non-streaming: Adding text to files ✅
    • Streaming: Real-time code generation ✅
    • Git repo with repo-map: Architecture analysis ✅
  • Code formatting: Passed pre-commit hooks (isort, black, flake8, codespell) ✅

Fixes

  • Fixes #4039 (Codex-mini temperature parameter issue)
  • Fixes #4707 (gpt-5.2-chat temperature parameter issue)

Notes

  • The implementation uses litellm.responses() which was added in recent litellm versions
  • Responses are wrapped to maintain compatibility with existing code expecting Chat Completions format
  • Environment variable (AIDER_WIRE_API) allows flexible switching between API types
  • Model settings can also specify wire_api per-model in YAML configuration

Known Issue (Not in Scope)

During testing in git repositories, we encountered a tree-sitter compatibility issue (AttributeError: 'tree_sitter.Query' object has no attribute 'captures' at repomap.py:289). This is a separate issue affecting the main branch with tree-sitter 0.25.x that occurs during initial repo scanning. It is already being addressed in PR #4369 and is not related to the Responses API changes in this PR.

CLA

I have read and agree to the Individual Contributor License Agreement.

This commit adds support for OpenAI's Responses API alongside the existing
Chat Completions API, allowing aider to work with newer models like
gpt-5.1-codex and gpt-5.2-codex that only support the Responses API.

Key changes:
- Added `wire_api` field to ModelSettings (default: "chat")
- Created ResponsesAPIWrapper to convert Responses API format to Chat Completions format
- Added StreamingResponsesAPIWrapper for streaming responses
- Modified send_completion() to route requests based on AIDER_WIRE_API env var
- Updated simple_send_with_retries() to handle both API formats
- Fixed grpcio version conflict (1.76.0 -> 1.67.0) for litellm compatibility

Usage:
  export AIDER_WIRE_API="responses"  # For Codex models
  aider --model gpt-5.1-codex

Fixes Aider-AI#4039 (Codex-mini temperature parameter issue)
Fixes Aider-AI#4707 (gpt-5.2-chat temperature parameter issue)
Related to Aider-AI#4591 (gpt-5-codex vision support)

The implementation uses litellm.responses() which was added in recent
litellm versions to support the new Responses API endpoint.
Copilot AI review requested due to automatic review settings January 23, 2026 06:45
CLAassistant commented Jan 23, 2026

CLA assistant check
All committers have signed the CLA.

Copilot AI left a comment

Pull request overview

This PR adds support for OpenAI's Responses API (/v1/responses) to enable compatibility with newer models like gpt-5.1-codex and gpt-5.2-codex that only support this API format, addressing issues #4039 and #4707 where these models failed with parameter errors.

Changes:

  • Introduced ResponsesAPIWrapper and StreamingResponsesAPIWrapper classes to convert Responses API format to Chat Completions format for compatibility
  • Modified send_completion() to route requests to either litellm.responses() or litellm.completion() based on AIDER_WIRE_API environment variable or model's wire_api setting
  • Updated simple_send_with_retries() to handle both API response formats
  • Downgraded grpcio from 1.76.0 to 1.67.0 to resolve compatibility issue with litellm 1.80.10

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated 8 comments.

File / Description:

  • requirements.txt: Downgraded grpcio from 1.76.0 to 1.67.0 for litellm compatibility
  • aider/models.py: Added ResponsesAPIWrapper classes; modified send_completion() and simple_send_with_retries() to support dual API routing via the wire_api setting


aider/models.py Outdated
Comment on lines 1165 to 1187
# Handle both chat completions and responses API formats
wire_api = os.environ.get("AIDER_WIRE_API", getattr(self, "wire_api", "chat"))

if wire_api == "responses":
    # Responses API format: has 'output' instead of 'choices'
    if not hasattr(response, "output") or not response.output:
        return None
    # Extract text content from output items
    for item in response.output:
        if hasattr(item, "text") and item.text:
            res = item.text
            break
        elif isinstance(item, dict) and "text" in item:
            res = item["text"]
            break
    else:
        return None
else:
    # Chat completions API format: has 'choices'
    if not hasattr(response, "choices") or not response.choices:
        return None
    res = response.choices[0].message.content

Copilot AI Jan 23, 2026

The response handling logic in simple_send_with_retries is inconsistent with the wrapper design. In send_completion, the responses API result is wrapped with ResponsesAPIWrapper which converts it to have a choices attribute (lines 1133-1134). However, this function duplicates the conversion logic by checking wire_api == "responses" and directly accessing the raw output attribute (lines 1168-1181).

Since the wrapper already provides a uniform choices interface, this duplication is unnecessary and could lead to inconsistent behavior if the wrapper logic changes. The code should simply use response.choices[0].message.content for both API types, as the wrapper ensures compatibility.

Suggested change (replace the wire_api branching quoted above with the wrapper's unified interface):

# Use the unified choices interface provided by send_completion / ResponsesAPIWrapper
if not hasattr(response, "choices") or not response.choices:
    return None
res = response.choices[0].message.content

aider/models.py Outdated
other_messages = [msg for msg in messages if msg.get("role") != "system"]

if system_messages:
    kwargs["instructions"] = system_messages[0].get("content", "")

Copilot AI Jan 23, 2026

When converting messages for the responses API, only the first system message is used (line 1123). If there are multiple system messages in the conversation, all subsequent ones are silently dropped. This could lead to loss of important context.

Consider either concatenating all system messages or documenting this limitation. Alternatively, raise a warning if multiple system messages are detected and being dropped.

Suggested change (concatenate all system messages instead of keeping only the first):

# Concatenate all system message contents to avoid dropping context
instructions_parts = [
    str(msg.get("content", ""))
    for msg in system_messages
    if msg.get("content")
]
if instructions_parts:
    kwargs["instructions"] = "\n\n".join(instructions_parts)

aider/models.py Outdated
Comment on lines 73 to 82
for item in self._responses.output:
    if hasattr(item, "text") and item.text:
        choice.message.content = item.text
        choice.delta.content = item.text
        break
    elif isinstance(item, dict):
        if "text" in item:
            choice.message.content = item["text"]
            choice.delta.content = item["text"]
            break

Copilot AI Jan 23, 2026

The wrapper only extracts the first text item from the response output (using break at line 77 or 82). If the responses API returns multiple output items, subsequent items are silently ignored. This could lead to incomplete responses if the API returns content in multiple parts.

Consider either concatenating all text items or documenting why only the first item is used.
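
A minimal sketch of the concatenation alternative suggested here; this is illustrative only and not code from the PR.

def collect_output_text(output_items):
    # Gather every text fragment instead of stopping at the first one.
    parts = []
    for item in output_items:
        text = item.get("text") if isinstance(item, dict) else getattr(item, "text", None)
        if text:
            parts.append(text)
    return "".join(parts) if parts else None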

if hasattr(item, "text"):
    mock_chunk.choices[0].delta.content = item.text
    break


Copilot AI Jan 23, 2026

The StreamingResponsesAPIWrapper initializes finish_reason to None but never updates it based on the chunk data. The code in base_coder.py checks for finish_reason == "length" to detect when the response is truncated (lines 1907-1911 in base_coder.py). Without populating this field from the responses API chunks, this check will never work properly for responses API streaming.

Consider extracting and setting the finish_reason from the chunk if it's available.

Suggested change (propagate finish_reason onto the mock chunk):

# Propagate finish_reason from the underlying chunk if available
finish_reason = None
# Check for finish_reason on delta first (most similar to chat completions)
if hasattr(chunk, "delta") and hasattr(chunk.delta, "finish_reason"):
    finish_reason = chunk.delta.finish_reason
# Fallback: finish_reason directly on the chunk
elif hasattr(chunk, "finish_reason"):
    finish_reason = chunk.finish_reason
# Fallback: look for finish_reason on output items
elif hasattr(chunk, "output"):
    for item in chunk.output:
        if hasattr(item, "finish_reason"):
            finish_reason = item.finish_reason
            break
        if isinstance(item, dict) and "finish_reason" in item:
            finish_reason = item["finish_reason"]
            break
mock_chunk.choices[0].finish_reason = finish_reason

aider/models.py Outdated
kwargs["instructions"] = system_messages[0].get("content", "")

# For responses API, input can be a list of messages
kwargs["input"] = other_messages if other_messages else messages

Copilot AI Jan 23, 2026

When no system messages exist, the fallback kwargs["input"] = other_messages if other_messages else messages could result in passing the same messages list that might contain system messages. This happens because:

  1. If other_messages is empty (all messages were system), it falls back to messages which includes the system messages
  2. This could cause the responses API to receive system messages in the input field even though they were filtered out

Consider using kwargs["input"] = other_messages unconditionally, or handle the empty case explicitly to avoid confusion.

Suggested change:

kwargs["input"] = other_messages

aider/models.py Outdated
Comment on lines 72 to 82
# Extract content from output items
for item in self._responses.output:
    if hasattr(item, "text") and item.text:
        choice.message.content = item.text
        choice.delta.content = item.text
        break
    elif isinstance(item, dict):
        if "text" in item:
            choice.message.content = item["text"]
            choice.delta.content = item["text"]
            break

Copilot AI Jan 23, 2026

The wrapper initializes tool_calls in the mock message object but never populates it from the responses API output (line 53). However, the code in send_completion can add function tools to the request (lines 1086-1089). If the responses API supports function calling and returns tool call data, it won't be extracted, potentially breaking function calling for responses API.

Verify whether the responses API supports function/tool calling, and if so, extract and populate the tool_calls field from the response output.

Suggested change (also extract tool_calls from the output items):

# Extract content and tool_calls from output items
for item in self._responses.output:
    # Try to extract tool_calls if present on this item
    if getattr(choice.message, "tool_calls", None) is None:
        # Attribute-style tool_calls/tool_call
        if hasattr(item, "tool_calls") and getattr(item, "tool_calls"):
            choice.message.tool_calls = getattr(item, "tool_calls")
        elif hasattr(item, "tool_call") and getattr(item, "tool_call"):
            # Normalize single tool_call to a list for compatibility
            single_call = getattr(item, "tool_call")
            choice.message.tool_calls = (
                single_call if isinstance(single_call, list) else [single_call]
            )
        # Dict-style tool_calls/tool_call
        elif isinstance(item, dict):
            if "tool_calls" in item and item["tool_calls"]:
                choice.message.tool_calls = item["tool_calls"]
            elif "tool_call" in item and item["tool_call"]:
                single_call = item["tool_call"]
                choice.message.tool_calls = (
                    single_call if isinstance(single_call, list) else [single_call]
                )
    # Extract text content, preserving existing behavior
    if hasattr(item, "text") and getattr(item, "text", None):
        choice.message.content = item.text
        choice.delta.content = item.text
        break
    elif isinstance(item, dict) and "text" in item and item["text"]:
        choice.message.content = item["text"]
        choice.delta.content = item["text"]
        break

wzh4464 and others added 3 commits January 23, 2026 15:02
- Added fastapi==0.128.0 (required by litellm.responses())
- Added orjson==3.11.5 (required by litellm for fast JSON serialization)
- Maintained grpcio==1.67.0 for litellm 1.80.10 compatibility
- Updated related dependency constraints

These dependencies are necessary to avoid runtime import errors when using
the Responses API functionality.
The OpenAI Responses API returns ResponseOutputMessage objects with a nested
structure: output -> message -> content[0] -> text

Updated both ResponsesAPIWrapper and StreamingResponsesAPIWrapper to:
1. Check for message type items in output
2. Extract text from nested content items
3. Include fallback for simpler response formats

This fixes the 'Empty response received from LLM' error when using
Responses API with Codex models.
The Responses API uses event-based streaming with OUTPUT_TEXT_DELTA
events that have a 'delta' attribute containing the text, rather than
nested output structures. Updated StreamingResponsesAPIWrapper to
check for event.delta first.

This fixes the "Empty response received from LLM" error when using
streaming mode with gpt-5.1-codex and gpt-5.2-codex models.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 13 comments.



aider/models.py Outdated
Comment on lines 1190 to 1212
# Handle both chat completions and responses API formats
wire_api = os.environ.get("AIDER_WIRE_API", getattr(self, "wire_api", "chat"))

if wire_api == "responses":
    # Responses API format: has 'output' instead of 'choices'
    if not hasattr(response, "output") or not response.output:
        return None
    # Extract text content from output items
    for item in response.output:
        if hasattr(item, "text") and item.text:
            res = item.text
            break
        elif isinstance(item, dict) and "text" in item:
            res = item["text"]
            break
    else:
        return None
else:
    # Chat completions API format: has 'choices'
    if not hasattr(response, "choices") or not response.choices:
        return None
    res = response.choices[0].message.content


Copilot AI Jan 23, 2026

The ResponsesAPIWrapper creates a hybrid object that has both a choices attribute (chat completions format) and falls back to the original response object for other attributes like output via getattr. This means wrapped responses support both interfaces. In simple_send_with_retries, when wire_api is "responses", the code checks response.output - which works via getattr but is confusing because send_completion already wrapped it with choices. This dual-interface approach makes the code harder to understand and maintain. Consider either: (1) using only the choices interface and having simple_send_with_retries always use response.choices regardless of wire_api, or (2) not wrapping at all and handling the format conversion at the point of consumption.

Suggested change (rely on the normalized chat-completions-style interface for both API types):

# Use the normalized chat-completions-style interface
if not hasattr(response, "choices") or not response.choices:
    return None
res = response.choices[0].message.content

aider/models.py Outdated
Comment on lines 40 to 43
def _convert_to_chat_format(self):
    """Convert responses API output to chat completions choices format"""
    if not hasattr(self._responses, "output"):
        return

Copilot AI Jan 23, 2026

If the response object doesn't have an "output" attribute, _convert_to_chat_format returns early without initializing self.choices. This means ResponsesAPIWrapper instances might not have a choices attribute, which will cause AttributeErrors when the code tries to access response.choices elsewhere (e.g., in base_coder.py line 1850). The wrapper should always initialize self.choices to at least an empty list to ensure a consistent interface.
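
A minimal, self-contained sketch of the fix this comment describes; the class name is illustrative, and only the choices/output attribute names come from the excerpt above.

class ChoicesAlwaysPresentSketch:
    """Illustrative wrapper showing choices initialized unconditionally."""

    def __init__(self, responses_obj):
        self._responses = responses_obj
        self.choices = []  # guarantee the attribute exists even when there is no output
        self._convert_to_chat_format()

    def _convert_to_chat_format(self):
        if not hasattr(self._responses, "output"):
            return  # choices stays [] instead of being left undefined
        # ...extraction of output items would populate self.choices here...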

Comment on lines 70 to 97
choice = MockChoice()

# Extract content from output items
for item in self._responses.output:
    # Handle ResponseOutputMessage type
    if hasattr(item, "type") and item.type == "message":
        if hasattr(item, "content") and item.content:
            # Extract text from content items
            for content_item in item.content:
                if hasattr(content_item, "text") and content_item.text:
                    choice.message.content = content_item.text
                    choice.delta.content = content_item.text
                    break
            if choice.message.content:
                break
    # Fallback: direct text attribute
    elif hasattr(item, "text") and item.text:
        choice.message.content = item.text
        choice.delta.content = item.text
        break
    # Fallback: dict format
    elif isinstance(item, dict):
        if "text" in item:
            choice.message.content = item["text"]
            choice.delta.content = item["text"]
            break

self.choices = [choice]

Copilot AI Jan 23, 2026

The wrapper creates mock objects with content initialized to None, but doesn't validate that content was successfully extracted before creating the choices list. If no valid content is found in any output item (all branches fail or items are empty), choice.message.content and choice.delta.content will remain None. While this might be acceptable for some edge cases, it could lead to unexpected None values propagating through the codebase. Consider initializing content to an empty string instead of None for better consistency with the chat completions API behavior.

self.github_copilot_token_to_open_ai_key(kwargs["extra_headers"])

res = litellm.completion(**kwargs)
# Use responses API or chat completions API based on wire_api setting

Copilot AI Jan 23, 2026

The code doesn't validate the wire_api value. If AIDER_WIRE_API or the model's wire_api setting contains an invalid value (e.g., "foo"), the code will silently default to using the chat completions API because the if condition on line 1141 only checks for "responses". This could lead to confusion if users mistype the value. Consider adding validation to raise a clear error for invalid wire_api values, or at least log a warning when an unrecognized value is encountered.

Suggested change (warn when wire_api has an unrecognized value):

# Use responses API or chat completions API based on wire_api setting
# Validate wire_api to catch misconfigurations (e.g., typos)
valid_wire_api_values = {None, "", "responses"}
if wire_api not in valid_wire_api_values:
    sys.stderr.write(
        f"Warning: Unrecognized wire_api value '{wire_api}'. "
        "Falling back to chat completions API.\n"
    )

aider/models.py Outdated
Comment on lines 81 to 94
                choice.delta.content = content_item.text
                break
        if choice.message.content:
            break
# Fallback: direct text attribute
elif hasattr(item, "text") and item.text:
    choice.message.content = item.text
    choice.delta.content = item.text
    break
# Fallback: dict format
elif isinstance(item, dict):
    if "text" in item:
        choice.message.content = item["text"]
        choice.delta.content = item["text"]

Copilot AI Jan 23, 2026

In ResponsesAPIWrapper, both message.content and delta.content are set to the same value (lines 80-81, 87-88, 93-94). The delta attribute is typically used for streaming responses, while message is for non-streaming. Setting both for a non-streaming response wrapper is unconventional and could cause confusion. For non-streaming responses, only message.content should be set, with delta.content remaining None or unset to match the standard chat completions API behavior.

Suggested change (keep only message.content in the non-streaming wrapper, dropping the delta.content assignments):

                break
        if choice.message.content:
            break
# Fallback: direct text attribute
elif hasattr(item, "text") and item.text:
    choice.message.content = item.text
    break
# Fallback: dict format
elif isinstance(item, dict):
    if "text" in item:
        choice.message.content = item["text"]

Comment on lines 119 to 164
def __next__(self):
    event = next(self._stream)

    # Wrap each event to look like chat completions format
    class MockChoice:
        def __init__(self):
            self.delta = type(
                "obj",
                (object,),
                {
                    "content": None,
                    "function_call": None,
                    "reasoning_content": None,
                    "reasoning": None,
                },
            )()
            self.finish_reason = None

    class MockChunk:
        def __init__(self):
            self.choices = [MockChoice()]

    mock_chunk = MockChunk()

    # Handle Responses API event stream format
    # Check for OUTPUT_TEXT_DELTA events (have delta attribute with text)
    if hasattr(event, "delta") and event.delta:
        mock_chunk.choices[0].delta.content = event.delta
    # Fallback for other formats
    elif hasattr(event, "output"):
        for item in event.output:
            # Handle ResponseOutputMessage type
            if hasattr(item, "type") and item.type == "message":
                if hasattr(item, "content") and item.content:
                    for content_item in item.content:
                        if hasattr(content_item, "text") and content_item.text:
                            mock_chunk.choices[0].delta.content = content_item.text
                            break
                    if mock_chunk.choices[0].delta.content:
                        break
            # Fallback: direct text attribute
            elif hasattr(item, "text"):
                mock_chunk.choices[0].delta.content = item.text
                break

    return mock_chunk

Copilot AI Jan 23, 2026

If an event in the streaming response has neither a "delta" attribute nor an "output" attribute, mock_chunk.choices[0].delta.content will remain None. While this might be intentional for some event types, the code doesn't handle StopIteration when the underlying stream ends. The next method should properly handle stream termination and raise StopIteration when appropriate to prevent infinite loops or unexpected behavior in consuming code.
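
A minimal sketch of explicit stream-termination handling. Note that the wrapper's existing next(self._stream) call already propagates StopIteration naturally; this sketch only makes that assumption about the underlying litellm stream explicit, and the class name is illustrative.

class StreamTerminationSketch:
    """Illustrative iterator wrapper; shows only how exhaustion is surfaced."""

    def __init__(self, responses_stream):
        self._stream = responses_stream

    def __iter__(self):
        return self

    def __next__(self):
        try:
            event = next(self._stream)
        except StopIteration:
            # Propagate termination so `for chunk in wrapper:` loops end cleanly.
            raise
        return event  # the real wrapper would build a mock chat chunk from the event here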

aider/models.py Outdated
kwargs["instructions"] = system_messages[0].get("content", "")

# For responses API, input can be a list of messages
kwargs["input"] = other_messages if other_messages else messages

Copilot AI Jan 23, 2026

When there are no non-system messages, the code sets kwargs["input"] = messages (which includes system messages). However, if system messages were found, they're already extracted into kwargs["instructions"], resulting in system messages being sent twice - once in "instructions" and once in "input". The logic should ensure system messages are excluded from "input" when they've been extracted to "instructions".

Suggested change (avoid sending system messages twice):

# Avoid sending system messages twice: once as instructions and again as input.
if system_messages:
    # When system messages are present, only send non-system messages as input
    kwargs["input"] = other_messages
else:
    # When there are no system messages, preserve existing behavior
    kwargs["input"] = other_messages if other_messages else messages

Comment on lines 33 to 165
class ResponsesAPIWrapper:
    """Wrapper to convert Responses API format to Chat Completions format"""

    def __init__(self, responses_obj):
        self._responses = responses_obj
        self._convert_to_chat_format()

    def _convert_to_chat_format(self):
        """Convert responses API output to chat completions choices format"""
        if not hasattr(self._responses, "output"):
            return

        # Create a mock choices structure
        class MockChoice:
            def __init__(self):
                self.message = type(
                    "obj",
                    (object,),
                    {
                        "content": None,
                        "tool_calls": None,
                        "reasoning_content": None,
                        "reasoning": None,
                    },
                )()
                self.delta = type(
                    "obj",
                    (object,),
                    {
                        "content": None,
                        "function_call": None,
                        "reasoning_content": None,
                        "reasoning": None,
                    },
                )()
                self.finish_reason = None

        choice = MockChoice()

        # Extract content from output items
        for item in self._responses.output:
            # Handle ResponseOutputMessage type
            if hasattr(item, "type") and item.type == "message":
                if hasattr(item, "content") and item.content:
                    # Extract text from content items
                    for content_item in item.content:
                        if hasattr(content_item, "text") and content_item.text:
                            choice.message.content = content_item.text
                            choice.delta.content = content_item.text
                            break
                    if choice.message.content:
                        break
            # Fallback: direct text attribute
            elif hasattr(item, "text") and item.text:
                choice.message.content = item.text
                choice.delta.content = item.text
                break
            # Fallback: dict format
            elif isinstance(item, dict):
                if "text" in item:
                    choice.message.content = item["text"]
                    choice.delta.content = item["text"]
                    break

        self.choices = [choice]

        # Copy other attributes
        if hasattr(self._responses, "id"):
            self.id = self._responses.id
        if hasattr(self._responses, "usage"):
            self.usage = self._responses.usage

    def __getattr__(self, name):
        """Fallback to original responses object for other attributes"""
        return getattr(self._responses, name)


class StreamingResponsesAPIWrapper:
    """Wrapper for streaming responses API to mimic chat completions stream"""

    def __init__(self, responses_stream):
        self._stream = responses_stream

    def __iter__(self):
        return self

    def __next__(self):
        event = next(self._stream)

        # Wrap each event to look like chat completions format
        class MockChoice:
            def __init__(self):
                self.delta = type(
                    "obj",
                    (object,),
                    {
                        "content": None,
                        "function_call": None,
                        "reasoning_content": None,
                        "reasoning": None,
                    },
                )()
                self.finish_reason = None

        class MockChunk:
            def __init__(self):
                self.choices = [MockChoice()]

        mock_chunk = MockChunk()

        # Handle Responses API event stream format
        # Check for OUTPUT_TEXT_DELTA events (have delta attribute with text)
        if hasattr(event, "delta") and event.delta:
            mock_chunk.choices[0].delta.content = event.delta
        # Fallback for other formats
        elif hasattr(event, "output"):
            for item in event.output:
                # Handle ResponseOutputMessage type
                if hasattr(item, "type") and item.type == "message":
                    if hasattr(item, "content") and item.content:
                        for content_item in item.content:
                            if hasattr(content_item, "text") and content_item.text:
                                mock_chunk.choices[0].delta.content = content_item.text
                                break
                        if mock_chunk.choices[0].delta.content:
                            break
                # Fallback: direct text attribute
                elif hasattr(item, "text"):
                    mock_chunk.choices[0].delta.content = item.text
                    break

        return mock_chunk


Copilot AI Jan 23, 2026

The new ResponsesAPIWrapper and StreamingResponsesAPIWrapper classes, as well as the responses API integration in send_completion and simple_send_with_retries, lack test coverage. Given that the repository has comprehensive test coverage for other model functionality (see tests/basic/test_models.py and tests/basic/test_sendchat.py), tests should be added to verify: (1) ResponsesAPIWrapper correctly converts responses API format to chat completions format, (2) StreamingResponsesAPIWrapper properly handles streaming events, (3) send_completion routes to the correct API based on wire_api setting, and (4) simple_send_with_retries correctly extracts responses from both API formats.
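
A minimal sketch of the kind of unit test this comment asks for, assuming ResponsesAPIWrapper is importable from aider.models and accepts any object exposing an output list; the fake objects below are illustrative.

from types import SimpleNamespace

from aider.models import ResponsesAPIWrapper  # assumed import path


def test_responses_wrapper_exposes_choices():
    # Fake a Responses API result shaped like output -> message -> content[0].text
    content = SimpleNamespace(text="hello from responses API")
    message = SimpleNamespace(type="message", content=[content])
    fake_response = SimpleNamespace(output=[message], id="resp_123")

    wrapped = ResponsesAPIWrapper(fake_response)

    assert wrapped.choices, "wrapper should always expose a choices list"
    assert wrapped.choices[0].message.content == "hello from responses API"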

remove_reasoning: Optional[str] = None # Deprecated alias for reasoning_tag
system_prompt_prefix: Optional[str] = None
accepts_settings: Optional[list] = None
wire_api: str = "chat" # "chat" for Chat Completions API, "responses" for Responses API

Copilot AI Jan 23, 2026

The new AIDER_WIRE_API environment variable and wire_api model setting are not documented. Users need to know: (1) the purpose of this setting, (2) valid values ("chat" or "responses"), (3) when to use each option, and (4) that the environment variable overrides the model setting. Consider adding documentation in the appropriate user-facing documentation files or adding docstring documentation to the ModelSettings class explaining the wire_api field.
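
A minimal sketch of the field-level documentation this comment asks for; the dataclass is trimmed to the new field, and the docstring wording is an assumption rather than text from the PR.

from dataclasses import dataclass


@dataclass
class ModelSettingsDocSketch:
    wire_api: str = "chat"
    """Which OpenAI wire protocol to use for this model.

    Valid values are "chat" (requests go to /v1/chat/completions, the default) and
    "responses" (requests go to /v1/responses, needed by the gpt-5.x codex models).
    When set, the AIDER_WIRE_API environment variable overrides this field.
    """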

pypandoc
litellm
fastapi # Required by litellm.responses() API
orjson # Required by litellm for fast JSON serialization

Copilot AI Jan 23, 2026

The comment states that orjson is "Required by litellm for fast JSON serialization", but this is not entirely accurate. While orjson may be used by litellm if available, it's not clear if it's strictly required or just optional. The comment should clarify whether this is a hard requirement for litellm.responses() support or if it's optional. If it's optional, the comment should indicate what functionality is lost without it.

Suggested change:

orjson  # Optional: used by litellm for faster JSON serialization; without it, standard json is used and may be slower

- Always initialize self.choices in ResponsesAPIWrapper for consistent interface
- Concatenate multiple system messages instead of dropping them
- Simplify simple_send_with_retries to use unified choices interface
- Add finish_reason extraction in StreamingResponsesAPIWrapper for truncation detection
- Add wire_api value validation with warning for invalid values
- Initialize content to empty string instead of None
- Remove delta.content from non-streaming wrapper (only for streaming)
- Properly handle StopIteration in streaming wrapper
- Improve dependency comments

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
wzh4464 force-pushed the add-responses-api-support branch from 13d0fe8 to 0a4044d on January 23, 2026 at 08:14

Development

Successfully merging this pull request may close these issues:

  • Error when using azure/gpt-5.2-chat
  • Codex-mini don't work with litellm
