Merge from main #132

Merged
contextablemark merged 4 commits into Contextable:main from
ag-ui-protocol:main
Feb 2, 2026

Conversation

@contextablemark
Member

No description provided.

contextablemark and others added 4 commits January 31, 2026 18:18
…umable agents (#1011)

* feat(adk-middleware): robust streaming function call arguments support

Refactor the streaming function call dispatch in EventTranslator to
fix several correctness issues and add opt-in support for Gemini 3's
stream_function_call_arguments mode.

Closes #990

## Breaking Changes

None. All changes are backwards-compatible. The new
`streaming_function_call_arguments` parameter defaults to False.

## What changed

### Streaming FC dispatch split into two explicit modes (#990)

The `is_streaming_fc` condition previously used `func_call.name and
will_continue`, which was too broad: it also matched partial events that
should be skipped because no args have accumulated yet. The dispatch is
now split:

- **Mode A** (Gemini 3+ `stream_function_call_arguments`): only active
  when `streaming_function_call_arguments=True` is passed to
  EventTranslator/ADKAgent. Triggers on `partial_args`, first chunk
  (name + will_continue + no args), or end chunk.

- **Mode B** (accumulated args / progressive SSE): triggers on
  `has_args` with `will_continue`, existing tracking, or named FC in
  a partial event. This is the original behavior.
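
A minimal sketch of the two-mode dispatch described above, not the actual
EventTranslator code. `FunctionCallChunk` and `classify_streaming_fc` are
hypothetical names. Two points are assumptions: an "end chunk" is read as
an already-tracked call whose `will_continue` is now False, and the third
Mode B condition is read as requiring args, so that a first chunk without
args is skipped when the flag is off.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FunctionCallChunk:
    """Hypothetical stand-in for the fields named in the commit message."""
    name: Optional[str] = None
    has_args: bool = False          # accumulated args are present
    has_partial_args: bool = False  # Gemini 3 partial_args are present
    will_continue: bool = False
    event_is_partial: bool = False


def classify_streaming_fc(
    chunk: FunctionCallChunk, *, flag_enabled: bool, already_tracked: bool
) -> Optional[str]:
    """Return "A", "B", or None when the chunk is not a streaming FC."""
    named = bool(chunk.name)
    if flag_enabled:
        first_chunk = named and chunk.will_continue and not chunk.has_args
        # Assumption: the end chunk closes a call that is already tracked.
        end_chunk = already_tracked and not chunk.will_continue
        if chunk.has_partial_args or first_chunk or end_chunk:
            return "A"
    mode_b = (
        (chunk.has_args and chunk.will_continue)
        or already_tracked
        # Assumption: a named FC in a partial event only counts with args.
        or (named and chunk.has_args and chunk.event_is_partial)
    )
    return "B" if mode_b else None
```

With this reading, a name-only first chunk is dispatched to Mode A when
the flag is on and skipped when it is off.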

### Name-based dedup replaced with single-use tracking (#990)

`_completed_streaming_fc_names` (a permanent set) suppressed repeat
invocations of the same tool. Replaced with
`_last_completed_streaming_fc_name` (Optional[str]) that clears after
the non-partial event is filtered. TOOL_CALL_RESULT suppression uses
the same single-use mechanism with None guards.
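
The single-use behaviour can be sketched as follows. The class and method
names are hypothetical; only the `_last_completed_streaming_fc_name`
attribute name comes from this change. The sketch shows why a second call
to the same tool is no longer suppressed.

```python
from typing import Optional


class SingleUseDedup:
    """Illustrative tracker that suppresses exactly one confirmed event."""

    def __init__(self) -> None:
        self._last_completed_streaming_fc_name: Optional[str] = None

    def mark_streamed(self, name: str) -> None:
        # Called when the streaming path has finished emitting a tool call.
        self._last_completed_streaming_fc_name = name

    def should_filter_confirmed(self, name: Optional[str]) -> bool:
        last = self._last_completed_streaming_fc_name
        # None guards: nothing pending, or the confirmed event has no name.
        if last is None or name is None or name != last:
            return False
        # Clear after one use so a later call to the same tool passes.
        self._last_completed_streaming_fc_name = None
        return True
```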

### ADK aggregator workarounds for stream_function_call_arguments

The predictive_state_updates example includes monkey-patches for two
ADK bugs that prevent streaming FC args from working out of the box.
Filed upstream as google/adk-python#4311 —
workaround code is annotated so it can be removed when the fix ships.

### Examples bumped to google-adk>=1.23.0

The examples pyproject.toml now pins `google-adk>=1.23.0` to test
against the latest ADK. The library minimum remains `>=1.16.0`.

### Dojo e2e test comment updated

The predictive_state_updates e2e test remains skipped but the comment
now explains that the demo works without Vertex AI (falls back to
Gemini 2.5 Flash) and documents what credentials enable full streaming.

## Test results

530 passed, 33 skipped (full suite).
3 new tests in test_lro_filtering.py:
- test_mode_a_streaming_fc_with_flag_enabled
- test_mode_a_first_chunk_skipped_without_flag
- test_same_tool_called_twice_not_suppressed

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): remap confirmed FC ids to streaming ids in EventTranslator

With PROGRESSIVE_SSE_STREAMING (default since ADK 1.22.0), partial and
confirmed events for the same function call carry different ADK-generated
ids. The EventTranslator was emitting TOOL_CALL_START/END with the
partial id, but ToolCallResultEvent used the confirmed id. This caused
_start_new_execution's tool_call_id tracking to never match, so backend
tool results were never marked as processed — breaking replay filtering
(test_skip_summarization_replay_scenario).

Fix: when the confirmed (non-partial) event is filtered out (because
the streaming path already emitted it), record a mapping from the
confirmed id to the streaming id. _translate_function_response then
remaps the ToolCallResultEvent's tool_call_id to match.
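
A rough sketch of the remapping idea, assuming a plain dict keyed by the
confirmed id. `ToolCallIdRemapper` and its method names are illustrative
and are not taken from the middleware source.

```python
from typing import Dict


class ToolCallIdRemapper:
    """Illustrative confirmed-id to streaming-id lookup."""

    def __init__(self) -> None:
        self._confirmed_to_streaming: Dict[str, str] = {}

    def record_filtered_confirmed(self, confirmed_id: str, streaming_id: str) -> None:
        # Called when the confirmed (non-partial) event is filtered out
        # because the streaming path already emitted the tool call.
        self._confirmed_to_streaming[confirmed_id] = streaming_id

    def remap_result_id(self, tool_call_id: str) -> str:
        # Ids without a mapping pass through unchanged.
        return self._confirmed_to_streaming.get(tool_call_id, tool_call_id)
```

The result event then carries the same id that TOOL_CALL_START/END used,
so id-based tracking can match it.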

Also adds two live integration tests for streaming function call
arguments via Gemini 3 Pro Preview (skipped without Vertex AI creds):
- test_streaming_fc_emits_incremental_tool_call_args
- test_streaming_fc_tool_call_ids_consistent_across_result

Test results: 565 passed, 0 failed.

Addresses #990

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Updated files.json.

* fix(adk-middleware): prevent duplicate TOOL_CALL emission for client-side tools with ResumabilityConfig

With ResumabilityConfig, ADK emits client-side function calls from up to
three sources with potentially different IDs: the LRO event, a confirmed
non-partial event, and the ClientProxyTool execution. This caused the
frontend to render tool call results (e.g., HITL task lists) multiple times.

The fix introduces three layers of deduplication:
- client_tool_names: EventTranslator skips all function calls for tools
  owned by ClientProxyTool, regardless of ID
- client_emitted_tool_call_ids: shared set for ID-based dedup between
  proxy and translator
- translator.emitted_tool_call_ids: proxy skips if translator already
  emitted (fallback for non-resumable flows)
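
The three layers could look roughly like this. The function names are
hypothetical, and the sketch assumes that the proxy is the side that
records ids in the shared set while the translator only reads it; the
commit message does not spell that out.

```python
from typing import Set


def translator_should_emit(
    tool_name: str,
    tool_call_id: str,
    client_tool_names: Set[str],
    client_emitted_tool_call_ids: Set[str],
) -> bool:
    # Layer 1: skip every call for a tool owned by ClientProxyTool.
    if tool_name in client_tool_names:
        return False
    # Layer 2: skip ids the proxy has already emitted.
    if tool_call_id in client_emitted_tool_call_ids:
        return False
    return True


def proxy_should_emit(
    tool_call_id: str,
    translator_emitted_tool_call_ids: Set[str],
    client_emitted_tool_call_ids: Set[str],
) -> bool:
    # Layer 3: fallback for non-resumable flows where the translator won.
    if tool_call_id in translator_emitted_tool_call_ids:
        return False
    # Assumption: the proxy records its own emission in the shared set.
    client_emitted_tool_call_ids.add(tool_call_id)
    return True
```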

Adds 12 regression tests covering LRO, confirmed, partial, mixed tool
call scenarios, and the full end-to-end resumable HITL flow.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): clear invocation_id after completed HITL resume and flex step count

- Clear stored invocation_id after a resumable run completes successfully,
  preventing subsequent new runs from erroneously attempting HITL resumption
  with a stale ID (which produced no output)
- Update HITL example prompt to respect user-requested step count instead
  of always generating exactly 10 steps

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Updating files.json for latest agent changes.

* docs(adk-middleware): add ResumabilityConfig usage guide for HITL workflows

Document ADKAgent.from_app() with ResumabilityConfig for human-in-the-loop
workflows, including requirements (google-adk >= 1.16.0), how it works,
and a comparison table vs direct ADKAgent usage.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(adk-middleware): extract Gemini 3 workarounds into reusable module

Move thought-signature repair callback from the predictive_state_updates
example into src/ag_ui_adk/workarounds.py so the middleware auto-injects
it as a before_model_callback when streaming_function_call_arguments=True.

The aggregator patch (apply_aggregator_patch) is also extracted but NOT
auto-applied — it conflicts with the event translator's Mode A streaming.
The example still applies it explicitly when needed.

Also switches the example model from gemini-3-pro-preview to
gemini-3-flash-preview (required for stream_function_call_arguments).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore(adk-middleware): use gemini-3-flash-preview consistently

Update USAGE.md and integration test to reference gemini-3-flash-preview
instead of gemini-3-pro-preview.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): strengthen HITL instruction for disabled step handling

Clarify that disabled steps are permanently deleted from the plan so the
agent answers "No" when asked whether a disabled step is included.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore(dojo): update files.json for latest agent changes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): fix multi-turn HITL sessions producing empty responses

Two issues caused the second message in a session to produce no output:

1. The HITL instruction used "ALWAYS call generate_task_steps" which made
   the model call the tool even on greetings, creating a false HITL pause
   that blocked subsequent messages. Changed to only call on actual task
   requests.

2. The invocation_id was stored on every run for HITL resumption but only
   cleared when a previous stored ID already existed. On a normal first
   run (no HITL pause), the ID was stored but never cleared, causing the
   second run to attempt resumption of a completed invocation — producing
   no output. Fixed by also clearing when the ID was newly stored this
   run and there's no LRO tool pause.
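
The clearing rule from item 2 can be condensed into a small predicate.
This is a sketch under assumptions: the function and parameter names are
hypothetical, and it assumes the pre-existing rule (clear when a previous
stored ID existed) also applies only when the run did not pause on an LRO
tool.

```python
def should_clear_invocation_id(
    had_stored_id_before_run: bool,
    stored_id_this_run: bool,
    has_lro_tool_pause: bool,
) -> bool:
    """Decide whether the stored invocation_id should be dropped after a run."""
    if has_lro_tool_pause:
        # The run is waiting on a HITL tool, so the id is needed to resume.
        return False
    # Before the fix only the first operand was considered, which left a
    # freshly stored id behind after a normal first run.
    return had_stored_id_before_run or stored_id_this_run
```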

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(adk-middleware): auto-apply aggregator patch when streaming FC args enabled

- ADKAgent.__init__ now calls apply_aggregator_patch() when
  streaming_function_call_arguments=True, so callers no longer need to
  apply it manually.
- Remove manual apply_aggregator_patch() from predictive_state_updates example.
- Fix streaming FC args integration test: filter TOOL_CALL_ARGS assertions
  by tool name to exclude the synthetic confirm_changes tool call, and
  remove unnecessary mode="ANY" from FunctionCallingConfig.
- Update workarounds docstring and test to reflect auto-apply behavior.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(adk-middleware): add streaming_function_call_arguments to from_app()

The parameter was missing from the from_app() classmethod signature and
cls() call, so callers using App-based construction couldn't enable
streaming function call arguments.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): decouple is_long_running_tool from translator event emission

Client tool dedup filtering in translate_lro_function_calls causes no
TOOL_CALL_END to be emitted for client tools, leaving is_long_running_tool
as False. With the flag False, the stored invocation_id is cleared after
the run, which breaks SequentialAgent HITL resumption. Set the flag
directly from has_lro_function_call before calling the translator.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): bypass client_tool_names filter for streaming FC args

When streaming_function_call_arguments=True, client tool partial chunks
were filtered out by client_tool_names before reaching Mode A detection.
This caused all streaming chunks to be dropped, with only a single bulk
emission from ClientProxyTool. Skip the client_tool_names filter on
partial events when streaming FC args is enabled so the translator can
stream args incrementally.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): handle nameless streaming FC chunks via deferred flush

ADK's populate_client_function_call_id assigns a fresh adk-<uuid> to
every partial event and never propagates the tool name to partial
chunks. This broke Mode A streaming detection which required
func_call.name on the first chunk.

Buffer TOOL_CALL_ARGS/END events when the name is unknown and flush
them (START + buffered ARGS + END) when the complete (non-partial)
event supplies the real tool name. Map confirmed→streaming IDs so
function responses use the correct tool_call_id.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): stream nameless FC chunks immediately with name inference

Replace the buffering/deferred-flush approach (which batched all events
and defeated streaming) with immediate emission. The first nameless
chunk defers only TOOL_CALL_START until partial_args arrive, then infers
the tool name via json_path matching against client_tool_schemas and
emits START + ARGS in real time. Subsequent chunks stream ARGS
immediately.

Name inference strategy:
- Single client tool: use it directly (unambiguous)
- Multiple tools: match partial_args json_paths against tool argument
  schemas (client_tool_schemas: Dict[str, Set[str]])
- Fallback: empty string (protocol-valid)
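
The strategy above can be sketched as follows. `infer_tool_name` and
`top_level_key` are hypothetical helpers. The sketch assumes json_paths
of the form `$.field` or `$.field[0].sub` and compares only the top-level
key against each tool's argument names; it also assumes an ambiguous
match falls back to the empty string.

```python
from typing import Dict, Iterable, Set


def top_level_key(json_path: str) -> str:
    """Return the first key of a path such as "$.steps[0].description"."""
    path = json_path[2:] if json_path.startswith("$.") else json_path
    for sep in (".", "["):
        path = path.split(sep, 1)[0]
    return path


def infer_tool_name(
    partial_arg_paths: Iterable[str],
    client_tool_schemas: Dict[str, Set[str]],
) -> str:
    # Single client tool: unambiguous, use it directly.
    if len(client_tool_schemas) == 1:
        return next(iter(client_tool_schemas))
    keys = {top_level_key(p) for p in partial_arg_paths}
    keys.discard("")
    if keys:
        # Multiple tools: keep those whose argument names cover every key.
        matches = [
            name for name, arg_names in client_tool_schemas.items()
            if keys <= arg_names
        ]
        if len(matches) == 1:
            return matches[0]
    # Fallback: empty string.
    return ""
```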

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): add confirmed FC id to emitted_tool_call_ids for ClientProxyTool dedup

After streaming completes, the complete event's confirmed FC id (which
differs from the streaming id) is mapped but not added to
emitted_tool_call_ids. ClientProxyTool receives the confirmed id and
doesn't find it in the set, so it emits duplicate TOOL_CALL events.

Add the confirmed id to emitted_tool_call_ids when recording the
confirmed→streaming id mapping.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): suppress confirmed FC duplicate when streaming resolved_name is falsy

When streaming FC args complete with an empty/falsy resolved_name (tool
not in client_tool_names), _last_completed_streaming_fc_name was never
set, so the confirmed event's FC passed through all filters causing
duplicate TOOL_CALL emissions. Use _pending_streaming_completion_id to
unconditionally suppress the first confirmed FC after streaming completes
and map its ID to the streaming ID for consistent function responses.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(adk-middleware): stream FC args for opted-in LRO/HITL tools

Allow specific LRO tools to stream their args incrementally via
TOOL_CALL_ARGS events while still pausing for user input. Tools opt in
via `stream_tool_call=True` on PredictStateMapping. The streaming END
is deferred until the confirmed LRO event arrives, at which point the
PredictState CustomEvent and TOOL_CALL_END are emitted together.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): skip client_tool_names filter for non-resumable agents and filter backend tools from streaming FC

Two fixes:
1. Add is_resumable flag to EventTranslator so the client_tool_names
   filter in translate_lro_function_calls only applies for resumable
   agents (where ClientProxyTool handles emission). Non-resumable agents
   (agentic chat, haiku) now correctly emit tool call events via the LRO path.

2. Filter out backend tools (e.g. google_search) from streaming FC args
   when streaming_function_call_arguments is enabled, so ADK can execute
   them server-side without interference.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): persist FunctionCall on early return and emit terminal events for synthetic tool results

Two fixes:

1. When the aggregator monkey-patch (apply_aggregator_patch) is active globally,
   ADK yields FunctionCall events as partial/streaming events that aren't persisted
   to the session. Non-resumable agents that return early at LRO detection leave
   the session without the FunctionCall, causing "No function call event found" on
   the next run. Fix: manually persist the FunctionCall event before early return.

2. When confirm_changes (synthetic) tool results have no trailing messages,
   _handle_tool_result_submission returned without yielding any events, producing
   an empty SSE stream that triggers INCOMPLETE_STREAM on the client. Fix: emit
   RUN_STARTED + RUN_FINISHED for a valid terminal stream.
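
The second fix can be illustrated with a toy async generator. Events are
plain dicts here rather than the real AG-UI event classes, the field
names are assumptions, and the normal path is deliberately left out; only
the RUN_STARTED + RUN_FINISHED pair comes from the fix description.

```python
import asyncio
from typing import AsyncIterator, Dict, List


async def handle_tool_result_submission(
    trailing_messages: List[str], thread_id: str, run_id: str
) -> AsyncIterator[Dict[str, str]]:
    """Toy model of the early-return branch for synthetic tool results."""
    if not trailing_messages:
        # Emit a valid terminal stream instead of returning silently.
        for event_type in ("RUN_STARTED", "RUN_FINISHED"):
            yield {"type": event_type, "threadId": thread_id, "runId": run_id}
        return
    # The normal execution path is outside the scope of this sketch.
    raise NotImplementedError("normal path is not modeled here")


async def collect(stream: AsyncIterator[Dict[str, str]]) -> List[Dict[str, str]]:
    return [event async for event in stream]
```

A caller such as `asyncio.run(collect(handle_tool_result_submission([], "t1", "r1")))`
receives two events, so the client no longer sees an empty stream.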

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(adk-middleware): remove all streaming function call arguments logic

Remove Mode A (Gemini 3+ stream_function_call_arguments) and Mode B
(accumulated args delta) streaming infrastructure. The upstream ADK bug
(google/adk-python#4311) makes this unreliable; a resurrection document
is included for when the fix lands.

- Delete workarounds.py (aggregator patch, thought-signature repair)
- Remove streaming state vars and methods from EventTranslator
- Remove stream_tool_call from PredictStateMapping
- Simplify predictive_state_updates example
- Add STREAMING_FC_ARGS_RECONSTRUCTION.md for future re-implementation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* deprecate(adk-middleware): warn on non-resumable HITL flow, recommend from_app() with ResumabilityConfig

The fire-and-forget HITL path via ADKAgent(adk_agent=...) is now deprecated
for human-in-the-loop workflows. A DeprecationWarning is emitted at runtime
when the old-style early-return is triggered. The direct constructor remains
fully supported for agents without client-side tools.
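
A sketch of how such a runtime warning is typically raised with the
standard library. The helper name and the message text are illustrative
and are not copied from the middleware.

```python
import warnings


def warn_non_resumable_hitl() -> None:
    """Emit the deprecation notice when the old early-return path is hit."""
    warnings.warn(
        "HITL via ADKAgent(adk_agent=...) is deprecated; "
        "use ADKAgent.from_app() with ResumabilityConfig instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
```

Python hides DeprecationWarning by default outside `__main__`, so callers
may need `-W default` or a warnings filter to see it.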

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* chore: release core packages

* chore: release mastra sdk
@contextablemark contextablemark merged commit 6bbe52c into Contextable:main Feb 2, 2026
3 checks passed

3 participants