Merge from main by contextablemark · Pull Request #132 · Contextable/ag-ui

contextablemark · 2026-02-02T15:28:58Z

No description provided.

…umable agents (#1011)

* feat(adk-middleware): robust streaming function call arguments support

Refactor the streaming function call dispatch in EventTranslator to fix several correctness issues and add opt-in support for Gemini 3's stream_function_call_arguments mode.

Closes #990

## Breaking Changes

None. All changes are backwards-compatible. The new `streaming_function_call_arguments` parameter defaults to False.

## What changed

### Streaming FC dispatch split into two explicit modes (#990)

The `is_streaming_fc` condition previously used `func_call.name and will_continue`, which was too broad: it matched partial events that should be skipped (no accumulated args yet). The dispatch is now split:

- **Mode A** (Gemini 3+ `stream_function_call_arguments`): only active when `streaming_function_call_arguments=True` is passed to EventTranslator/ADKAgent. Triggers on `partial_args`, first chunk (name + will_continue + no args), or end chunk.
- **Mode B** (accumulated args / progressive SSE): triggers on `has_args` with `will_continue`, existing tracking, or named FC in a partial event. This is the original behavior.

### Name-based dedup replaced with single-use tracking (#990)

`_completed_streaming_fc_names` (a permanent set) suppressed repeat invocations of the same tool. Replaced with `_last_completed_streaming_fc_name` (Optional[str]) that clears after the non-partial event is filtered. TOOL_CALL_RESULT suppression uses the same single-use mechanism with None guards.

### ADK aggregator workarounds for stream_function_call_arguments

The predictive_state_updates example includes monkey-patches for two ADK bugs that prevent streaming FC args from working out of the box. Filed upstream as google/adk-python#4311; workaround code is annotated so it can be removed when the fix ships.

### Examples bumped to google-adk>=1.23.0

The examples pyproject.toml now pins `google-adk>=1.23.0` to test against the latest ADK. The library minimum remains `>=1.16.0`.
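The difference between the permanent set and the single-use tracker described under "Name-based dedup replaced with single-use tracking" can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not taken from the EventTranslator source; only the behavior (suppress once, then clear, with a None guard) follows the description above.

```python
from typing import Optional, Set


class PermanentNameDedup:
    """Old behavior as described: a name, once recorded, is suppressed forever."""

    def __init__(self) -> None:
        self.completed_names: Set[str] = set()

    def mark_streamed(self, name: str) -> None:
        self.completed_names.add(name)

    def should_suppress(self, name: str) -> bool:
        return name in self.completed_names


class SingleUseNameDedup:
    """New behavior as described: suppress exactly one follow-up event, then clear."""

    def __init__(self) -> None:
        self.last_completed_name: Optional[str] = None

    def mark_streamed(self, name: str) -> None:
        self.last_completed_name = name

    def should_suppress(self, name: Optional[str]) -> bool:
        # None guard: nothing recorded, or the event carries no name.
        if name is None or self.last_completed_name is None:
            return False
        if name == self.last_completed_name:
            # Clear after filtering so a later call to the same tool passes.
            self.last_completed_name = None
            return True
        return False
```

With the permanent set, a second invocation of the same tool is still suppressed; with the single-use tracker, only the first non-partial event after streaming is filtered.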
### Dojo e2e test comment updated

The predictive_state_updates e2e test remains skipped but the comment now explains that the demo works without Vertex AI (falls back to Gemini 2.5 Flash) and documents what credentials enable full streaming.

## Test results

530 passed, 33 skipped (full suite). 3 new tests in test_lro_filtering.py:

- test_mode_a_streaming_fc_with_flag_enabled
- test_mode_a_first_chunk_skipped_without_flag
- test_same_tool_called_twice_not_suppressed

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): remap confirmed FC ids to streaming ids in EventTranslator

With PROGRESSIVE_SSE_STREAMING (default since ADK 1.22.0), partial and confirmed events for the same function call carry different ADK-generated ids. The EventTranslator was emitting TOOL_CALL_START/END with the partial id, but ToolCallResultEvent used the confirmed id. This caused _start_new_execution's tool_call_id tracking to never match, so backend tool results were never marked as processed, breaking replay filtering (test_skip_summarization_replay_scenario).

Fix: when the confirmed (non-partial) event is filtered out (because the streaming path already emitted it), record a mapping from the confirmed id to the streaming id. _translate_function_response then remaps the ToolCallResultEvent's tool_call_id to match.

Also adds two live integration tests for streaming function call arguments via Gemini 3 Pro Preview (skipped without Vertex AI creds):

- test_streaming_fc_emits_incremental_tool_call_args
- test_streaming_fc_tool_call_ids_consistent_across_result

Test results: 565 passed, 0 failed.

Addresses #990

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Updated files.json.
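The confirmed-to-streaming id remap from the commit above amounts to a small lookup table. The following is a sketch only: `ToolCallIdRemapper`, `record`, and `resolve` are hypothetical names chosen for illustration, and the real logic lives inside EventTranslator rather than in a separate class.

```python
from typing import Dict


class ToolCallIdRemapper:
    """Maps ADK's confirmed function-call ids back to the ids already streamed."""

    def __init__(self) -> None:
        self._confirmed_to_streaming: Dict[str, str] = {}

    def record(self, confirmed_id: str, streaming_id: str) -> None:
        # Called when the confirmed (non-partial) event is filtered out
        # because the streaming path already emitted the tool call.
        self._confirmed_to_streaming[confirmed_id] = streaming_id

    def resolve(self, tool_call_id: str) -> str:
        # Called when translating a function response: use the streaming id
        # if a mapping exists, otherwise keep the id unchanged.
        return self._confirmed_to_streaming.get(tool_call_id, tool_call_id)
```

Falling back to the original id keeps the non-streaming path untouched, since ids that were never remapped pass through as-is.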
* fix(adk-middleware): prevent duplicate TOOL_CALL emission for client-side tools with ResumabilityConfig

With ResumabilityConfig, ADK emits client-side function calls from up to three sources with potentially different IDs: the LRO event, a confirmed non-partial event, and the ClientProxyTool execution. This caused the frontend to render tool call results (e.g., HITL task lists) multiple times.

The fix introduces three layers of deduplication:

- client_tool_names: EventTranslator skips all function calls for tools owned by ClientProxyTool, regardless of ID
- client_emitted_tool_call_ids: shared set for ID-based dedup between proxy and translator
- translator.emitted_tool_call_ids: proxy skips if translator already emitted (fallback for non-resumable flows)

Adds 12 regression tests covering LRO, confirmed, partial, mixed tool call scenarios, and the full end-to-end resumable HITL flow.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): clear invocation_id after completed HITL resume and flex step count

- Clear stored invocation_id after a resumable run completes successfully, preventing subsequent new runs from erroneously attempting HITL resumption with a stale ID (which produced no output)
- Update HITL example prompt to respect user-requested step count instead of always generating exactly 10 steps

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Updating files.json for latest agent changes.

* docs(adk-middleware): add ResumabilityConfig usage guide for HITL workflows

Document ADKAgent.from_app() with ResumabilityConfig for human-in-the-loop workflows, including requirements (google-adk >= 1.16.0), how it works, and a comparison table vs direct ADKAgent usage.
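The three deduplication layers introduced by the ResumabilityConfig fix above can be sketched as two predicates, one per emitter. The function names and signatures here are assumptions made for illustration; only the three collections (`client_tool_names`, `client_emitted_tool_call_ids`, and the translator's `emitted_tool_call_ids`) come from the commit message.

```python
from typing import Set


def translator_should_emit(
    tool_name: str,
    tool_call_id: str,
    client_tool_names: Set[str],
    client_emitted_tool_call_ids: Set[str],
) -> bool:
    """Translator side: layers 1 and 2."""
    # Layer 1: tools owned by ClientProxyTool are skipped regardless of id.
    if tool_name in client_tool_names:
        return False
    # Layer 2: skip ids the proxy has already emitted.
    return tool_call_id not in client_emitted_tool_call_ids


def proxy_should_emit(
    tool_call_id: str,
    client_emitted_tool_call_ids: Set[str],
    translator_emitted_tool_call_ids: Set[str],
) -> bool:
    """Proxy side: layers 2 and 3."""
    if tool_call_id in client_emitted_tool_call_ids:
        return False
    # Layer 3: fallback for non-resumable flows where the translator emitted first.
    if tool_call_id in translator_emitted_tool_call_ids:
        return False
    # Record the id in the shared set so the translator can see it.
    client_emitted_tool_call_ids.add(tool_call_id)
    return True
```

The name-based layer matters because the three ADK sources may carry different ids for the same logical call, so id-based checks alone would not catch every duplicate.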
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(adk-middleware): extract Gemini 3 workarounds into reusable module

Move thought-signature repair callback from the predictive_state_updates example into src/ag_ui_adk/workarounds.py so the middleware auto-injects it as a before_model_callback when streaming_function_call_arguments=True. The aggregator patch (apply_aggregator_patch) is also extracted but NOT auto-applied, because it conflicts with the event translator's Mode A streaming. The example still applies it explicitly when needed.

Also switches the example model from gemini-3-pro-preview to gemini-3-flash-preview (required for stream_function_call_arguments).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore(adk-middleware): use gemini-3-flash-preview consistently

Update USAGE.md and integration test to reference gemini-3-flash-preview instead of gemini-3-pro-preview.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): strengthen HITL instruction for disabled step handling

Clarify that disabled steps are permanently deleted from the plan so the agent answers "No" when asked whether a disabled step is included.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore(dojo): update files.json for latest agent changes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): fix multi-turn HITL sessions producing empty responses

Two issues caused the second message in a session to produce no output:

1. The HITL instruction used "ALWAYS call generate_task_steps" which made the model call the tool even on greetings, creating a false HITL pause that blocked subsequent messages. Changed to only call on actual task requests.
2. The invocation_id was stored on every run for HITL resumption but only cleared when a previous stored ID already existed. On a normal first run (no HITL pause), the ID was stored but never cleared, causing the second run to attempt resumption of a completed invocation and produce no output. Fixed by also clearing when the ID was newly stored this run and there's no LRO tool pause.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(adk-middleware): auto-apply aggregator patch when streaming FC args enabled

- ADKAgent.__init__ now calls apply_aggregator_patch() when streaming_function_call_arguments=True, so callers no longer need to apply it manually.
- Remove manual apply_aggregator_patch() from predictive_state_updates example.
- Fix streaming FC args integration test: filter TOOL_CALL_ARGS assertions by tool name to exclude the synthetic confirm_changes tool call, and remove unnecessary mode="ANY" from FunctionCallingConfig.
- Update workarounds docstring and test to reflect auto-apply behavior.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(adk-middleware): add streaming_function_call_arguments to from_app()

The parameter was missing from the from_app() classmethod signature and cls() call, so callers using App-based construction couldn't enable streaming function call arguments.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): decouple is_long_running_tool from translator event emission

Client tool dedup filtering in translate_lro_function_calls causes no TOOL_CALL_END to be emitted for client tools, leaving is_long_running_tool as False. This clears the stored invocation_id after the run, breaking SequentialAgent HITL resumption. Set the flag directly from has_lro_function_call before calling the translator.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): bypass client_tool_names filter for streaming FC args

When streaming_function_call_arguments=True, client tool partial chunks were filtered out by client_tool_names before reaching Mode A detection. This caused all streaming chunks to be dropped, with only a single bulk emission from ClientProxyTool. Skip the client_tool_names filter on partial events when streaming FC args is enabled so the translator can stream args incrementally.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): handle nameless streaming FC chunks via deferred flush

ADK's populate_client_function_call_id assigns a fresh adk-<uuid> to every partial event and never propagates the tool name to partial chunks. This broke Mode A streaming detection which required func_call.name on the first chunk.

Buffer TOOL_CALL_ARGS/END events when the name is unknown and flush them (START + buffered ARGS + END) when the complete (non-partial) event supplies the real tool name. Map confirmed→streaming IDs so function responses use the correct tool_call_id.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): stream nameless FC chunks immediately with name inference

Replace the buffering/deferred-flush approach (which batched all events and defeated streaming) with immediate emission. The first nameless chunk defers only TOOL_CALL_START until partial_args arrive, then infers the tool name via json_path matching against client_tool_schemas and emits START + ARGS in real time. Subsequent chunks stream ARGS immediately.

Name inference strategy:

- Single client tool: use it directly (unambiguous)
- Multiple tools: match partial_args json_paths against tool argument schemas (client_tool_schemas: Dict[str, Set[str]])
- Fallback: empty string (protocol-valid)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): add confirmed FC id to emitted_tool_call_ids for ClientProxyTool dedup

After streaming completes, the complete event's confirmed FC id (which differs from the streaming id) is mapped but not added to emitted_tool_call_ids. ClientProxyTool receives the confirmed id and doesn't find it in the set, so it emits duplicate TOOL_CALL events. Add the confirmed id to emitted_tool_call_ids when recording the confirmed→streaming id mapping.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): suppress confirmed FC duplicate when streaming resolved_name is falsy

When streaming FC args complete with an empty/falsy resolved_name (tool not in client_tool_names), _last_completed_streaming_fc_name was never set, so the confirmed event's FC passed through all filters causing duplicate TOOL_CALL emissions. Use _pending_streaming_completion_id to unconditionally suppress the first confirmed FC after streaming completes and map its ID to the streaming ID for consistent function responses.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(adk-middleware): stream FC args for opted-in LRO/HITL tools

Allow specific LRO tools to stream their args incrementally via TOOL_CALL_ARGS events while still pausing for user input. Tools opt in via `stream_tool_call=True` on PredictStateMapping. The streaming END is deferred until the confirmed LRO event arrives, at which point the PredictState CustomEvent and TOOL_CALL_END are emitted together.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): skip client_tool_names filter for non-resumable agents and filter backend tools from streaming FC

Two fixes:

1. Add is_resumable flag to EventTranslator so the client_tool_names filter in translate_lro_function_calls only applies for resumable agents (where ClientProxyTool handles emission). Non-resumable agents (agentic chat, haiku) now correctly emit tool call events via the LRO path.
2. Filter out backend tools (e.g. google_search) from streaming FC args when streaming_function_call_arguments is enabled, so ADK can execute them server-side without interference.
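The name inference strategy listed in the "stream nameless FC chunks immediately" commit above (single tool, schema match, empty-string fallback) could look roughly like the sketch below. The function names, the `$.`-prefixed path format, and the rule that an ambiguous match falls back to the empty string are all assumptions for illustration; only the `client_tool_schemas: Dict[str, Set[str]]` shape and the three-step strategy come from the commit message.

```python
from typing import Dict, Iterable, Set


def top_level_key(json_path: str) -> str:
    """Extract the first key from a path such as '$.steps[0].description'."""
    path = json_path[2:] if json_path.startswith("$.") else json_path
    for index, char in enumerate(path):
        if char in ".[":
            return path[:index]
    return path


def infer_tool_name(
    partial_json_paths: Iterable[str],
    client_tool_schemas: Dict[str, Set[str]],
) -> str:
    # Single client tool: use it directly (unambiguous).
    if len(client_tool_schemas) == 1:
        return next(iter(client_tool_schemas))
    # Multiple tools: keep tools whose argument names cover every key seen so far.
    seen = {top_level_key(path) for path in partial_json_paths}
    seen.discard("")
    candidates = [
        name for name, arg_names in client_tool_schemas.items()
        if seen and seen <= arg_names
    ]
    # Fallback: empty string when nothing (or more than one tool) matches.
    return candidates[0] if len(candidates) == 1 else ""
```

For example, with schemas for `generate_task_steps` (argument `steps`) and a hypothetical `write_document` tool (argument `document`), a chunk whose path is `$.document` resolves to `write_document`.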
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(adk-middleware): persist FunctionCall on early return and emit terminal events for synthetic tool results

Two fixes:

1. When the aggregator monkey-patch (apply_aggregator_patch) is active globally, ADK yields FunctionCall events as partial/streaming events that aren't persisted to the session. Non-resumable agents that return early at LRO detection leave the session without the FunctionCall, causing "No function call event found" on the next run. Fix: manually persist the FunctionCall event before early return.
2. When confirm_changes (synthetic) tool results have no trailing messages, _handle_tool_result_submission returned without yielding any events, producing an empty SSE stream that triggers INCOMPLETE_STREAM on the client. Fix: emit RUN_STARTED + RUN_FINISHED for a valid terminal stream.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* refactor(adk-middleware): remove all streaming function call arguments logic

Remove Mode A (Gemini 3+ stream_function_call_arguments) and Mode B (accumulated args delta) streaming infrastructure. The upstream ADK bug (google/adk-python#4311) makes this unreliable; a resurrection document is included for when the fix lands.

- Delete workarounds.py (aggregator patch, thought-signature repair)
- Remove streaming state vars and methods from EventTranslator
- Remove stream_tool_call from PredictStateMapping
- Simplify predictive_state_updates example
- Add STREAMING_FC_ARGS_RECONSTRUCTION.md for future re-implementation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* deprecate(adk-middleware): warn on non-resumable HITL flow, recommend from_app() with ResumabilityConfig

The fire-and-forget HITL path via ADKAgent(adk_agent=...) is now deprecated for human-in-the-loop workflows. A DeprecationWarning is emitted at runtime when the old-style early-return is triggered. The direct constructor remains fully supported for agents without client-side tools.
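A runtime deprecation of this kind is typically raised with the standard library `warnings` module. The sketch below shows the general pattern only: the helper name and the message wording are hypothetical and are not copied from the middleware.

```python
import warnings


def warn_non_resumable_hitl() -> None:
    """Hypothetical helper, called when the old-style early return is triggered."""
    warnings.warn(
        "Non-resumable HITL via ADKAgent(adk_agent=...) is deprecated; "
        "use ADKAgent.from_app() with ResumabilityConfig instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )
```

Python hides DeprecationWarning by default outside `__main__` and test runners, so callers who want to see it can run with `python -W default` or call `warnings.simplefilter("default")`.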
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>


contextablemark and others added 4 commits January 31, 2026 18:18

chore: release sdks (#1040)

7d1ad30

* chore: release core packages * chore: release mastra sdk

feat!(mastra): Support 1.0.0 (#685)

e56e66a

BREAKING CHANGES

chore: release mastra sdk v1 (#1042)

6bd15ad

contextablemark merged commit 6bbe52c into Contextable:main Feb 2, 2026
3 checks passed

contextablemark merged 4 commits into Contextable:main from ag-ui-protocol:main
