O3 Mini support #1709
Conversation
> [!IMPORTANT]
> Add O3 Mini model support and update token handling in LLM configurations and API handlers.
>
> - **Behavior**:
>   - Add support for `OPENAI_O3_MINI` in `config_registry.py` with `max_completion_tokens=16384`, `temperature=None`, and `reasoning_effort="high"`.
>   - Update `llm_api_handler_with_router_and_fallback()` in `api_handler_factory.py` to include reasoning tokens in the `completion_tokens` calculation.
> - **Configuration**:
>   - Rename `max_output_tokens` to `max_completion_tokens` in `router.py`, `api_handler_factory.py`, and `config_registry.py`.
>   - Add `reasoning_effort` and `temperature` to `LLMConfig` and `LLMRouterConfig` in `models.py`.
> - **Misc**:
>   - Update `get_api_parameters()` in `api_handler_factory.py` to include `reasoning_effort` and `temperature` if present.
>
> <sup>This description was created by [Ellipsis](https://www.ellipsis.dev) for a8a4cc46d507592d53b376afa2eccd6a2765bf9d. It will automatically update as commits are pushed.</sup>
👍 Looks good to me! Incremental review on 94bcff5 in 4 minutes and 38 seconds
More details
- Looked at 134 lines of code in 3 files.
- Skipped 0 files when reviewing.
- Skipped posting 10 drafted comments based on config settings.
1. `skyvern/forge/sdk/api/llm/api_handler_factory.py:159`
   - Draft comment: Ensure logging at info level for `reasoning_tokens` is appropriate; info may clutter production logs.
   - Reason this comment was not posted: Marked as duplicate.
2. `skyvern/forge/sdk/api/llm/api_handler_factory.py:396`
   - Draft comment: Parameter keys updated (`max_completion_tokens`, `temperature`, `reasoning_effort`) should be consistent with model definitions.
   - Reason this comment was not posted: Marked as duplicate.
3. `skyvern/forge/sdk/api/llm/config_registry.py:87`
   - Draft comment: New provider `OPENAI_O3_MINI` config: verify that required and optional fields align with the provider's capabilities.
   - Reason this comment was not posted: Marked as duplicate.
4. `skyvern/forge/sdk/api/llm/models.py:39`
   - Draft comment: Renaming from `max_output_tokens` to `max_completion_tokens` looks consistent; ensure all usages in the project are updated.
   - Reason this comment was not posted: Marked as duplicate.
5. `skyvern/forge/sdk/api/llm/api_handler_factory.py:159`
   - Draft comment: Combining `reasoning_tokens` with `completion_tokens`: ensure this aggregation is intended and won't double-count. Consider logging only nonzero values.
   - Reason this comment was not posted: Comment did not seem useful.
6. `skyvern/forge/sdk/api/llm/api_handler_factory.py:396`
   - Draft comment: Updated API parameters now use `max_completion_tokens` and conditionally add `temperature` and `reasoning_effort`. Verify that any consumers relying on the old keys are updated accordingly.
   - Reason this comment was not posted: Comment did not seem useful.
7. `skyvern/forge/sdk/api/llm/config_registry.py:87`
   - Draft comment: New `OPENAI_O3_MINI` config: `temperature` is set to `None` and `reasoning_effort` to `"high"`. Ensure downstream handling respects that temperature is not applicable for O-models.
   - Reason this comment was not posted: Comment did not seem useful.
8. `skyvern/forge/sdk/api/llm/config_registry.py:83`
   - Draft comment: Token field renaming: `max_output_tokens` has been replaced with `max_completion_tokens`. Confirm that all code (e.g., cost calculations) now references the updated field.
   - Reason this comment was not posted: Marked as duplicate.
9. `skyvern/forge/sdk/api/llm/models.py:39`
   - Draft comment: `LLMConfig` update: renamed `max_output_tokens` to `max_completion_tokens` and added `temperature` and `reasoning_effort`. Verify that default values from `SettingsManager` are appropriate.
   - Reason this comment was not posted: Marked as duplicate.
10. `skyvern/forge/sdk/api/llm/models.py:77`
    - Draft comment: `LLMRouterConfig` update: `max_output_tokens` is now `max_completion_tokens`, along with new fields. Ensure consistency with `LLMConfig` and check downstream usage.
    - Reason this comment was not posted: Marked as duplicate.
Workflow ID: wflow_xUZOas1iHu7KKVBn
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
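The token aggregation questioned in draft comments 1 and 5 can be modeled in isolation. This is a sketch of the arithmetic only, not the real handler: the field layout assumes an OpenAI-style usage payload, where reasoning tokens are commonly reported under `completion_tokens_details` and may already be included in `completion_tokens`, which is exactly the double-counting risk the reviewer raises.

```python
def total_completion_tokens(usage: dict) -> int:
    """Fold reasoning tokens into the completion count, as the PR describes.

    `usage` is assumed to look like an OpenAI-style usage payload:
    {"completion_tokens": int,
     "completion_tokens_details": {"reasoning_tokens": int}}.
    """
    completion = usage.get("completion_tokens", 0)
    reasoning = usage.get("completion_tokens_details", {}).get("reasoning_tokens", 0)
    # Caveat from the review: if the provider already counts reasoning
    # tokens inside completion_tokens, this sum double-counts them.
    return completion + reasoning


usage = {
    "completion_tokens": 500,
    "completion_tokens_details": {"reasoning_tokens": 120},
}
print(total_completion_tokens(usage))  # 620
```

Keeping the two counts separate (as draft comment 5 suggests) would preserve clearer usage analytics; summing them is simpler but only safe when the provider reports the counts disjointly.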
👍 Looks good to me! Reviewed everything up to 94bcff5 in 5 minutes and 0 seconds
More details
- Looked at 134 lines of code in 3 files.
- Skipped 0 files when reviewing.
- Skipped posting 9 drafted comments based on config settings.
1. `skyvern/forge/sdk/api/llm/api_handler_factory.py:159`
   - Draft comment: Ensure the addition of `reasoning_tokens` to `completion_tokens` is intended. Also, revisit the TODO comment to fully support reasoning tokens.
   - Reason this comment was not posted: Comment did not seem useful.
2. `skyvern/forge/sdk/api/llm/api_handler_factory.py:396`
   - Draft comment: The parameter key update to `max_completion_tokens` reflects model changes. Confirm that downstream APIs expect this key.
   - Reason this comment was not posted: Comment did not seem useful.
3. `skyvern/forge/sdk/api/llm/config_registry.py:83`
   - Draft comment: The rename of `max_output_tokens` to `max_completion_tokens` is consistent. Verify that all dependent logic uses the new property name.
   - Reason this comment was not posted: Marked as duplicate.
4. `skyvern/forge/sdk/api/llm/models.py:39`
   - Draft comment: Renaming `max_output_tokens` to `max_completion_tokens` is consistent. Ensure documentation and usage elsewhere also reflect this change.
   - Reason this comment was not posted: Marked as duplicate.
5. `skyvern/forge/sdk/api/llm/api_handler_factory.py:159`
   - Draft comment: Consider whether summing `reasoning_tokens` with `completion_tokens` is the desired metric. Since the TODO indicates further work on token types, it might be useful to retain them separately for clearer usage analytics.
   - Reason this comment was not posted: Marked as duplicate.
6. `skyvern/forge/sdk/api/llm/api_handler_factory.py:396`
   - Draft comment: The updated `get_api_parameters` now uses `max_completion_tokens` (and adds `temperature` and `reasoning_effort`). Ensure all consumers and documentation reflect this change.
   - Reason this comment was not posted: Confidence changes required: 33%.
7. `skyvern/forge/sdk/api/llm/config_registry.py:87`
   - Draft comment: New configuration `OPENAI_O3_MINI` is added with `temperature` set to `None` and `reasoning_effort` set to `"high"`. Verify that using `OPENAI_API_KEY` and these specific parameters fit the provider's requirements.
   - Reason this comment was not posted: Confidence changes required: 33%.
8. `skyvern/forge/sdk/api/llm/config_registry.py:81`
   - Draft comment: Changing `max_output_tokens` to `max_completion_tokens` across configurations is consistent. Confirm that related tests and documentation are updated accordingly.
   - Reason this comment was not posted: Confidence changes required: 33%.
9. `skyvern/forge/sdk/api/llm/models.py:39`
   - Draft comment: The attribute change from `max_output_tokens` to `max_completion_tokens`, along with adding `temperature` and `reasoning_effort`, should be verified against API expectations and provider docs.
   - Reason this comment was not posted: Confidence changes required: 33%.
Workflow ID: wflow_wffN1KFYVMHDPVGs
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
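The conditional parameter assembly that several drafts above describe for `get_api_parameters()` can be sketched as follows. The dict-based config is a simplification of the real `LLMConfig` object, and the key names come from the PR description; the real function's signature may differ.

```python
def get_api_parameters(config: dict) -> dict:
    """Build provider call parameters from a config mapping.

    Always emits max_completion_tokens; adds temperature and
    reasoning_effort only when they are set, so models that do not
    support a parameter (e.g. temperature on O-series) never see it.
    """
    params = {"max_completion_tokens": config["max_completion_tokens"]}
    if config.get("temperature") is not None:
        params["temperature"] = config["temperature"]
    if config.get("reasoning_effort") is not None:
        params["reasoning_effort"] = config["reasoning_effort"]
    return params


print(get_api_parameters({
    "max_completion_tokens": 16384,
    "temperature": None,
    "reasoning_effort": "high",
}))  # {'max_completion_tokens': 16384, 'reasoning_effort': 'high'}
```

Dropping `None`-valued keys entirely, rather than forwarding them, is what keeps the O3 Mini config (where `temperature=None`) from sending an unsupported parameter to the provider.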
> [!IMPORTANT]
> Add support for O3 Mini model with reasoning tokens handling and update configurations and models accordingly.
>
> - Add `OPENAI_O3_MINI` in `config_registry.py` with `max_completion_tokens=16384`, `temperature=None`, and `reasoning_effort="high"`.
> - Update `llm_api_handler_with_router_and_fallback()` in `api_handler_factory.py` to handle `reasoning_tokens` by adding them to `completion_tokens`.
> - Add `reasoning_effort` to `LLMConfig` and `LLMRouterConfig` in `models.py`.
> - Rename `max_output_tokens` to `max_completion_tokens` in `LLMConfig` and `LLMRouterConfig` in `models.py`.
> - Update `get_api_parameters()` in `api_handler_factory.py` to include `reasoning_effort` if present.
>
> This description was created by Ellipsis for 94bcff5. It will automatically update as commits are pushed.
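The `max_output_tokens` to `max_completion_tokens` rename called out repeatedly in the drafts above is the kind of change that fails loudly at call sites. A minimal illustration, with `LLMRouterConfig` reduced to a hypothetical dataclass (the real class in `models.py` carries more fields):

```python
from dataclasses import dataclass


@dataclass
class LLMRouterConfig:
    """Hypothetical, reduced stand-in for the class in models.py."""

    model_name: str
    max_completion_tokens: int  # renamed from max_output_tokens in this PR


cfg = LLMRouterConfig(model_name="o3-mini", max_completion_tokens=16384)

# Any consumer still reading the old attribute name breaks immediately,
# which is why the drafts insist every usage, test, and doc be updated.
assert cfg.max_completion_tokens == 16384
assert not hasattr(cfg, "max_output_tokens")
```

An `AttributeError` at access time is the best case; sites that build parameter dicts with the old string key would fail more quietly, which is what the cost-calculation draft comment warns about.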