Releases: Gyarbij/azure-oai-proxy
1.0.9
1.0.8
🚀 Azure OpenAI Proxy v1.0.8 - Responses API Support
🎯 Major New Features
✨ Azure OpenAI Responses API Integration
- Full Responses API Support: Added comprehensive support for Azure OpenAI's new Responses API, enabling advanced reasoning models like O1, O3, and O4 to work seamlessly through the proxy
- Automatic Model Detection: Intelligent detection of reasoning models (O1, O3, O4 families) that require the Responses API instead of traditional chat completions
- Transparent Request Conversion: Automatic conversion of OpenAI chat completion requests to Azure Responses API format when using reasoning models
- Streaming Response Conversion: Real-time conversion of Responses API Server-Sent Events (SSE) to OpenAI chat completion streaming format for full UI compatibility
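The automatic model detection described above can be sketched in Go (the proxy's implementation language). The function name and prefix list here are illustrative, not the proxy's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// isReasoningModel reports whether a model name belongs to a family that
// must be routed to the Responses API instead of chat completions.
// The prefix list is an illustrative guess at the O1/O3/O4 detection.
func isReasoningModel(model string) bool {
	m := strings.ToLower(model)
	for _, prefix := range []string{"o1", "o3", "o4"} {
		if m == prefix || strings.HasPrefix(m, prefix+"-") {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isReasoningModel("o3-pro")) // true
	fmt.Println(isReasoningModel("gpt-4o")) // false
	fmt.Println(isReasoningModel("o1"))     // true
}
```

A prefix check like this avoids hardcoding individual model names, so new variants (e.g. a hypothetical `o3-mini`) are picked up automatically.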
🔄 Enhanced Streaming Support
- Bidirectional Streaming Conversion:
  - Incoming: OpenAI chat completion format → Azure Responses API format
  - Outgoing: Azure Responses API SSE → OpenAI chat completion SSE
- Event Type Handling: Proper handling of various Responses API events:
  - `response.output_text.delta` → chat completion delta chunks
  - `response.completed` → final `[DONE]` marker
- Filtering of internal events (`response.in_progress`, `response.output_item.added`, etc.)
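The event mapping above can be sketched as a small Go converter. The `deltaEvent` struct and chunk layout below are simplified assumptions modeled on the public chat completions schema, not the proxy's exact types:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// deltaEvent is a simplified view of a Responses API SSE event payload.
type deltaEvent struct {
	Type  string `json:"type"`
	Delta string `json:"delta"`
}

// convertDeltaEvent maps one Responses API event to an OpenAI-style chat
// completion streaming payload. The bool result is false for internal
// events that should be filtered out of the client stream.
func convertDeltaEvent(raw []byte, model string) (string, bool, error) {
	var ev deltaEvent
	if err := json.Unmarshal(raw, &ev); err != nil {
		return "", false, err
	}
	switch ev.Type {
	case "response.output_text.delta":
		chunk := map[string]any{
			"object": "chat.completion.chunk",
			"model":  model,
			"choices": []map[string]any{
				{"index": 0, "delta": map[string]string{"content": ev.Delta}},
			},
		}
		b, err := json.Marshal(chunk)
		return string(b), true, err
	case "response.completed":
		return "[DONE]", true, nil
	default:
		// response.in_progress, response.output_item.added, etc. are dropped
		return "", false, nil
	}
}

func main() {
	out, ok, _ := convertDeltaEvent(
		[]byte(`{"type":"response.output_text.delta","delta":"Hi"}`), "o3-pro")
	fmt.Println(ok, out)
}
```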
📡 New API Endpoints
- `POST /v1/responses` - Create response
- `GET /v1/responses/:response_id` - Retrieve response
- `DELETE /v1/responses/:response_id` - Delete response
- `POST /v1/responses/:response_id/cancel` - Cancel background response
- `GET /v1/responses/:response_id/input_items` - List input items
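As a usage sketch, these endpoints can be called through the proxy like any OpenAI-compatible server. The base URL, API key, and response ID below are placeholders:

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newProxyRequest builds a request against the proxy's OpenAI-compatible
// surface. Only the paths come from the release notes; the base URL and
// bearer token are placeholder values.
func newProxyRequest(method, path, body string) (*http.Request, error) {
	req, err := http.NewRequest(method, "http://localhost:11437"+path, strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer your-azure-api-key")
	return req, nil
}

func main() {
	create, _ := newProxyRequest("POST", "/v1/responses", `{"model":"o3-pro","input":"Hello"}`)
	fmt.Println(create.Method, create.URL.Path)

	// "resp_123" is a hypothetical response ID for illustration.
	cancel, _ := newProxyRequest("POST", "/v1/responses/resp_123/cancel", "")
	fmt.Println(cancel.Method, cancel.URL.Path)
}
```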
🔧 Technical Improvements
🏗️ Architecture Enhancements
- Modular Streaming Converter: New `StreamingResponseConverter` component for clean separation of concerns
- Enhanced Request Director: Updated routing logic to handle both traditional and Responses API endpoints
- Improved Response Modification: Better response transformation pipeline with proper content-type detection
🎛️ Configuration Updates
- New API Version Support: Added `AzureOpenAIResponsesAPIVersion = "preview"` for Responses API endpoints
- Enhanced Model Detection: Dynamic detection of reasoning models without hardcoding in the model mapper
- Request Context Preservation: Maintains original request information through custom headers for proper response conversion
🔍 Request/Response Flow
1. Request Analysis: Detects if incoming chat completion request uses reasoning model
2. Format Conversion: Converts OpenAI messages format to Responses API input format
3. API Routing: Routes to appropriate Azure endpoint (`/openai/v1/responses` vs `/openai/deployments/{model}/chat/completions`)
4. Response Processing:
   - Non-streaming: Direct JSON conversion to chat completion format
   - Streaming: Real-time SSE event conversion with proper chunk formatting
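The routing step above can be sketched in Go; the reasoning-model check here is a simplified stand-in for the proxy's actual detection:

```go
package main

import (
	"fmt"
	"strings"
)

// azurePathFor picks the upstream Azure path for a request, mirroring the
// flow described above: reasoning models go to the Responses API, everything
// else to the deployment's chat completions endpoint.
func azurePathFor(model, deployment string) string {
	m := strings.ToLower(model)
	for _, p := range []string{"o1", "o3", "o4"} {
		if m == p || strings.HasPrefix(m, p+"-") {
			return "/openai/v1/responses"
		}
	}
	return "/openai/deployments/" + deployment + "/chat/completions"
}

func main() {
	fmt.Println(azurePathFor("o3-pro", "o3-pro"))
	fmt.Println(azurePathFor("gpt-4o", "my-gpt4o"))
}
```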
🐛 Bug Fixes
- Fixed streaming response handling that previously caused UI display issues
- Improved error handling for malformed streaming responses
- Better type safety in response conversion functions
- Enhanced logging for debugging streaming conversion issues
📋 API Compatibility Matrix
| Model Family | Endpoint Used | Streaming | Non-Streaming | Status |
|---|---|---|---|---|
| GPT-3.5/4 | Chat Completions | ✅ | ✅ | Stable |
| GPT-4o | Chat Completions | ✅ | ✅ | Stable |
| O1 Series | Responses API | ✅ | ✅ | New |
| O3 Series | Responses API | ✅ | ✅ | New |
| O4 Series | Responses API | ✅ | ✅ | New |
🔄 Backward Compatibility
- Fully Backward Compatible: All existing functionality remains unchanged
- Transparent Operation: Reasoning models automatically use Responses API without client-side changes
- Consistent API Surface: Clients continue using standard OpenAI chat completion format
🛠️ Developer Experience
- Enhanced Logging: Detailed request/response transformation logging for debugging
- Type Definitions: Complete type definitions for Responses API structures
- Error Handling: Improved error messages and graceful degradation
📝 Usage Example
```shell
# This now works seamlessly with O3 models
curl http://localhost:11437/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-azure-api-key" \
  -d '{
    "model": "o3-pro",
    "messages": [{"role": "user", "content": "Solve this complex reasoning problem..."}],
    "stream": true
  }'
```
🎉 What This Means
- UI Compatibility: Tools like Open WebUI now work perfectly with Azure's most advanced reasoning models
- Future-Proof: Ready for new reasoning models as they're released
- Performance: Efficient streaming with minimal latency overhead
- Simplicity: Zero configuration changes needed for existing deployments
This release represents a significant milestone in bridging OpenAI and Azure OpenAI APIs, particularly for next-generation reasoning models. The proxy now provides complete feature parity for both traditional and advanced AI models while maintaining the simplicity and reliability users expect.
1.0.7
What's Changed
- chore(deps): bump github.com/tidwall/gjson from 1.17.3 to 1.18.0 by @dependabot in #34
- chore(deps): bump golang from 1.23.0 to 1.23.2 by @dependabot in #35
- chore: update Azure OpenAI API versions and enhance model mapping by @Gyarbij in #42
Full Changelog: 1.0.6...1.0.7
1.0.6
What's Changed
- 1.0.5 by @Gyarbij in #23
- chore(deps): bump github.com/tidwall/gjson from 1.17.1 to 1.17.2 by @dependabot in #25
- Sync 1.0.6-base by @Gyarbij in #26
- chore(deps): bump github.com/tidwall/gjson from 1.17.2 to 1.17.3 by @dependabot in #27
- chore(deps): bump golang from 1.22.5 to 1.22.6 by @dependabot in #28
- Base - Add Health and Update Go by @Gyarbij in #30
- 1.0.6-rc by @Gyarbij in #31
Full Changelog: 1.0.5...1.0.6
1.0.5
1.0.4
What's Changed
- Merge pull request #14 from Gyarbij/dev by @Gyarbij in #17
- 1.0.4-rc-patch-2 by @Gyarbij in #20
- Bump golang from 1.22.4 to 1.22.5 by @dependabot in #16
- Updated makeDirector function to handle the new endpoint structure, added logging for new parameters, and converted from the deprecated 'ioutil' package to 'io'
- Refactor HandleToken function to improve readability and handle API key retrieval per updated Azure API
- Improved performance
Full Changelog: 1.0.3...1.0.4
1.0.3
1.0.1
What's Changed
- Added multi-platform image.
- 2024-06-23 Implemented dynamic model fetching for the `/v1/models` endpoint, replacing the hardcoded model list.
- 2024-06-23 Unified token handling mechanism across the application, improving consistency and security.
- 2024-06-23 Added support for audio-related endpoints: `/v1/audio/speech`, `/v1/audio/transcriptions`, and `/v1/audio/translations`
- 2024-06-23 Implemented flexible environment variable handling for configuration (AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, AZURE_OPENAI_TOKEN).
- 2024-06-23 Added support for model capabilities endpoint `/v1/models/:model_id/capabilities`.
- 2024-06-23 Improved cross-origin resource sharing (CORS) handling with OPTIONS requests.
- 2024-06-23 Enhanced proxy functionality to better handle various Azure OpenAI API endpoints.
- 2024-06-23 Implemented fallback model mapping for unsupported models.
- 2024-06-22 Added support for image generation `/v1/images/generations`, fine-tuning operations `/v1/fine_tunes`, and file management `/v1/files`.
- 2024-06-22 Implemented better error handling and logging for API requests.
- 2024-06-22 Improved handling of rate limiting and streaming responses.
- 2024-06-22 Updated model mappings to include the latest models (gpt-4-turbo, gpt-4-vision-preview, dall-e-3).
- 2024-06-23 Added support for deployments management (/deployments).
- Update tag.yml by @Gyarbij in #11
- 1.0.1 by @Gyarbij in #12
Full Changelog: https://github.com/Gyarbij/azure-oai-proxy/commits/1.0.1