Releases: Gyarbij/azure-oai-proxy

1.0.9

03 Aug 22:06
20983d3

What's Changed

  • Add support for codex to responsesapi by @Gyarbij in #53

Full Changelog: 1.0.8...1.0.9

1.0.8

03 Aug 14:16
a6a2d0e

🚀 Azure OpenAI Proxy v1.0.8 - Responses API Support

🎯 Major New Features

✨ Azure OpenAI Responses API Integration

  • Full Responses API Support: Added comprehensive support for Azure OpenAI's new Responses API, enabling advanced reasoning models like O1, O3, and O4 to work seamlessly through the proxy
  • Automatic Model Detection: Intelligent detection of reasoning models (O1, O3, O4 families) that require the Responses API instead of traditional chat completions
  • Transparent Request Conversion: Automatic conversion of OpenAI chat completion requests to Azure Responses API format when using reasoning models
  • Streaming Response Conversion: Real-time conversion of Responses API Server-Sent Events (SSE) to OpenAI chat completion streaming format for full UI compatibility

🔄 Enhanced Streaming Support

  • Bidirectional Streaming Conversion:
    • Incoming: OpenAI chat completion format → Azure Responses API format
    • Outgoing: Azure Responses API SSE → OpenAI chat completion SSE
  • Event Type Handling: Proper handling of various Responses API events:
    • response.output_text.delta → chat completion delta chunks
    • response.completed → final [DONE] marker
    • Filtering of internal events (response.in_progress, response.output_item.added, etc.)
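The event mapping above can be sketched in Go roughly as follows. This is an illustrative sketch, not the proxy's actual code: the function name `convertEvent` and the exact chunk fields are assumptions, though the event names and the `chat.completion.chunk` / `[DONE]` shapes follow the public OpenAI streaming format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// convertEvent maps one Responses API SSE event to an OpenAI-style
// chat completion chunk. Events with no chat-completion equivalent
// (response.in_progress, response.output_item.added, ...) return ""
// and are dropped from the outgoing stream.
func convertEvent(eventType, data string) string {
	switch eventType {
	case "response.output_text.delta":
		// Pull the text delta out of the event payload.
		var ev struct {
			Delta string `json:"delta"`
		}
		if err := json.Unmarshal([]byte(data), &ev); err != nil {
			return ""
		}
		// Wrap it in a minimal chat.completion.chunk envelope.
		chunk := map[string]any{
			"object": "chat.completion.chunk",
			"choices": []map[string]any{
				{"index": 0, "delta": map[string]string{"content": ev.Delta}},
			},
		}
		out, _ := json.Marshal(chunk)
		return "data: " + string(out) + "\n\n"
	case "response.completed":
		// End of stream in OpenAI's SSE convention.
		return "data: [DONE]\n\n"
	default:
		// Internal bookkeeping events are filtered out.
		return ""
	}
}

func main() {
	fmt.Print(convertEvent("response.output_text.delta", `{"delta":"Hello"}`))
	fmt.Print(convertEvent("response.completed", ""))
}
```
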

📡 New API Endpoints

  • POST /v1/responses - Create response
  • GET /v1/responses/:response_id - Retrieve response
  • DELETE /v1/responses/:response_id - Delete response
  • POST /v1/responses/:response_id/cancel - Cancel background response
  • GET /v1/responses/:response_id/input_items - List input items
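Calling any of these routes is a plain HTTP request. As a small sketch, cancelling a background response from Go might look like this; the helper name, base URL, and response ID are placeholders for illustration, not part of the proxy's API surface.

```go
package main

import (
	"fmt"
	"net/http"
)

// buildCancelRequest assembles the cancel call for a background response.
// The caller would add auth headers and send it with an http.Client.
func buildCancelRequest(base, responseID string) (*http.Request, error) {
	return http.NewRequest(http.MethodPost, base+"/v1/responses/"+responseID+"/cancel", nil)
}

func main() {
	req, err := buildCancelRequest("http://localhost:11437", "resp_123")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
}
```
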

🔧 Technical Improvements

🏗️ Architecture Enhancements

  • Modular Streaming Converter: New StreamingResponseConverter component for clean separation of concerns
  • Enhanced Request Director: Updated routing logic to handle both traditional and Responses API endpoints
  • Improved Response Modification: Better response transformation pipeline with proper content-type detection

🎛️ Configuration Updates

  • New API Version Support: Added AzureOpenAIResponsesAPIVersion = "preview" for Responses API endpoints
  • Enhanced Model Detection: Dynamic detection of reasoning models without hardcoding in model mapper
  • Request Context Preservation: Maintains original request information through custom headers for proper response conversion
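The model detection described above amounts to checking whether a model name falls in the O1/O3/O4 families. A minimal sketch, assuming a simple prefix match (the real proxy's rules may differ, and `isReasoningModel` is a hypothetical name):

```go
package main

import (
	"fmt"
	"strings"
)

// isReasoningModel reports whether a model belongs to a reasoning-model
// family (O1, O3, O4) that must be routed to the Responses API rather
// than the traditional chat completions endpoint.
func isReasoningModel(model string) bool {
	m := strings.ToLower(model)
	for _, family := range []string{"o1", "o3", "o4"} {
		// Match the bare family name ("o1") or a variant ("o1-mini").
		if m == family || strings.HasPrefix(m, family+"-") {
			return true
		}
	}
	return false
}

func main() {
	for _, m := range []string{"o3-pro", "o1-mini", "gpt-4o"} {
		fmt.Println(m, isReasoningModel(m))
	}
}
```

Note that a naive `strings.HasPrefix(m, "o4")` alone would be safe here too, since names like "gpt-4o" do not start with the family prefix; the exact-or-dash check just guards against future model names that merely begin with the same letters.
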

🔍 Request/Response Flow

  1. Request Analysis: Detects if incoming chat completion request uses reasoning model
  2. Format Conversion: Converts OpenAI messages format to Responses API input format
  3. API Routing: Routes to appropriate Azure endpoint (/openai/v1/responses vs /openai/deployments/{model}/chat/completions)
  4. Response Processing:
    • Non-streaming: Direct JSON conversion to chat completion format
    • Streaming: Real-time SSE event conversion with proper chunk formatting

🐛 Bug Fixes

  • Fixed streaming response handling that previously caused UI display issues
  • Improved error handling for malformed streaming responses
  • Better type safety in response conversion functions
  • Enhanced logging for debugging streaming conversion issues

📋 API Compatibility Matrix

Model Family   Endpoint Used      Streaming   Non-Streaming   Status
GPT-3.5/4      Chat Completions   ✅           ✅               Stable
GPT-4o         Chat Completions   ✅           ✅               Stable
O1 Series      Responses API      ✅           ✅               New
O3 Series      Responses API      ✅           ✅               New
O4 Series      Responses API      ✅           ✅               New

🔄 Backward Compatibility

  • Fully Backward Compatible: All existing functionality remains unchanged
  • Transparent Operation: Reasoning models automatically use Responses API without client-side changes
  • Consistent API Surface: Clients continue using standard OpenAI chat completion format

🛠️ Developer Experience

  • Enhanced Logging: Detailed request/response transformation logging for debugging
  • Type Definitions: Complete type definitions for Responses API structures
  • Error Handling: Improved error messages and graceful degradation

📝 Usage Example

# This now works seamlessly with O3 models
curl http://localhost:11437/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-azure-api-key" \
  -d '{
    "model": "o3-pro",
    "messages": [{"role": "user", "content": "Solve this complex reasoning problem..."}],
    "stream": true
  }'

🎉 What This Means

  • UI Compatibility: Tools like Open WebUI now work perfectly with Azure's most advanced reasoning models
  • Future-Proof: Ready for new reasoning models as they're released
  • Performance: Efficient streaming with minimal latency overhead
  • Simplicity: Zero configuration changes needed for existing deployments

This release represents a significant milestone in bridging OpenAI and Azure OpenAI APIs, particularly for next-generation reasoning models. The proxy now provides complete feature parity for both traditional and advanced AI models while maintaining the simplicity and reliability users expect.

1.0.7

24 Jan 14:59
5664a4b

What's Changed

  • chore(deps): bump github.com/tidwall/gjson from 1.17.3 to 1.18.0 by @dependabot in #34
  • chore(deps): bump golang from 1.23.0 to 1.23.2 by @dependabot in #35
  • chore: update Azure OpenAI API versions and enhance model mapping by @Gyarbij in #42

Full Changelog: 1.0.6...1.0.7

1.0.6

27 Aug 17:51
8e07839

What's Changed

Full Changelog: 1.0.5...1.0.6

1.0.5

25 Jul 21:17
9dec7bc

What's Changed

  • Added support for 'gpt-4o-mini'
  • Added support for Azure AI Studio serverless deployments
  • Now supports Mistral Large 2407, Llama 3.1 405B, Command-r-plus, and more.
  • 1.0.4 by @Gyarbij in #21
  • 1.0.5 by @Gyarbij in #22

Full Changelog: 1.0.4...1.0.5

1.0.4

20 Jul 17:38
48a0d35

What's Changed

  • Merge pull request #14 from Gyarbij/dev by @Gyarbij in #17
  • 1.0.4-rc-patch-2 by @Gyarbij in #20
  • Bump golang from 1.22.4 to 1.22.5 by @dependabot in #16
  • Updated the makeDirector function to handle the new endpoint structure, added logging for new parameters, and migrated from the deprecated 'ioutil' package to 'io'
  • Refactored the HandleToken function to improve readability and handle API key retrieval per the updated Azure API
  • Improved performance

Full Changelog: 1.0.3...1.0.4

1.0.3

23 Jun 11:56
dc26227

What's Changed

Full Changelog: 1.0.2...1.0.3

1.0.1

23 Jun 00:44
579d2da

What's Changed

  • Added multi-platform image.
  • 2024-06-23 Implemented dynamic model fetching for /v1/models endpoint, replacing hardcoded model list.
  • 2024-06-23 Unified token handling mechanism across the application, improving consistency and security.
  • 2024-06-23 Added support for audio-related endpoints: /v1/audio/speech, /v1/audio/transcriptions, and /v1/audio/translations
  • 2024-06-23 Implemented flexible environment variable handling for configuration (AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, AZURE_OPENAI_TOKEN).
  • 2024-06-23 Added support for model capabilities endpoint /v1/models/:model_id/capabilities.
  • 2024-06-23 Improved cross-origin resource sharing (CORS) handling with OPTIONS requests.
  • 2024-06-23 Enhanced proxy functionality to better handle various Azure OpenAI API endpoints.
  • 2024-06-23 Implemented fallback model mapping for unsupported models.
  • 2024-06-22 Added support for image generation /v1/images/generations, fine-tuning operations /v1/fine_tunes, and file management /v1/files.
  • 2024-06-22 Implemented better error handling and logging for API requests.
  • 2024-06-22 Improved handling of rate limiting and streaming responses.
  • 2024-06-22 Updated model mappings to include the latest models (gpt-4-turbo, gpt-4-vision-preview, dall-e-3).
  • 2024-06-23 Added support for deployments management (/deployments).
  • Update tag.yml by @Gyarbij in #11
  • 1.0.1 by @Gyarbij in #12

Full Changelog: https://github.com/Gyarbij/azure-oai-proxy/commits/1.0.1

1.0.1

23 Jun 00:11
883c5ba

What's Changed

  • Update tag.yml by @Gyarbij in #11
  • 1.0.1 by @Gyarbij in #12

Full Changelog: https://github.com/Gyarbij/azure-oai-proxy/commits/1.0.1