Add Qwen-3 LLM recipes #221
base: main
Pull request overview
This PR adds Olive optimization recipes and per-model licensing for the Qwen-3 LLM family across CPU, CUDA, and WebGPU execution providers.
Changes:
- Added CPU, CUDA, and WebGPU Olive JSON recipes for Qwen-Qwen3-{0.6B, 1.7B, 4B, 4B-Instruct-2507, 4B-Thinking-2507, 8B, 14B, 32B}.
- Added per-backend README.md files explaining setup and usage for each model/backend combination.
- Added Apache 2.0 LICENSE files at the root of each Qwen-Qwen3-* model directory.
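The layout described in the changes above can be enumerated with a short sketch. This is not the generator used for the PR; it only mirrors the naming pattern visible in the changed file names, and `MODEL_SIZES`, `RECIPE_SUFFIX`, and `expected_files` are hypothetical names chosen for illustration.

```python
# Hypothetical helper: lists the files this PR is expected to add,
# based only on the naming pattern seen in the changed file names.

MODEL_SIZES = [
    "0.6B", "1.7B", "4B", "4B-Instruct-2507",
    "4B-Thinking-2507", "8B", "14B", "32B",
]

# Recipe file-name suffix per backend directory, as it appears in the PR.
RECIPE_SUFFIX = {
    "cpu": "int4_int8_kquant_mixed",
    "cuda": "int4_kquant_last",
    "webgpu": "int4_default",
}


def expected_files():
    """Return the expected relative paths: one LICENSE per model,
    plus one README.md and one JSON recipe per model/backend pair."""
    files = []
    for size in MODEL_SIZES:
        model = f"Qwen-Qwen3-{size}"
        files.append(f"{model}/LICENSE")
        for backend, suffix in RECIPE_SUFFIX.items():
            files.append(f"{model}/{backend}/README.md")
            files.append(f"{model}/{backend}/{model}_{backend}_{suffix}.json")
    return files


if __name__ == "__main__":
    paths = expected_files()
    print(len(paths))  # 8 models * (1 LICENSE + 3 backends * 2 files) = 56
```

Comparing this list against the directory tree is one way to check that no model/backend combination was missed.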
Reviewed changes
Copilot reviewed 56 out of 56 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| Qwen-Qwen3-8B/webgpu/README.md | WebGPU README for Qwen-Qwen3-8B Olive recipes and usage. |
| Qwen-Qwen3-8B/webgpu/Qwen-Qwen3-8B_webgpu_int4_default.json | WebGPU INT4 Olive recipe for Qwen-Qwen3-8B. |
| Qwen-Qwen3-8B/cuda/README.md | CUDA README for Qwen-Qwen3-8B Olive recipes and usage. |
| Qwen-Qwen3-8B/cuda/Qwen-Qwen3-8B_cuda_int4_kquant_last.json | CUDA INT4 k_quant_last Olive recipe for Qwen-Qwen3-8B. |
| Qwen-Qwen3-8B/cpu/README.md | CPU README for Qwen-Qwen3-8B Olive recipes and usage. |
| Qwen-Qwen3-8B/cpu/Qwen-Qwen3-8B_cpu_int4_int8_kquant_mixed.json | CPU INT4/INT8 mixed k_quant_mixed Olive recipe for Qwen-Qwen3-8B. |
| Qwen-Qwen3-8B/LICENSE | Apache 2.0 license for Qwen-Qwen3-8B assets. |
| Qwen-Qwen3-4B/webgpu/README.md | WebGPU README for Qwen-Qwen3-4B Olive recipes and usage. |
| Qwen-Qwen3-4B/webgpu/Qwen-Qwen3-4B_webgpu_int4_default.json | WebGPU INT4 Olive recipe for Qwen-Qwen3-4B. |
| Qwen-Qwen3-4B/cuda/README.md | CUDA README for Qwen-Qwen3-4B Olive recipes and usage. |
| Qwen-Qwen3-4B/cuda/Qwen-Qwen3-4B_cuda_int4_kquant_last.json | CUDA INT4 k_quant_last Olive recipe for Qwen-Qwen3-4B. |
| Qwen-Qwen3-4B/cpu/README.md | CPU README for Qwen-Qwen3-4B Olive recipes and usage. |
| Qwen-Qwen3-4B/cpu/Qwen-Qwen3-4B_cpu_int4_int8_kquant_mixed.json | CPU INT4/INT8 mixed k_quant_mixed Olive recipe for Qwen-Qwen3-4B. |
| Qwen-Qwen3-4B/LICENSE | Apache 2.0 license for Qwen-Qwen3-4B assets. |
| Qwen-Qwen3-4B-Thinking-2507/webgpu/README.md | WebGPU README for Qwen-Qwen3-4B-Thinking-2507 Olive recipes and usage. |
| Qwen-Qwen3-4B-Thinking-2507/webgpu/Qwen-Qwen3-4B-Thinking-2507_webgpu_int4_default.json | WebGPU INT4 Olive recipe for Qwen-Qwen3-4B-Thinking-2507. |
| Qwen-Qwen3-4B-Thinking-2507/cuda/README.md | CUDA README for Qwen-Qwen3-4B-Thinking-2507 Olive recipes and usage. |
| Qwen-Qwen3-4B-Thinking-2507/cuda/Qwen-Qwen3-4B-Thinking-2507_cuda_int4_kquant_last.json | CUDA INT4 k_quant_last Olive recipe for Qwen-Qwen3-4B-Thinking-2507. |
| Qwen-Qwen3-4B-Thinking-2507/cpu/README.md | CPU README for Qwen-Qwen3-4B-Thinking-2507 Olive recipes and usage. |
| Qwen-Qwen3-4B-Thinking-2507/cpu/Qwen-Qwen3-4B-Thinking-2507_cpu_int4_int8_kquant_mixed.json | CPU INT4/INT8 mixed k_quant_mixed Olive recipe for Qwen-Qwen3-4B-Thinking-2507. |
| Qwen-Qwen3-4B-Thinking-2507/LICENSE | Apache 2.0 license for Qwen-Qwen3-4B-Thinking-2507 assets. |
| Qwen-Qwen3-4B-Instruct-2507/webgpu/README.md | WebGPU README for Qwen-Qwen3-4B-Instruct-2507 Olive recipes and usage. |
| Qwen-Qwen3-4B-Instruct-2507/webgpu/Qwen-Qwen3-4B-Instruct-2507_webgpu_int4_default.json | WebGPU INT4 Olive recipe for Qwen-Qwen3-4B-Instruct-2507. |
| Qwen-Qwen3-4B-Instruct-2507/cuda/README.md | CUDA README for Qwen-Qwen3-4B-Instruct-2507 Olive recipes and usage. |
| Qwen-Qwen3-4B-Instruct-2507/cuda/Qwen-Qwen3-4B-Instruct-2507_cuda_int4_kquant_last.json | CUDA INT4 k_quant_last Olive recipe for Qwen-Qwen3-4B-Instruct-2507. |
| Qwen-Qwen3-4B-Instruct-2507/cpu/README.md | CPU README for Qwen-Qwen3-4B-Instruct-2507 Olive recipes and usage. |
| Qwen-Qwen3-4B-Instruct-2507/cpu/Qwen-Qwen3-4B-Instruct-2507_cpu_int4_int8_kquant_mixed.json | CPU INT4/INT8 mixed k_quant_mixed Olive recipe for Qwen-Qwen3-4B-Instruct-2507. |
| Qwen-Qwen3-4B-Instruct-2507/LICENSE | Apache 2.0 license for Qwen-Qwen3-4B-Instruct-2507 assets. |
| Qwen-Qwen3-32B/webgpu/README.md | WebGPU README for Qwen-Qwen3-32B Olive recipes and usage. |
| Qwen-Qwen3-32B/webgpu/Qwen-Qwen3-32B_webgpu_int4_default.json | WebGPU INT4 Olive recipe for Qwen-Qwen3-32B. |
| Qwen-Qwen3-32B/cuda/README.md | CUDA README for Qwen-Qwen3-32B Olive recipes and usage. |
| Qwen-Qwen3-32B/cuda/Qwen-Qwen3-32B_cuda_int4_kquant_last.json | CUDA INT4 k_quant_last Olive recipe for Qwen-Qwen3-32B. |
| Qwen-Qwen3-32B/cpu/README.md | CPU README for Qwen-Qwen3-32B Olive recipes and usage. |
| Qwen-Qwen3-32B/cpu/Qwen-Qwen3-32B_cpu_int4_int8_kquant_mixed.json | CPU INT4/INT8 mixed k_quant_mixed Olive recipe for Qwen-Qwen3-32B. |
| Qwen-Qwen3-32B/LICENSE | Apache 2.0 license for Qwen-Qwen3-32B assets. |
| Qwen-Qwen3-14B/webgpu/README.md | WebGPU README for Qwen-Qwen3-14B Olive recipes and usage. |
| Qwen-Qwen3-14B/webgpu/Qwen-Qwen3-14B_webgpu_int4_default.json | WebGPU INT4 Olive recipe for Qwen-Qwen3-14B. |
| Qwen-Qwen3-14B/cuda/README.md | CUDA README for Qwen-Qwen3-14B Olive recipes and usage. |
| Qwen-Qwen3-14B/cuda/Qwen-Qwen3-14B_cuda_int4_kquant_last.json | CUDA INT4 k_quant_last Olive recipe for Qwen-Qwen3-14B. |
| Qwen-Qwen3-14B/cpu/README.md | CPU README for Qwen-Qwen3-14B Olive recipes and usage. |
| Qwen-Qwen3-14B/cpu/Qwen-Qwen3-14B_cpu_int4_int8_kquant_mixed.json | CPU INT4/INT8 mixed k_quant_mixed Olive recipe for Qwen-Qwen3-14B. |
| Qwen-Qwen3-14B/LICENSE | Apache 2.0 license for Qwen-Qwen3-14B assets. |
| Qwen-Qwen3-1.7B/webgpu/README.md | WebGPU README for Qwen-Qwen3-1.7B Olive recipes and usage. |
| Qwen-Qwen3-1.7B/webgpu/Qwen-Qwen3-1.7B_webgpu_int4_default.json | WebGPU INT4 Olive recipe for Qwen-Qwen3-1.7B. |
| Qwen-Qwen3-1.7B/cuda/README.md | CUDA README for Qwen-Qwen3-1.7B Olive recipes and usage. |
| Qwen-Qwen3-1.7B/cuda/Qwen-Qwen3-1.7B_cuda_int4_kquant_last.json | CUDA INT4 k_quant_last Olive recipe for Qwen-Qwen3-1.7B. |
| Qwen-Qwen3-1.7B/cpu/README.md | CPU README for Qwen-Qwen3-1.7B Olive recipes and usage. |
| Qwen-Qwen3-1.7B/cpu/Qwen-Qwen3-1.7B_cpu_int4_int8_kquant_mixed.json | CPU INT4/INT8 mixed k_quant_mixed Olive recipe for Qwen-Qwen3-1.7B. |
| Qwen-Qwen3-1.7B/LICENSE | Apache 2.0 license for Qwen-Qwen3-1.7B assets. |
| Qwen-Qwen3-0.6B/webgpu/README.md | WebGPU README for Qwen-Qwen3-0.6B Olive recipes and usage. |
| Qwen-Qwen3-0.6B/webgpu/Qwen-Qwen3-0.6B_webgpu_int4_default.json | WebGPU INT4 Olive recipe for Qwen-Qwen3-0.6B. |
| Qwen-Qwen3-0.6B/cuda/README.md | CUDA README for Qwen-Qwen3-0.6B Olive recipes and usage. |
| Qwen-Qwen3-0.6B/cuda/Qwen-Qwen3-0.6B_cuda_int4_kquant_last.json | CUDA INT4 k_quant_last Olive recipe for Qwen-Qwen3-0.6B. |
| Qwen-Qwen3-0.6B/cpu/README.md | CPU README for Qwen-Qwen3-0.6B Olive recipes and usage. |
| Qwen-Qwen3-0.6B/cpu/Qwen-Qwen3-0.6B_cpu_int4_int8_kquant_mixed.json | CPU INT4/INT8 mixed k_quant_mixed Olive recipe for Qwen-Qwen3-0.6B. |
| Qwen-Qwen3-0.6B/LICENSE | Apache 2.0 license for Qwen-Qwen3-0.6B assets. |
}
},
"engine": { "target": "local_system" },
"passes": {
It would be worth adding other quantization techniques, such as kld_gradient, which is giving us better quality.
Description
This PR adds recipes for all Qwen-3 LLMs on the CPU EP, CUDA EP, and WebGPU EP.
Motivation and Context
The recipes were auto-generated with the following bash script.