feat: support stable diffusion v1-5 with qnn #234
base: main
Conversation
Pull request overview
Adds an AITK workflow + supporting scripts/configs to optimize and run Stable Diffusion v1.5 with ONNX Runtime QNN EP (plus supporting CPU/CUDA/OpenVINO paths), including data generation for static quantization and model adaptation/monkey-patching for QNN.
Changes:
- Introduces a full `sd-legacy-stable-diffusion-v1-5/aitk` workflow (configs, optimization/inference scripts, evaluation tooling, sample notebook).
- Adds QDQ/QNN pipeline utilities (EP registration, QDQ config shaping, ORT/OpenVINO pipelines, ONNX save/patch helpers).
- Registers the model + dataset in `.aitk` configs and adds SD-specific requirements.
Reviewed changes
Copilot reviewed 30 out of 31 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| sd-legacy-stable-diffusion-v1-5/olive/winml.py | Helper to enumerate WinML execution provider library paths. |
| sd-legacy-stable-diffusion-v1-5/olive/sd_utils/ort.py | Updates Olive footprint filename lookup for optimized ONNX extraction. |
| sd-legacy-stable-diffusion-v1-5/aitk/winml.py | AITK-side helper to enumerate WinML execution provider library paths. |
| sd-legacy-stable-diffusion-v1-5/aitk/user_script.py | Model loaders, input builders, and dataloaders for Olive passes (incl. LoRA merge + QNN patch hook). |
| sd-legacy-stable-diffusion-v1-5/aitk/stable_diffusion.py | End-to-end CLI for optimizing and running SD v1.5 across providers/formats (incl. QDQ). |
| sd-legacy-stable-diffusion-v1-5/aitk/sd_utils/qdq_xl.py | ORT SDXL pipeline wrappers with data-save hooks (for QDQ data generation). |
| sd-legacy-stable-diffusion-v1-5/aitk/sd_utils/qdq.py | QDQ-specific Olive config shaping + ONNX pipeline wrapper + EP registration + QDQ pipeline loader. |
| sd-legacy-stable-diffusion-v1-5/aitk/sd_utils/ov.py | OpenVINO pipeline implementation and Olive config helpers for OV conversion/runtime. |
| sd-legacy-stable-diffusion-v1-5/aitk/sd_utils/ort.py | ORT/CUDA optimization helpers, footprint parsing, and pipeline materialization (incl. QNN ctx bin copy). |
| sd-legacy-stable-diffusion-v1-5/aitk/sd_utils/onnx_patch.py | Patched ONNX model wrapper to support saving external weights alongside ONNX artifacts. |
| sd-legacy-stable-diffusion-v1-5/aitk/sd_utils/config.py | Shared runtime config values (sample sizes, flags, data dir). |
| sd-legacy-stable-diffusion-v1-5/aitk/sd_qnn_workflow.py | AITK workflow driver orchestrating conversion, data generation, and quantized model generation. |
| sd-legacy-stable-diffusion-v1-5/aitk/sd_qnn_workflow.json.config | AITK workflow UI/config template for QNN conversion + quantization + evaluation. |
| sd-legacy-stable-diffusion-v1-5/aitk/sd_qnn_workflow.json | Olive/AITK workflow definition for SD v1.5 QNN target. |
| sd-legacy-stable-diffusion-v1-5/aitk/model_project.config | Registers the workflow in the model project configuration. |
| sd-legacy-stable-diffusion-v1-5/aitk/model_adaptations.py | QNN-focused UNet monkey-patches (attention/activations/norm/proj changes) for compatibility/perf. |
| sd-legacy-stable-diffusion-v1-5/aitk/info.yml | AITK metadata for the SD v1.5 QNN recipe. |
| sd-legacy-stable-diffusion-v1-5/aitk/inference_sample.ipynb | Sample notebook demonstrating QNN EP registration and inference. |
| sd-legacy-stable-diffusion-v1-5/aitk/evaluation.py | Evaluation/data-generation script (CLIP/FID/MSE/HPSv2 hooks, dataset streaming). |
| sd-legacy-stable-diffusion-v1-5/aitk/config_vae_encoder.json | Olive config for VAE encoder conversion/optimization/quantization. |
| sd-legacy-stable-diffusion-v1-5/aitk/config_vae_decoder.json | Olive config for VAE decoder conversion/optimization/quantization (+ optional EP context bin). |
| sd-legacy-stable-diffusion-v1-5/aitk/config_unet.json | Olive config for UNet conversion/optimization/quantization (+ optional EP context bin). |
| sd-legacy-stable-diffusion-v1-5/aitk/config_text_encoder.json | Olive config for text encoder conversion/optimization/quantization (+ surgery/context bin). |
| sd-legacy-stable-diffusion-v1-5/aitk/config_safety_checker.json | Olive config for safety checker conversion/optimization. |
| sd-legacy-stable-diffusion-v1-5/aitk/README.md | Usage documentation for data generation, optimization, and evaluation. |
| sd-legacy-stable-diffusion-v1-5/aitk/.gitignore | Ignores generated caches, artifacts, and results. |
| .aitk/scripts/project_processor.py | Adds HuggingFace icon mapping for the SD v1.5 model family key. |
| .aitk/requirements/requirements-WCR-SD.txt | Adds SD workflow runtime requirements (accelerate/diffusers/torch-fidelity pins). |
| .aitk/configs/model_list.json | Registers SD v1.5 model entry and adds dataset link for phiyodr/coco2017. |
| .aitk/configs/checks.json | Updates check counters to reflect the new model/workflow assets. |
```python
optimized_model_dir = script_dir / "models" / "optimized" / model_id

if common_args.clean_cache:
    shutil.rmtree(common_args.cache_dir, ignore_errors=True)
```
Copilot AI (Feb 10, 2026)
If --clean_cache is provided without --cache_dir, shutil.rmtree(common_args.cache_dir, ...) will raise a TypeError because cache_dir is None. Consider either making --cache_dir required when --clean_cache is set, or defaulting to a known cache path (e.g., script_dir / "cache") and guarding against None.
Suggested change:

```diff
-    shutil.rmtree(common_args.cache_dir, ignore_errors=True)
+    # If no cache_dir was provided, default to a "cache" directory under script_dir.
+    cache_dir = common_args.cache_dir or (script_dir / "cache")
+    if cache_dir is not None:
+        shutil.rmtree(cache_dir, ignore_errors=True)
```
```python
worker_script = os.path.abspath('winml.py')
result = subprocess.check_output([sys.executable, worker_script], text=True)
paths = json.loads(result)
```
Copilot AI (Feb 10, 2026)
worker_script = os.path.abspath('winml.py') depends on the current working directory, so this can fail (or pick up an unintended/malicious winml.py) when the script is launched from another directory. Use a path relative to this module (e.g., Path(__file__).resolve().parents[1] / "winml.py" or similar) to ensure the intended helper is executed.
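A CWD-independent variant could anchor the lookup at the module's own location. A minimal sketch (`sibling_script` is a hypothetical helper name, not code from the PR):

```python
from pathlib import Path

def sibling_script(module_file: str, name: str) -> Path:
    """Resolve a helper script that lives next to the given module,
    independent of the process's current working directory."""
    # Path(module_file).resolve() normalizes to an absolute path, so the
    # result does not change when the CLI is launched from elsewhere.
    return Path(module_file).resolve().parent / name
```

Call it as `sibling_script(__file__, "winml.py")` and pass `str(...)` to `subprocess.check_output`, so the intended helper runs regardless of the caller's working directory.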
```python
if qdq_args.save_data:
    pipeline.save_data_dir = script_dir / qdq_args.data_dir / common_args.prompt
    os.makedirs(pipeline.save_data_dir, exist_ok=True)
else:
```
Copilot AI (Feb 10, 2026)
pipeline.save_data_dir = script_dir / qdq_args.data_dir / common_args.prompt uses the raw prompt as a directory name. Prompts can contain path separators or characters invalid on Windows, which can break saving or allow writing outside the intended directory. Sanitize/slugify the prompt (or hash it) before using it in a filesystem path.
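One way to make the prompt filesystem-safe is a slug plus a short hash, so prompts that sanitize to the same slug still get distinct directories. A sketch (the helper name is hypothetical):

```python
import hashlib
import re

def prompt_to_dirname(prompt: str, max_len: int = 40) -> str:
    """Map an arbitrary prompt to a safe directory name."""
    # Collapse everything outside [A-Za-z0-9] (path separators, colons,
    # dots) into hyphens, then cap the length for Windows path limits.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", prompt).strip("-")[:max_len]
    # A short digest of the raw prompt keeps sanitized-but-distinct
    # prompts from colliding on disk.
    digest = hashlib.sha1(prompt.encode("utf-8")).hexdigest()[:8]
    return f"{slug}-{digest}" if slug else digest
```

The save path would then be `script_dir / qdq_args.data_dir / prompt_to_dirname(common_args.prompt)`, which cannot escape the data directory.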
```markdown
### Test and evaluate

`python .\evaluation.py --script_dir .\ --model_id stable-diffusion-v1-5/stable-diffusion-v1-5 --num_inference_steps 25 --seed 0 --num_data 100 --guidance_scale 7.5 --provider QNNExecutionProvider --model_dir optimized-qnn_qdq`
```
Copilot AI (Feb 10, 2026)
The README’s evaluation example uses --model_dir optimized-qnn_qdq, but stable_diffusion.py writes optimized models under models/optimized/<model_id> (no provider/format suffix). As written, the evaluation command will look in a directory that is never created; update the README command (or align the output directory naming in code).
Suggested change:

```diff
-`python .\evaluation.py --script_dir .\ --model_id stable-diffusion-v1-5/stable-diffusion-v1-5 --num_inference_steps 25 --seed 0 --num_data 100 --guidance_scale 7.5 --provider QNNExecutionProvider --model_dir optimized-qnn_qdq`
+`python .\evaluation.py --script_dir .\ --model_id stable-diffusion-v1-5/stable-diffusion-v1-5 --num_inference_steps 25 --seed 0 --num_data 100 --guidance_scale 7.5 --provider QNNExecutionProvider --model_dir models/optimized/stable-diffusion-v1-5/stable-diffusion-v1-5`
```
```python
if not common_args.optimize:
    model_dir = unoptimized_model_dir if common_args.test_unoptimized else optimized_model_dir
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        if provider == "openvino":
            from sd_utils.ov import get_ov_pipeline

            pipeline = get_ov_pipeline(common_args, ov_args, optimized_model_dir)
        elif common_args.format == "qdq":
            from sd_utils.qdq import get_qdq_pipeline

            pipeline = get_qdq_pipeline(model_dir, common_args, qdq_args, script_dir)
        else:
```
Copilot AI (Feb 10, 2026)
The QDQ export path uses fixed batch=1 shapes (see dynamic_shape_to_fixed in the configs), but the CLI still allows --batch_size > 1 and passes it through to get_qdq_pipeline. This will likely fail at runtime with shape mismatches. Consider enforcing batch_size==1 when --format qdq (or documenting/handling larger batches).
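Failing fast at argument-parsing time would surface the mismatch before any expensive export runs. A sketch of such a guard (the helper name is hypothetical; the flag names mirror the CLI discussed above):

```python
import argparse

def validate_qdq_batch(args: argparse.Namespace) -> None:
    """Reject batch sizes the fixed-shape QDQ export cannot serve."""
    # The QDQ configs pin the graph to batch=1 via dynamic_shape_to_fixed,
    # so any larger batch would fail later with a shape mismatch.
    if args.format == "qdq" and args.batch_size != 1:
        raise SystemExit("--format qdq exports fixed batch=1 graphs; rerun with --batch_size 1")
```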
```python
elif provider == "qnn" and submodel_name not in ("vae_encoder"):
    config["systems"]["local_system"]["accelerators"][0]["device"] = "npu"
    config["systems"]["local_system"]["accelerators"][0]["execution_providers"] = ["QNNExecutionProvider"]
    config["passes"]["convert"]["target_opset"] = 20

    # Quantization params
    if submodel_name not in ("text_encoder"):
```
Copilot AI (Feb 10, 2026)
The condition submodel_name not in ("vae_encoder") is using a string instead of a 1-element tuple, so it performs substring membership rather than comparing names. This is brittle and can lead to incorrect branching if submodel_name ever changes; use submodel_name != "vae_encoder" (or not in ("vae_encoder",)) instead. Same issue for the ("text_encoder") check below.
Suggested change:

```diff
-elif provider == "qnn" and submodel_name not in ("vae_encoder"):
+elif provider == "qnn" and submodel_name != "vae_encoder":
     config["systems"]["local_system"]["accelerators"][0]["device"] = "npu"
     config["systems"]["local_system"]["accelerators"][0]["execution_providers"] = ["QNNExecutionProvider"]
     config["passes"]["convert"]["target_opset"] = 20
     # Quantization params
-    if submodel_name not in ("text_encoder"):
+    if submodel_name != "text_encoder":
```
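The pitfall is easy to confirm in isolation: parentheses without a trailing comma do not create a tuple, so `not in` degrades to substring search on a plain string:

```python
# ("vae_encoder") is just the string "vae_encoder"; the parentheses are
# plain grouping, so `in` / `not in` perform substring search.
assert ("vae_encoder") == "vae_encoder"
assert isinstance(("vae_encoder",), tuple)   # trailing comma makes a real tuple
assert "vae" in ("vae_encoder")              # substring hit: wrong branch taken
assert "vae" not in ("vae_encoder",)         # tuple membership behaves as intended
```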
```python
else:
    if "src_height" in meta:
        orig_height, orig_width = meta["src_height"], meta["src_width"]
        image = [cv2.resize(img, (orig_width, orig_width)) for img in image]
```
Copilot AI (Feb 10, 2026)
cv2.resize(img, (orig_width, orig_width)) uses the width value for both dimensions, which will distort images when orig_height != orig_width. This should resize to (orig_width, orig_height) (and note OpenCV expects size as (width, height)).
Suggested change:

```diff
-image = [cv2.resize(img, (orig_width, orig_width)) for img in image]
+image = [cv2.resize(img, (orig_width, orig_height)) for img in image]
```
```python
        model (nn.Module): The model in which to replace Attention modules.

    """
    traverse_and_replace(model, attention_processor.Attention, lambda orig_attn: SHAAttention(orig_attn))
```
Copilot AI (Feb 10, 2026)
This 'lambda' is just a simple wrapper around a callable object. Use that object directly.
Suggested change:

```diff
-traverse_and_replace(model, attention_processor.Attention, lambda orig_attn: SHAAttention(orig_attn))
+traverse_and_replace(model, attention_processor.Attention, SHAAttention)
```
```python
try:
    shutil.copyfile(src_path, dst_path)
except shutil.SameFileError:
```
Copilot AI (Feb 10, 2026)
'except' clause does nothing but pass and there is no explanatory comment.
```python
dst_path = Path(save_directory).joinpath(ONNX_EXTERNAL_WEIGHTS_NAME)
try:
    shutil.copyfile(src_path, dst_path)
except shutil.SameFileError:
```
Copilot AI (Feb 10, 2026)
'except' clause does nothing but pass and there is no explanatory comment.
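If the swallow is intentional, a comment explaining why makes that explicit and satisfies the linter's concern. A sketch of the pattern (the helper name is hypothetical):

```python
import shutil

def copy_external_weights(src_path, dst_path):
    """Copy the external-weights file next to the saved ONNX model."""
    try:
        shutil.copyfile(src_path, dst_path)
    except shutil.SameFileError:
        # Source and destination resolve to the same file, e.g. when the
        # model is re-saved into the directory it was loaded from; the
        # weights are already in place, so skipping the copy is correct.
        pass
```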
```python
copy_olive_config(history_folder, config_name, cache_dir, output_dir, activation_type, precision)

# run stable_diffusion.py to generate onnx unoptimized model
subprocess.run([sys.executable, "stable_diffusion.py",
```
We should follow the whisper recipe here: share the original model and skip regeneration if it already exists.
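The skip-if-exists pattern the comment suggests could look like this sketch (the helper, flag, and directory names are assumptions, not the actual whisper recipe code):

```python
import subprocess
import sys
from pathlib import Path

def ensure_unoptimized_model(model_dir: Path, script_dir: Path) -> None:
    """Regenerate the shared unoptimized ONNX model only when missing."""
    if model_dir.exists():
        return  # shared artifact already present; skip the expensive export
    subprocess.run(
        [sys.executable, "stable_diffusion.py", "--optimize"],
        cwd=script_dir,
        check=True,
    )
```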
```python
# # run evaluation.py to generate data
subprocess.run([sys.executable, "evaluation.py",
    "--script_dir", history_folder,
```
Same for the generated data: share it so there is no need to rerun if it already exists.
```diff
@@ -0,0 +1,21 @@
+import json
```
remove
Do we need this file? If not, please remove it, since it is large.
No description provided.