Conversation
Pull request overview
This pull request introduces XPU backend compatibility support for PaddleFleet by refactoring CUDA-specific operations to support multiple backends. The PR adds a backend abstraction layer for fused SwiGLU scale operations and guards CUDA-specific code paths in the ops initialization.
Note on PR Metadata:
- Title Format Issue: The PR title "xpu backend run pass" doesn't follow the required format [CLASS]Title. It should be something like "[Feature] XPU backend support" or "[BugFix] XPU backend compatibility".
- Missing Description: The PR lacks a description explaining why these modifications are being made and what problem is being solved. The description should explain the motivation for adding XPU backend support and how the abstraction layer enables multi-backend compatibility.
Changes:
- Created a new backend abstraction layer (`fused_swiglu_scale.py`) that conditionally routes to CUDA-specific implementations
- Updated import paths from direct `paddlefleet.ops` imports to the new abstraction layer
- Added CUDA-specific guards to prevent loading CUDA-only ecosystem libraries (deep_gemm, deep_ep, sonicmoe) on non-CUDA backends
- Added an initialization guard to prevent duplicate backend detection in `backends.py`
- Added a fallback for dependency resolution failures in `setup.py`
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| `src/paddlefleet/fusions/fused_swiglu_scale.py` | New backend abstraction layer providing forward/backward functions with CUDA detection |
| `src/paddlefleet/transformer/moe/fp8_utils.py` | Updated imports to use the new abstraction layer functions instead of direct ops imports |
| `tests/single_card_tests/custom_ops/test_fuse_swiglu_scale.py` | Updated test imports to use the new abstraction layer |
| `src/paddlefleet/ops/__init__.py` | Guarded CUDA-specific ecosystem library loading with backend detection |
| `src/paddlefleet/fusions/fused_bias_swiglu.py` | Added CUDA backend check before importing fused_swiglu_bwd |
| `setup.py` | Added exception handling for dependency resolution failures |
| `backends.py` | Added initialization guard and auto-initialization at module level |
setup.py (outdated)

```python
except Exception:
    # Fallback if dependency resolution fails
    dependencies = common_dependencies
    logging.warning(
        "Failed to resolve special dependencies, using common dependencies only"
    )
```
The broad Exception catch could silently hide important errors during dependency resolution. Consider catching more specific exceptions (e.g., ImportError, ModuleNotFoundError) or at minimum logging the actual exception details to help diagnose issues. For example: logging.warning(f"Failed to resolve special dependencies: {e}, using common dependencies only")
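The narrower catch the comment asks for could look like the sketch below. The resolver function and the placeholder dependency list are hypothetical stand-ins so the snippet runs without the real `setup.py` context:

```python
import logging

common_dependencies = ["paddlepaddle"]  # hypothetical placeholder list


def resolve_special_dependencies():
    # Hypothetical resolver; on a machine without the CUDA toolchain,
    # this is where an ImportError would surface.
    raise ImportError("CUDA toolkit not found")


try:
    dependencies = resolve_special_dependencies()
except (ImportError, ModuleNotFoundError) as e:
    # Fall back, but record the underlying cause instead of swallowing it.
    dependencies = list(common_dependencies)
    logging.warning(
        "Failed to resolve special dependencies (%s); using common dependencies only",
        e,
    )
```

Logging the caught exception `e` preserves the diagnostic trail while still allowing installation to proceed with the common dependency set.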
```python
def fused_swiglu_scale_forward(x, scale):
    if paddle.is_compiled_with_cuda():
        from paddlefleet.ops import fused_swiglu_scale

        return fused_swiglu_scale(x, scale)
```
Consider adding validation or error handling for the case where the import succeeds but the actual CUDA operation fails at runtime. For robustness, you might want to wrap the call in a try-except block and provide a meaningful error message if the CUDA operation encounters issues.
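One way to realize this suggestion is sketched below. The `kernel` parameter is a hypothetical stand-in for the CUDA op imported from `paddlefleet.ops`, injected so the sketch stays runnable without paddle installed:

```python
def fused_swiglu_scale_forward(x, scale, kernel=None):
    # 'kernel' stands in for the imported CUDA op (an assumption of this
    # sketch, not the actual signature in the PR).
    if kernel is None:
        raise NotImplementedError(
            "fused_swiglu_scale_forward requires the CUDA backend"
        )
    try:
        return kernel(x, scale)
    except RuntimeError as e:
        # Surface a meaningful message instead of a bare kernel failure.
        raise RuntimeError(
            f"fused_swiglu_scale CUDA kernel failed: {e}"
        ) from e
```

Re-raising with `from e` keeps the original traceback attached, so the kernel-level failure remains visible while the caller sees a message naming the op.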
```python
def fused_swiglu_scale_forward(x, scale):
    if paddle.is_compiled_with_cuda():
        from paddlefleet.ops import fused_swiglu_scale

        return fused_swiglu_scale(x, scale)
    else:
        raise NotImplementedError


def fused_swiglu_scale_backward(x, scale, out_grad):
    if paddle.is_compiled_with_cuda():
        from paddlefleet.ops import fused_swiglu_scale_bwd

        return fused_swiglu_scale_bwd(x, scale, out_grad)
    else:
        raise NotImplementedError
```
These functions lack docstrings describing their purpose, parameters, and return values. Following the pattern in other fusion files (e.g., fused_bias_swiglu.py lines 32-42, 71-81), add comprehensive docstrings that describe the SwiGLU scale operation, input parameters (x, scale), and return value to help users understand what these functions do.
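A docstring in that spirit might read as follows; the wording about tensor shapes is an assumption about the op's semantics (SwiGLU conventionally splits the last dimension into gate and linear halves), not taken from the PR:

```python
def fused_swiglu_scale_forward(x, scale):
    """Apply the fused SwiGLU activation with output scaling.

    Args:
        x: Input tensor whose last dimension holds the gate and linear
            halves of the SwiGLU projection (assumed layout).
        scale: Scaling factor applied to the activated output.

    Returns:
        The activated, scaled tensor.
    """
    ...  # dispatch to the backend kernel
```

Mirroring the docstring style already used in `fused_bias_swiglu.py` keeps the fusions module self-documenting.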
```python
        return fused_swiglu_scale(x, scale)
    else:
        raise NotImplementedError
```
The NotImplementedError should include a descriptive error message to help users understand what backend is not supported and what alternatives they might have. For example: "fused_swiglu_scale_forward is only supported on CUDA backend. Current backend does not support this operation."
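A minimal sketch of the suggested message, with `cuda_available` standing in for `paddle.is_compiled_with_cuda()` so it runs without paddle:

```python
def fused_swiglu_scale_forward(x, scale, cuda_available=False):
    # 'cuda_available' stands in for paddle.is_compiled_with_cuda().
    if not cuda_available:
        raise NotImplementedError(
            "fused_swiglu_scale_forward is only supported on the CUDA "
            "backend; the current backend does not provide this fused op."
        )
    ...  # delegate to the CUDA kernel here
```

A message naming both the function and the missing backend turns an opaque stack trace into an actionable error.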
```python
        return fused_swiglu_scale_bwd(x, scale, out_grad)
    else:
        raise NotImplementedError
```
The NotImplementedError should include a descriptive error message to help users understand what backend is not supported and what alternatives they might have. For example: "fused_swiglu_scale_backward is only supported on CUDA backend. Current backend does not support this operation."
```python
    if paddle.is_compiled_with_cuda():
        from paddlefleet.ops import fused_swiglu_bwd

        return fused_swiglu_bwd(g, y)
    else:
        logger.error(
            "\033[91m fused_swiglu_bwd is not implemented for this backend! \033[0m"
        )
```
The function logs an error but doesn't raise an exception or return a fallback value. This could lead to silent failures or undefined behavior. Consider either raising a NotImplementedError with a clear message or providing a fallback implementation for non-CUDA backends. The current implementation is inconsistent with the pattern in fused_swiglu_scale.py which raises NotImplementedError.
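The consistent log-and-raise variant could be sketched like this (the function name is illustrative, not the exact code in `fused_bias_swiglu.py`):

```python
import logging

logger = logging.getLogger(__name__)


def fused_swiglu_bwd_fallback(g, y):
    # Sketch of the non-CUDA branch: keep the log line for visibility,
    # but also raise so callers cannot continue with an implicit None.
    msg = "fused_swiglu_bwd is not implemented for this backend"
    logger.error(msg)
    raise NotImplementedError(msg)
```

Without the raise, the caller silently receives `None` as the gradient, which typically only fails much later in the backward pass.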
```python
    )


init_backend_type()
```
Consider adding a module-level docstring or inline comment explaining the purpose of the _initialized flag and why automatic initialization is needed at module import time (line 80). This would help future maintainers understand why init_backend_type() is called at the module level.
Codecov Report

❌ Patch coverage is 74.57%. Your patch status has failed because the patch coverage (74.57%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files:

```
@@ Coverage Diff @@
##           develop     #523   +/- ##
==========================================
  Coverage          ?   74.57%
==========================================
  Files             ?        4
  Lines             ?       59
  Branches          ?        9
==========================================
  Hits              ?       44
  Misses            ?        8
  Partials          ?        7
==========================================
```

Flags with carried forward coverage won't be shown.
```python
    )


init_backend_type()
```
Can the `backends.init_backend_type()` call in build_backend.py be removed now?
That was the original plan. But since the run-once guard was added above, and this may be referenced at different stages, a single init is not guaranteed to take effect globally. Keeping the call is safer, and this interface is generally not used in performance-sensitive paths anyway.
As it stands now, simply importing backend guarantees initialization takes effect, but either way the extra call has little impact.