Conversation
Pull request overview
This pull request introduces XPU backend compatibility support for PaddleFleet by refactoring CUDA-specific operations to support multiple backends. The PR adds a backend abstraction layer for fused SwiGLU scale operations and guards CUDA-specific code paths in the ops initialization.
Note on PR Metadata:
- Title Format Issue: The PR title "xpu backend run pass" doesn't follow the required format [CLASS]Title. It should be something like "[Feature] XPU backend support" or "[BugFix] XPU backend compatibility".
- Missing Description: The PR lacks a description explaining why these modifications are being made and what problem is being solved. The description should explain the motivation for adding XPU backend support and how the abstraction layer enables multi-backend compatibility.
Changes:
- Created a new backend abstraction layer (`fused_swiglu_scale.py`) that conditionally routes to CUDA-specific implementations
- Updated import paths from direct `paddlefleet.ops` imports to the new abstraction layer
- Added CUDA-specific guards to prevent loading CUDA-only ecosystem libraries (deep_gemm, deep_ep, sonicmoe) on non-CUDA backends
- Added an initialization guard to prevent duplicate backend detection in `backends.py`
- Added a fallback for dependency resolution failures in `setup.py`
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| `src/paddlefleet/fusions/fused_swiglu_scale.py` | New backend abstraction layer providing forward/backward functions with CUDA detection |
| `src/paddlefleet/transformer/moe/fp8_utils.py` | Updated imports to use the new abstraction layer functions instead of direct ops imports |
| `tests/single_card_tests/custom_ops/test_fuse_swiglu_scale.py` | Updated test imports to use the new abstraction layer |
| `src/paddlefleet/ops/__init__.py` | Guarded CUDA-specific ecosystem library loading with backend detection |
| `src/paddlefleet/fusions/fused_bias_swiglu.py` | Added CUDA backend check before importing fused_swiglu_bwd |
| `setup.py` | Added exception handling for dependency resolution failures |
| `backends.py` | Added initialization guard and auto-initialization at module level |
setup.py (outdated)

```python
except Exception:
    # Fallback if dependency resolution fails
    dependencies = common_dependencies
    logging.warning(
        "Failed to resolve special dependencies, using common dependencies only"
    )
```
The broad Exception catch could silently hide important errors during dependency resolution. Consider catching more specific exceptions (e.g., ImportError, ModuleNotFoundError) or at minimum logging the actual exception details to help diagnose issues. For example: logging.warning(f"Failed to resolve special dependencies: {e}, using common dependencies only")
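The narrower catch the comment asks for could look like the sketch below. The resolver function and the placeholder dependency list are hypothetical stand-ins so the snippet runs without the real `setup.py` context:

```python
import logging

common_dependencies = ["paddlepaddle"]  # hypothetical placeholder list


def resolve_special_dependencies():
    # Hypothetical resolver; on a machine without the CUDA toolchain,
    # this is where an ImportError would surface.
    raise ImportError("CUDA toolkit not found")


try:
    dependencies = resolve_special_dependencies()
except (ImportError, ModuleNotFoundError) as e:
    # Fall back, but record the underlying cause instead of swallowing it.
    dependencies = list(common_dependencies)
    logging.warning(
        "Failed to resolve special dependencies (%s); using common dependencies only",
        e,
    )
```

Logging the caught exception `e` preserves the diagnostic trail while still allowing installation to proceed with the common dependency set.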
```python
def fused_swiglu_scale_forward(x, scale):
    if paddle.is_compiled_with_cuda():
        from paddlefleet.ops import fused_swiglu_scale

        return fused_swiglu_scale(x, scale)
```
Consider adding validation or error handling for the case where the import succeeds but the actual CUDA operation fails at runtime. For robustness, you might want to wrap the call in a try-except block and provide a meaningful error message if the CUDA operation encounters issues.
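One way to realize this suggestion is sketched below. The `kernel` parameter is a hypothetical stand-in for the CUDA op imported from `paddlefleet.ops`, injected so the sketch stays runnable without paddle installed:

```python
def fused_swiglu_scale_forward(x, scale, kernel=None):
    # 'kernel' stands in for the imported CUDA op (an assumption of this
    # sketch, not the actual signature in the PR).
    if kernel is None:
        raise NotImplementedError(
            "fused_swiglu_scale_forward requires the CUDA backend"
        )
    try:
        return kernel(x, scale)
    except RuntimeError as e:
        # Surface a meaningful message instead of a bare kernel failure.
        raise RuntimeError(
            f"fused_swiglu_scale CUDA kernel failed: {e}"
        ) from e
```

Re-raising with `from e` keeps the original traceback attached, so the kernel-level failure remains visible while the caller sees a message naming the op.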
```python
def fused_swiglu_scale_forward(x, scale):
    if paddle.is_compiled_with_cuda():
        from paddlefleet.ops import fused_swiglu_scale

        return fused_swiglu_scale(x, scale)
    else:
        raise NotImplementedError


def fused_swiglu_scale_backward(x, scale, out_grad):
    if paddle.is_compiled_with_cuda():
        from paddlefleet.ops import fused_swiglu_scale_bwd

        return fused_swiglu_scale_bwd(x, scale, out_grad)
    else:
        raise NotImplementedError
```
These functions lack docstrings describing their purpose, parameters, and return values. Following the pattern in other fusion files (e.g., fused_bias_swiglu.py lines 32-42, 71-81), add comprehensive docstrings that describe the SwiGLU scale operation, input parameters (x, scale), and return value to help users understand what these functions do.
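A docstring in that spirit might read as follows; the wording about tensor shapes is an assumption about the op's semantics (SwiGLU conventionally splits the last dimension into gate and linear halves), not taken from the PR:

```python
def fused_swiglu_scale_forward(x, scale):
    """Apply the fused SwiGLU activation with output scaling.

    Args:
        x: Input tensor whose last dimension holds the gate and linear
            halves of the SwiGLU projection (assumed layout).
        scale: Scaling factor applied to the activated output.

    Returns:
        The activated, scaled tensor.
    """
    ...  # dispatch to the backend kernel
```

Mirroring the docstring style already used in `fused_bias_swiglu.py` keeps the fusions module self-documenting.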
```python
        return fused_swiglu_scale(x, scale)
    else:
        raise NotImplementedError
```
The NotImplementedError should include a descriptive error message to help users understand what backend is not supported and what alternatives they might have. For example: "fused_swiglu_scale_forward is only supported on CUDA backend. Current backend does not support this operation."
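A minimal sketch of the suggested message, with `cuda_available` standing in for `paddle.is_compiled_with_cuda()` so it runs without paddle:

```python
def fused_swiglu_scale_forward(x, scale, cuda_available=False):
    # 'cuda_available' stands in for paddle.is_compiled_with_cuda().
    if not cuda_available:
        raise NotImplementedError(
            "fused_swiglu_scale_forward is only supported on the CUDA "
            "backend; the current backend does not provide this fused op."
        )
    ...  # delegate to the CUDA kernel here
```

A message naming both the function and the missing backend turns an opaque stack trace into an actionable error.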
```python
        return fused_swiglu_scale_bwd(x, scale, out_grad)
    else:
        raise NotImplementedError
```
The NotImplementedError should include a descriptive error message to help users understand what backend is not supported and what alternatives they might have. For example: "fused_swiglu_scale_backward is only supported on CUDA backend. Current backend does not support this operation."
```python
    if paddle.is_compiled_with_cuda():
        from paddlefleet.ops import fused_swiglu_bwd

        return fused_swiglu_bwd(g, y)
    else:
        logger.error(
            "\033[91m fused_swiglu_bwd is not implemented for this backend! \033[0m"
        )
```
The function logs an error but doesn't raise an exception or return a fallback value. This could lead to silent failures or undefined behavior. Consider either raising a NotImplementedError with a clear message or providing a fallback implementation for non-CUDA backends. The current implementation is inconsistent with the pattern in fused_swiglu_scale.py which raises NotImplementedError.
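The consistent log-and-raise variant could be sketched like this (the function name is illustrative, not the exact code in `fused_bias_swiglu.py`):

```python
import logging

logger = logging.getLogger(__name__)


def fused_swiglu_bwd_fallback(g, y):
    # Sketch of the non-CUDA branch: keep the log line for visibility,
    # but also raise so callers cannot continue with an implicit None.
    msg = "fused_swiglu_bwd is not implemented for this backend"
    logger.error(msg)
    raise NotImplementedError(msg)
```

Without the raise, the caller silently receives `None` as the gradient, which typically only fails much later in the backward pass.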
```python
    )


init_backend_type()
```
Consider adding a module-level docstring or inline comment explaining the purpose of the _initialized flag and why automatic initialization is needed at module import time (line 80). This would help future maintainers understand why init_backend_type() is called at the module level.
Codecov Report

❌ Patch coverage is 74.57%. Your patch status has failed because the patch coverage (74.57%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files:

```
@@ Coverage Diff @@
##           develop     #523   +/- ##
==========================================
  Coverage          ?   74.57%
==========================================
  Files             ?        4
  Lines             ?       59
  Branches          ?        9
==========================================
  Hits              ?       44
  Misses            ?        8
  Partials          ?        7
==========================================
```

Flags with carried forward coverage won't be shown.
```python
    )


init_backend_type()
```
Can the `backends.init_backend_type()` call in build_backend.py be removed now?
That was the original plan. But since the run-once guard was added above, and this may be referenced at different stages, a single init is not guaranteed to take effect globally. Keeping the call is safer, and this interface is generally not used in performance-sensitive paths anyway.
As it stands now, simply importing backend guarantees initialization takes effect, but either way the extra call has little impact.