⚡ Bolt: Optimize rebuild_padding with lazy initialization #6484
Optimize `rebuild_padding` by caching the platform-specific implementation after the first call. This avoids repeated platform checks and imports in the critical path, improving efficiency for model execution. Platform-specific wrappers are created once and stored in the `_rebuild_padding_impl` global variable.

Co-authored-by: ZeyuChen <1371212+ZeyuChen@users.noreply.github.com>
…D check

Optimize `rebuild_padding` by caching the platform-specific implementation after the first call. Includes a PID check to ensure robustness in multiprocessing environments (e.g., fork).
- Caches (pid, implementation) to handle process forks correctly.
- Avoids repeated platform checks and imports in the critical path.
- Uses wrappers for platform-specific argument adaptation.

Co-authored-by: ZeyuChen <1371212+ZeyuChen@users.noreply.github.com>

- Lazy initialize `rebuild_padding` in `fastdeploy/model_executor/pre_and_post_process.py` using a PID-safe caching mechanism to improve performance and robustness in multiprocessing environments.
- Guard `paddle.compat.enable_torch_proxy` calls with `hasattr` checks to fix CI failures on HPU environments where `paddle.compat` is missing.

Co-authored-by: ZeyuChen <1371212+ZeyuChen@users.noreply.github.com>

- Lazy initialize `rebuild_padding` in `fastdeploy/model_executor/pre_and_post_process.py` using a PID-safe caching mechanism to improve performance and robustness.
- Guard `paddle.compat.enable_torch_proxy` calls with `hasattr` checks to fix CI failures on HPU environments where `paddle.compat` is missing.
- Add fallback implementation for `swiglu` in `fastdeploy/model_executor/ops/iluvatar/moe_ops.py` to support environments where `paddle.nn.functional.swiglu` is unavailable.

Co-authored-by: ZeyuChen <1371212+ZeyuChen@users.noreply.github.com>

- Lazy initialize `rebuild_padding` in `fastdeploy/model_executor/pre_and_post_process.py` using a PID-safe caching mechanism to improve performance and robustness.
- Guard `paddle.compat.enable_torch_proxy` calls with `hasattr` checks to fix CI failures on HPU environments where `paddle.compat` is missing.
- Add fallback implementation for `swiglu` in `fastdeploy/model_executor/ops/iluvatar/moe_ops.py` to support environments where `paddle.nn.functional.swiglu` is unavailable.
- Fix code style (black) formatting.

Co-authored-by: ZeyuChen <1371212+ZeyuChen@users.noreply.github.com>

- Lazy initialize `rebuild_padding` in `fastdeploy/model_executor/pre_and_post_process.py` using a PID-safe caching mechanism to improve performance and robustness.
- Guard `paddle.compat.enable_torch_proxy` calls with `hasattr` checks to fix CI failures on HPU environments where `paddle.compat` is missing.
- Add fallback implementation for `swiglu` in `fastdeploy/model_executor/ops/iluvatar/moe_ops.py` to support environments where `paddle.nn.functional.swiglu` is unavailable.
- Fix `ImportError` for `decode_alltoall_transpose` in `fastdeploy/distributed/communication.py` when running on HPU.
- Fix code style (black) formatting.

Co-authored-by: ZeyuChen <1371212+ZeyuChen@users.noreply.github.com>

- Lazy initialize `rebuild_padding` in `fastdeploy/model_executor/pre_and_post_process.py` using a PID-safe caching mechanism to improve performance and robustness.
- Guard `paddle.compat.enable_torch_proxy` calls with `hasattr` checks to fix CI failures on HPU environments where `paddle.compat` is missing.
- Add fallback implementation for `swiglu` in `fastdeploy/model_executor/ops/iluvatar/moe_ops.py` to support environments where `paddle.nn.functional.swiglu` is unavailable.
- Fix `ImportError` for `decode_alltoall_transpose` in `fastdeploy/distributed/communication.py` by guaranteeing definition of names outside `try/except` blocks.
- Fix code style (black) formatting.

Co-authored-by: ZeyuChen <1371212+ZeyuChen@users.noreply.github.com>

- Lazy initialize `rebuild_padding` in `fastdeploy/model_executor/pre_and_post_process.py` using a PID-safe caching mechanism to improve performance and robustness.
- Guard `paddle.compat.enable_torch_proxy` calls with `hasattr` checks to fix CI failures on HPU environments where `paddle.compat` is missing.
- Add fallback implementation for `swiglu` in `fastdeploy/model_executor/ops/iluvatar/moe_ops.py` to support environments where `paddle.nn.functional.swiglu` is unavailable.
- Fix `ImportError` for `decode_alltoall_transpose` in `fastdeploy/distributed/communication.py` by guaranteeing definition of names outside `try/except` blocks.
- Fix code style (black, flake8) formatting.

Co-authored-by: ZeyuChen <1371212+ZeyuChen@users.noreply.github.com>

- Lazy initialize `rebuild_padding` in `fastdeploy/model_executor/pre_and_post_process.py` using a PID-safe caching mechanism to improve performance and robustness.
- Guard `paddle.compat.enable_torch_proxy` calls with `hasattr` checks to fix CI failures on HPU environments where `paddle.compat` is missing.
- Add fallback implementation for `swiglu` in `fastdeploy/model_executor/ops/iluvatar/moe_ops.py` to support environments where `paddle.nn.functional.swiglu` is unavailable.
- Fix `ImportError` for `decode_alltoall_transpose` in `fastdeploy/distributed/communication.py` by guaranteeing definition of names outside `try/except` blocks.
- Add unit test `tests/model_executor/test_rebuild_padding_dispatch.py` to cover platform dispatch logic and PID check, resolving CI coverage failure.
- Fix code style (black, flake8) formatting.

Co-authored-by: ZeyuChen <1371212+ZeyuChen@users.noreply.github.com>

- Lazy initialize `rebuild_padding` in `fastdeploy/model_executor/pre_and_post_process.py` using a PID-safe caching mechanism to improve performance and robustness.
- Guard `paddle.compat.enable_torch_proxy` calls with `hasattr` checks to fix CI failures on HPU environments where `paddle.compat` is missing.
- Add fallback implementation for `swiglu` in `fastdeploy/model_executor/ops/iluvatar/moe_ops.py` to support environments where `paddle.nn.functional.swiglu` is unavailable.
- Fix `ImportError` for `decode_alltoall_transpose` in `fastdeploy/distributed/communication.py` by guaranteeing definition of names outside `try/except` blocks.
- Add unit test `tests/model_executor/test_rebuild_padding_dispatch.py` to cover platform dispatch logic and PID check, resolving CI coverage failure.
- Fix code style (black, flake8, isort) formatting.

Co-authored-by: ZeyuChen <1371212+ZeyuChen@users.noreply.github.com>

Motivation
The `rebuild_padding` function in `fastdeploy/model_executor/pre_and_post_process.py` is called frequently during model execution. The original implementation performed platform checks (`current_platform.is_cuda()`, etc.) and imported the corresponding implementation on every call. This introduced unnecessary overhead.

Modifications
- `rebuild_padding`.
- `_rebuild_padding_impl` to cache the implementation.
- `_rebuild_padding_impl`.

Usage
This is an internal optimization and does not require changes to external usage. The `rebuild_padding` function retains the same signature and behavior.

Accuracy Tests
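
A dispatch check of the kind this section refers to could be sketched with `unittest.mock`. The `LazyDispatcher` class, the `"cuda"`/`"cpu"` keys, and the test names below are hypothetical and self-contained; the PR's actual test mocks `paddle` and `fastdeploy` internals instead.

```python
import unittest
from unittest import mock


class LazyDispatcher:
    """Hypothetical stand-in for a cached platform dispatch."""

    def __init__(self, platform, impls):
        self._platform = platform
        self._impls = impls
        self._impl = None

    def __call__(self, *args):
        if self._impl is None:
            # The platform check runs only on the first call.
            key = "cuda" if self._platform.is_cuda() else "cpu"
            self._impl = self._impls[key]
        return self._impl(*args)


class DispatchTest(unittest.TestCase):
    def test_cuda_impl_selected_once(self):
        platform = mock.Mock()
        platform.is_cuda.return_value = True
        cuda_impl = mock.Mock(return_value="cuda-out")
        cpu_impl = mock.Mock(return_value="cpu-out")
        dispatch = LazyDispatcher(platform, {"cuda": cuda_impl, "cpu": cpu_impl})

        self.assertEqual(dispatch("x"), "cuda-out")
        self.assertEqual(dispatch("y"), "cuda-out")

        # One platform check for two calls, and the other backend is untouched.
        platform.is_cuda.assert_called_once()
        cpu_impl.assert_not_called()
        self.assertEqual(cuda_impl.call_count, 2)
```
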
`test_optimization.py` (mocking `paddle` and `fastdeploy` internals) that the correct platform-specific implementation is called.

Checklist
PR created automatically by Jules for task 4957221508923713247 started by @ZeyuChen