This PR contains the following updates:
cache_dit ==1.1.10 → ==1.2.2

Release Notes
vipshop/cache-dit (cache_dit)
v1.2.2
What's Changed
Full Changelog: vipshop/cache-dit@v1.2.1...v1.2.2
v1.2.1: USP, 2D/3D Parallel
🎉 The v1.2.1 release is ready. The major updates include: Ring Attention with batched P2P, USP (hybrid Ring and Ulysses), hybrid 2D and 3D parallelism (💥USP + TP), and reduced VAE-P communication overhead.
What's Changed
New Contributors
Full Changelog: vipshop/cache-dit@v1.2.0...v1.2.1
v1.2.0: Major Release: NPU, TE-P, VAE-P, CN-P, ...
Overviews
v1.2.0 is a major release following v1.1.0. It introduces many updates that further improve the ease of use and performance of Cache-DiT. We sincerely thank the contributors of Cache-DiT. The main updates are as follows:
🔥New Models Support
🔥Request level cache context
If you need a different num_inference_steps for each user request instead of a fixed value, use it in conjunction with the refresh_context API: before performing inference for each request, update the cache context based on the actual number of steps. Please refer to 📚run_cache_refresh as an example.

🔥HTTP Serving Support
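Combining the two features above, a per-request handler might refresh the cache context like the self-contained sketch below. FakePipeline is only a stand-in so the sketch runs on its own; the real code would use a diffusers pipeline with cache-dit's cache enabled, and the exact signature of the refresh_context API is not shown in the notes, so the method here is an assumption.

```python
class FakePipeline:
    """Stand-in for a cache-enabled diffusers pipeline (illustration only)."""

    def __init__(self):
        self.context_steps = None  # steps the cache context was built for

    def refresh_context(self, num_inference_steps: int):
        # In cache-dit this re-initializes the cached-step bookkeeping so it
        # matches the actual step count of the incoming request.
        self.context_steps = num_inference_steps

    def __call__(self, prompt: str, num_inference_steps: int) -> str:
        assert self.context_steps == num_inference_steps, "stale cache context"
        return f"image({prompt!r}, steps={num_inference_steps})"


def handle_request(pipe, prompt: str, steps: int) -> str:
    # Refresh before every request, because steps vary per request.
    pipe.refresh_context(num_inference_steps=steps)
    return pipe(prompt, num_inference_steps=steps)


pipe = FakePipeline()
print(handle_request(pipe, "a cat", 28))  # image('a cat', steps=28)
print(handle_request(pipe, "a dog", 50))  # image('a dog', steps=50)
```

The point of the pattern is simply that the refresh happens once per request, before inference, rather than once at startup.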
🔥Context Parallelism Optimization
🔥Text Encoder Parallelism
Currently, cache-dit supports text encoder parallelism for the T5Encoder, UMT5Encoder, Llama, Gemma 1/2/3, Mistral, Mistral-3, Qwen-3, Qwen-2.5 VL, Glm, and Glm-4 model series; in other words, it supports almost 🔥ALL pipelines in diffusers.
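This is enabled through the extra_parallel_modules parameter described next. The snippet below is a sketch only: the release notes name the parameter, but the dict layout and module names here are illustrative assumptions, not the confirmed cache-dit API.

```python
# Hypothetical shape of parallelism_config: only the key
# "extra_parallel_modules" is named in the release notes; everything
# else below is an illustrative assumption.
parallelism_config = {
    "extra_parallel_modules": [
        "text_encoder",  # e.g. the text encoder of Flux2Pipeline (TE-P)
        "vae",           # the auto encoder (VAE-P)
    ],
}

# Modules listed here would be parallelized across GPUs in addition to
# the main transformer, reducing per-GPU memory for those components.
for module in parallelism_config["extra_parallel_modules"]:
    print(module)
```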
Users can set the extra_parallel_modules parameter in parallelism_config (when using Tensor Parallelism or Context Parallelism) to specify additional modules that need to be parallelized beyond the main transformer, e.g., text_encoder in Flux2Pipeline. This can further reduce the per-GPU memory requirement and slightly improve the inference performance of the text encoder.

🔥Auto Encoder (VAE) Parallelism
Currently, cache-dit supports auto encoder (VAE) parallelism for the AutoencoderKL, AutoencoderKLQwenImage, AutoencoderKLWan, and AutoencoderKLHunyuanVideo series; in other words, it supports almost 🔥ALL pipelines in diffusers. This can further reduce the per-GPU memory requirement and slightly improve the inference performance of the auto encoder. Users can set it via the extra_parallel_modules parameter in parallelism_config.

🔥ControlNet Parallelism
Furthermore, cache-dit even supports ControlNet parallelism for specific models, such as Z-Image-Turbo with ControlNet. Users can set it via the extra_parallel_modules parameter in parallelism_config.

🔥Ascend NPU Support
Cache-DiT now provides native support for Ascend NPU (by @gameofdimension @luren55 @DefTruth). Theoretically, nearly all models supported by Cache-DiT can run on Ascend NPU with most of Cache-DiT’s optimization technologies, including:
Please refer to Ascend NPU Supported Matrix for more details.
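As a trivial illustration of targeting the NPU backend, device selection can be guarded on whether the optional torch_npu package is installed (assumption: torch_npu registers the "npu" device with PyTorch; cache-dit's actual device handling is not shown in the notes, and the CUDA fallback below is deliberately simplistic):

```python
import importlib.util


def pick_device() -> str:
    """Pick a device string for a pipeline (illustrative sketch only)."""
    # Assumption: installing torch_npu registers the "npu" backend in PyTorch.
    if importlib.util.find_spec("torch_npu") is not None:
        return "npu"
    # Simplistic fallback for illustration; real code would also check
    # torch.cuda.is_available().
    if importlib.util.find_spec("torch") is not None:
        return "cuda"
    return "cpu"


print(pick_device())
```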
🔥Community Integration
Full Changelogs
SCM - step computation mask by @DefTruth in #450
SCM - step computation mask by @DefTruth in #451

Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR was generated by Mend Renovate. View the repository job log.