v4.2.5
Pre-release

🚨 macOS users may get black images when using LoRAs or IP Adapters. Users with CUDA GPUs may get unexpected OOMs. We are investigating. 🚨
v4.2.5 includes a handful of fixes and improvements, plus one exciting beta node: tiled upscaling via MultiDiffusion.
If you missed v4.2.0, please review its release notes to get up to speed on Control Layers.
Tiled Upscaling via MultiDiffusion
MultiDiffusion is a fairly straightforward technique for tiled denoising. The gist is similar to other tiled upscaling methods: split the input image into tiles, process each independently, and stitch them back together. The main innovation in MultiDiffusion is to do this in latent space, blending the tensors together continually. This results in excellent consistency across the output image, with no seams.
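The stitching idea can be sketched in a few lines. This is not Invoke's implementation, just a minimal numpy illustration of overlap-and-average blending; the hypothetical `denoise_fn` stands in for a single per-tile denoising step:

```python
import numpy as np

def blend_tiles(latents, tile=64, overlap=16, denoise_fn=lambda t: t):
    """Process overlapping tiles independently, then average the results
    so that overlapping regions blend smoothly instead of producing seams."""
    h, w = latents.shape[-2:]
    out = np.zeros_like(latents)
    weights = np.zeros((h, w), dtype=latents.dtype)
    stride = tile - overlap
    for y in range(0, max(h - overlap, 1), stride):
        for x in range(0, max(w - overlap, 1), stride):
            # Clamp the last tile so it stays inside the image.
            y0, x0 = min(y, h - tile), min(x, w - tile)
            patch = denoise_fn(latents[..., y0:y0 + tile, x0:x0 + tile])
            out[..., y0:y0 + tile, x0:x0 + tile] += patch
            weights[y0:y0 + tile, x0:x0 + tile] += 1.0
    # Divide by how many tiles touched each position -> seamless average.
    return out / weights
```

In the real technique this blending happens at every denoising step, not once at the end, which is what keeps the tiles consistent with each other.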
This feature is exposed as a `Tiled MultiDiffusion Denoise Latents` node, currently classified as a beta version. It works much the same as the OG `Denoise Latents` node. Here's a workflow to get you started: sd15_multi_diffusion_esrgan_x2_upscale.json
We are still thinking about how to expose this in the linear UI. Most likely, we will expose it with very minimal settings. If you want to tweak it, use the workflow.
How to use it
This technique is fundamentally the same as normal img2img. Appropriate use of conditioning and control will greatly improve the output. The one hard requirement is to use the Tile ControlNet model.
Besides that, here are some tips from our initial testing:
- Use detail-adding or style LoRAs.
- Use a base model best suited for the desired output style.
- Prompts make a difference.
- The initial upscaling method makes a difference.
- Scheduler makes a difference. Some produce softer outputs.
VRAM Usage
This technique can upscale images to very large sizes without substantially increasing VRAM usage beyond what you'd see for a "normal" sized generation. The VRAM bottlenecks then become the first VAE encode (`Image to Latents`) and final VAE decode (`Latents to Image`) steps.
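To see why the VAE steps dominate, here is a rough back-of-envelope sketch, assuming an SD-style VAE (8x spatial downsampling, 4 latent channels) and fp16 tensors. Real VRAM use also includes the VAE's intermediate activations, which grow with image size the same way:

```python
def tensor_mb(channels: int, height: int, width: int, bytes_per_elem: int = 2) -> float:
    """Size of a single fp16 tensor in MiB."""
    return channels * height * width * bytes_per_elem / 2**20

# Hypothetical 4096x4096 output: tiled denoising works on small latents,
# but the VAE decode must materialize full-resolution tensors.
h = w = 4096
latent_mb = tensor_mb(4, h // 8, w // 8)  # tensor the tiled denoiser sees
image_mb = tensor_mb(3, h, w)             # decoded RGB image tensor
print(f"latents: {latent_mb:.0f} MiB, decoded image: {image_mb:.0f} MiB")
# -> latents: 2 MiB, decoded image: 96 MiB
```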
You may run into OOM errors during these steps. The solution is to enable tiling using the toggle on the `Image to Latents` and `Latents to Image` nodes. This allows the VAE operations to be done piecewise, similar to the tiled denoising process, without using gobs of VRAM.
There's one caveat: VAE tiling often introduces inconsistency across tiles. Textures and colors may differ from tile to tile. This is a function of how diffusers handles VAE tiling, not of the tiled denoising process introduced in v4.2.5. We are investigating ways to improve this.
Takeaway: If your GPU can handle non-tiled VAE encode and decode for a given output size, use that for best results.
📈 Patch Notes for v4.2.5
Enhancements
- When downloading image metadata, graphs, or workflows, the JSON filename now includes the image name and the type of data. Thanks @jstnlowe!
- Add `clear_queue_on_startup` config setting to clear problematic queues. This is useful for a rare edge case where your queue is full of items that somehow crash the app. Set this to true, and the queue will be cleared before the app attempts to execute the problematic item. Thanks @steffy-lo!
- Performance and memory efficiency improvements for LoRA patching and model offloading.
- Addition of simplified model installation methods to the Invocation API: `download_and_cache_model`, `load_local_model` and `load_remote_model`. These methods allow models to be used without needing to add them to the model manager. For example, we are now using these methods to load ESRGAN models.
- Support for probing and loading SDXL VAE checkpoints.
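For reference, the `clear_queue_on_startup` setting above is a simple boolean. Assuming the standard `invokeai.yaml` config file, enabling it would look like this (remove it, or set it back to false, once the stuck queue has been cleared):

```yaml
# invokeai.yaml
clear_queue_on_startup: true
```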
Fixes
- Fix handling of 0-step denoising processes.
- If a control image's processed version is missing when the app loads, it is now re-processed.
Performance improvements
- Improved LoRA patching.
- Improved RAM <-> VRAM model transfer performance.
Internal changes
- The `DenoiseLatentsInvocation` has had its internal methods split up to support tiled upscaling via `MultiDiffusion`. This included some amount of file shuffling and renaming. The `invokeai` package's exported classes should still be the same. Please let us know if this has broken an import for you.
💾 Installation and Updating
To install or update to v4.2.5, download the installer and follow the installation instructions.
To update, select the same installation location. Your user data (images, models, etc) will be retained.
Missing models after updating from v3 to v4
See this FAQ.
Error during installation: `ModuleNotFoundError: No module named 'controlnet_aux'`
See this FAQ.
What's Changed
- Prefixed JSON filenames with the image UUID by @jstnlowe in #6486
- feat(ui): control layers internals cleanup by @psychedelicious in #6487
- LoRA patching optimization by @lstein in #6439
- fix(ui): re-process control image if processed image is missing on page load by @psychedelicious in #6494
- Split up latent.py (code reorganization, no functional changes) by @RyanJDick in #6491
- Add simplified model manager install API to InvocationContext by @lstein in #6132
- fix: Some imports from previous PR's by @blessedcoolant in #6501
- Improve RAM<->VRAM memory copy performance in LoRA patching and elsewhere by @lstein in #6490
- Fix `DEFAULT_PRECISION` handling by @RyanJDick in #6492
- added route to install huggingface models from model marketplace by @chainchompa in #6515
- Model hash validator by @brandonrising in #6520
- Tidy `SilenceWarnings` context manager by @RyanJDick in #6493
- [#6333] Add clear_queue_on_startup config to clear problematic queues by @steffy-lo in #6502
- [MM] Add support for probing and loading SDXL VAE checkpoint files by @lstein in #6524
- Add `TiledMultiDiffusionDenoiseLatents` invocation (for upscaling workflows) by @RyanJDick in #6522
- Update prevention exception message by @hipsterusername in #6543
- Fix handling of 0-step denoising process by @RyanJDick in #6544
- chore: bump version v4.2.5 by @psychedelicious in #6547
Full Changelog: v4.2.4...v4.2.5