FLUX
FLUX.1 family consists of 3 variations:
- Pro: Model weights are NOT released, model is available only via Black Forest Labs
- Dev: Open-weight, guidance-distilled from Pro variation, available for non-commercial applications
- Schnell: Open-weight, timestep-distilled from Dev variation, available under Apache2.0 license
Additionally, SD.Next includes pre-quantized variations of FLUX.1 Dev: qint8, qint4 and nf4
To use any of the variations or quantizations, simply select it from Networks -> Reference and the model will be auto-downloaded on first use
Notes:
- FLUX.1 Dev variant is a gated model, you need to accept the terms and conditions to use it
- Do not download any of the base models manually, use the built-in downloader!
Tip
- Pick a variant that uses less memory, as the model in its original form has very high requirements
- Set an appropriate offloading setting before loading the model to avoid out-of-memory errors
- FLUX.1 is based on flow-matching scheduling, and the only supported sampler is Euler Flow Match (Default)
  Setting any other sampler will be ignored
- Use of FLUX.1 LoRAs is included with limited support
  Not all LoRAs are supported, with more variations coming soon
- FLUX.1 VAE does not support FP16, it is recommended to use BF16 if you have a compatible GPU
  Otherwise, VAE will be upcast to FP32 which takes more memory and time
- To enable image previews during generation, set Settings -> Live Preview -> Method to TAESD
- To further speed up generation, you can disable "full quality", which triggers use of TAESD instead of the full VAE to decode the final image
- To use prompt attention syntax with FLUX.1, set Settings -> Execution -> Prompt attention to xhinker
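The VAE precision rule from the tips above can be written down as a small decision function. This is only an illustrative sketch: `pick_vae_dtype` is a hypothetical helper and not part of SD.Next, and the `bf16_supported` flag is assumed to come from your own hardware check (with PyTorch on CUDA, `torch.cuda.is_bf16_supported()` is one way to obtain it).

```python
def pick_vae_dtype(requested: str, bf16_supported: bool) -> str:
    """Return the precision the FLUX.1 VAE would effectively run in.

    Hypothetical helper mirroring the rule described on this page:
    FP16 is not supported by the FLUX.1 VAE, BF16 is used when the GPU
    supports it, and everything else ends up upcast to FP32.
    """
    if requested.lower() == "bf16" and bf16_supported:
        return "bf16"
    # fp16 (unsupported by the VAE), bf16 on an incompatible GPU, or an
    # explicit fp32 request: the VAE runs in fp32, which costs more
    # memory and time
    return "fp32"
```

For example, `pick_vae_dtype("fp16", True)` falls back to `"fp32"`, which is why selecting BF16 explicitly is the cheaper choice on a compatible GPU.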
Quantization can significantly reduce memory requirements, but it can also slightly reduce quality of outputs
Also, different quantization options are very platform and GPU dependent and are not supported on all platforms
- qint8 and qint4 quantization require optimum-quanto, which will be auto-installed on first use
  note: qint quantization requires torch==2.4.0
  note: qint quantization is not compatible with balanced offload
- nf4 quantization requires bitsandbytes, which will be auto-installed on first use
  note: bitsandbytes package is not compatible with all platforms and GPUs
Example image with both dev and schnell variations and different transformer quantization options
FLUX.1 is a massive model at ~32GB and as such it is recommended to use offloading
To set offloading, see Settings -> Diffusers -> Model offload mode:
- Recommended for compatible high VRAM GPUs: Balanced
  Faster but requires compatible platform and sufficient VRAM
  Not compatible with Quanto qint quantization
- Recommended for low VRAM GPUs: Sequential
  Much slower but allows FLUX.1 to run on GPUs with 6GB VRAM
  Not compatible with Quanto qint or BitsAndBytes nf4 quantization
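The compatibility rules between quantization and offload modes can be captured as a simple lookup. The sketch below only restates the combinations this page lists as incompatible. The names `INCOMPATIBLE`, `is_supported` and `supported_offloads` are hypothetical and are not SD.Next APIs, and the mode strings are assumed labels rather than the exact values shown in the settings UI.

```python
# (quantization, offload mode) pairs that this page lists as incompatible
INCOMPATIBLE = {
    ("qint8", "balanced"),
    ("qint8", "sequential"),
    ("qint4", "balanced"),
    ("qint4", "sequential"),
    ("nf4", "sequential"),
}

# assumed labels for the offload modes discussed on this page
OFFLOAD_MODES = ("none", "balanced", "sequential")


def is_supported(quant: str, offload: str) -> bool:
    """Return True unless the pair is listed as incompatible."""
    if offload not in OFFLOAD_MODES:
        raise ValueError(f"unknown offload mode: {offload}")
    return (quant, offload) not in INCOMPATIBLE


def supported_offloads(quant: str) -> list:
    """List the offload modes that remain usable for a quantization type."""
    return [mode for mode in OFFLOAD_MODES if is_supported(quant, mode)]
```

Calling `supported_offloads("nf4")` leaves `none` and `balanced`, while the Quanto `qint8` and `qint4` types are limited to running without offloading.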
Performance and memory usage of different FLUX.1 variations:
| dtype | time (sec) | performance | memory | offload | note |
|---|---|---|---|---|---|
| bf16 | | | >32 GB | none | *1 |
| bf16 | 50.47 | 0.40 it/s | | balanced | *2 |
| bf16 | 94.28 | 0.21 it/s | 1.89 GB | sequential | |
| nf4 | 14.69 | 1.36 it/s | 17.92 GB | none | |
| nf4 | 21.02 | 0.95 it/s | | balanced | *2 |
| nf4 | | | | sequential | *3 |
| qint8 | 15.42 | 1.30 it/s | 18.85 GB | none | |
| qint8 | | | | balanced | *4 |
| qint8 | | | | sequential | *5 |
| qint4 | 18.37 | 1.09 it/s | 11.38 GB | none | |
| qint4 | | | | balanced | *4 |
| qint4 | | | | sequential | *5 |
Notes:
- *1: Memory usage exceeds 32GB and is not recommended
- *2: Balanced offload VRAM usage is not included since it depends on desired threshold
- *3: BitsAndBytes nf4 quantization is not compatible with sequential offload
Error: Blockwise quantization only supports 16/32-bit floats
- *4: Quanto qint quantization is not compatible with balanced offload
Error: QBytesTensor.new() missing 5 required positional arguments
- *5: Quanto qint quantization is not compatible with sequential offload
Error: Expected all tensors to be on the same device
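As a quick sanity check on the measurements above, the time and it/s columns can be multiplied to recover the approximate number of sampling steps, and the times can be divided to compare configurations. This is a rough sketch: the step count is not stated on this page and is inferred under the assumption that the time column covers the sampling loop only. `MEASURED`, `implied_steps` and `speedup` are illustrative names.

```python
# (dtype, offload) -> (time in seconds, iterations per second),
# copied from the rows of the table that contain measurements
MEASURED = {
    ("bf16", "balanced"): (50.47, 0.40),
    ("bf16", "sequential"): (94.28, 0.21),
    ("nf4", "none"): (14.69, 1.36),
    ("nf4", "balanced"): (21.02, 0.95),
    ("qint8", "none"): (15.42, 1.30),
    ("qint4", "none"): (18.37, 1.09),
}


def implied_steps(seconds: float, its: float) -> int:
    """Approximate step count: seconds * (iterations / second)."""
    return round(seconds * its)


def speedup(baseline: tuple, candidate: tuple) -> float:
    """How many times faster `candidate` is than `baseline`, by total time."""
    return MEASURED[baseline][0] / MEASURED[candidate][0]


if __name__ == "__main__":
    for key, (seconds, its) in MEASURED.items():
        print(key, "implied steps:", implied_steps(seconds, its))
    ratio = speedup(("bf16", "balanced"), ("nf4", "none"))
    print(f"nf4 without offload vs bf16 balanced: {ratio:.2f}x")
```

Every measured row works out to roughly the same step count, which suggests the rows are directly comparable, and nf4 without offloading comes out a little over three times faster than bf16 with balanced offload.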
Many unofficial FLUX.1 variations are already available
Any Diffusers-based variation can be downloaded and loaded into SD.Next using Models -> Huggingface -> Download
For example, an interesting variation is a merge of Dev and Schnell variations by sayakpaul: sayakpaul/FLUX.1-merged
SD.Next includes support for FLUX.1 LoRAs
Since LoRA keys vary significantly between the tools used to train LoRAs as well as between LoRA types,
support for additional LoRAs will be added as needed - please report any non-functional LoRAs!
Loading of single-file safetensors is experimental:
- Supported for transformer (otherwise known as UNet) part of the FLUX.1 model only!
- Safetensors that contain full model with VAE and text-encoder are not supported at the moment and will be added in the future
- Safetensors in pre-quantized format are not supported at the moment and will be added in the future
To load a UNet safetensors file:
- Download safetensors file from desired source and place it in the models/UNET folder
  example: FastFlux Unchained
- Load FLUX.1 model as usual and then
- Replace transformer with one in desired safetensors file using: Settings -> Execution & Models -> UNet
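Before switching the transformer in the UI, it can help to confirm that the file actually landed in the expected folder. The sketch below lists the candidate files. `list_unet_files` is a hypothetical helper and not part of SD.Next, and it assumes the `models` directory path is given relative to wherever your SD.Next installation keeps its models.

```python
from pathlib import Path


def list_unet_files(models_dir: str = "models") -> list:
    """Return sorted names of .safetensors files found in <models_dir>/UNET.

    Hypothetical helper: it only checks the folder contents and does not
    verify that a file is a valid FLUX.1 transformer.
    """
    unet_dir = Path(models_dir) / "UNET"
    if not unet_dir.is_dir():
        # folder missing: nothing has been placed there yet
        return []
    return sorted(p.name for p in unet_dir.glob("*.safetensors"))


if __name__ == "__main__":
    for name in list_unet_files():
        print(name)
```

An empty result means the folder is missing or contains no safetensors files.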
Tip
For convenience, you can add that setting to your quicksettings by adding Settings -> User Interface -> Quicksettings list -> sd_unet
SD.Next allows changing optional text encoder on-the-fly
Go to Settings -> Models -> Text encoder and select the desired text encoder
T5 enhances text rendering and some details, but it's otherwise very lightly used and optional
Loading lighter T5 will greatly decrease model resource usage, but may not be compatible with all offloading modes
Example image with different encoder quantization options
Tip
If you want to frequently switch between text encoders, you can add that setting to quicksettings by adding Settings -> User Interface -> Quicksettings list -> sd_text_encoder
SD.Next allows changing VAE model used by FLUX.1 on-the-fly
There are no alternative VAE models released, so this setting is mostly for future use
Tip
If you want to frequently switch between VAE models, you can add that setting to quicksettings by adding Settings -> User Interface -> Quicksettings list -> sd_vae
Additional core support will be added in diffusers==0.31 and subsequently included in SD.Next:
- Additional LoRAs
- Additional loading of individual safetensors: https://github.com/huggingface/diffusers/pull/9244 https://github.com/huggingface/diffusers/pull/9243
- Diffusers generic quantization: https://github.com/huggingface/diffusers/issues/9174
- IP-Adapter: https://huggingface.co/XLabs-AI/flux-ip-adapter
- ControlNet: https://github.com/huggingface/diffusers/pull/9175 https://github.com/huggingface/diffusers/issues/9301
- Inpaint and Img2Img: https://github.com/huggingface/diffusers/pull/9135
- Differential diffusion: https://github.com/huggingface/diffusers/pull/9268
- Additional schedulers
- GGUF support
- FP8 quantization
© SD.Next