RTX 4090 benchmarks - FLUX model #4571
-
RTX 4080 / torch==2.5.0.dev20240821+cu124 / python 3.12 / Ubuntu 24.04
-
Flux in Q8_0 format looks very close to FP16 (better quality than FP8) and may even be faster. Generation will probably get even faster in the future ( #4538 (comment) ): https://github.com/city96/ComfyUI-GGUF
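For context on why Q8_0 tracks FP16 so closely: GGUF's Q8_0 stores weights in blocks of 32 int8 values with a single FP16 scale per block. Below is a rough NumPy sketch of that round trip, purely illustrative; the real ggml/GGUF kernels are C/CUDA and this is not the actual implementation:

```python
import numpy as np

# Illustrative Q8_0 round trip: 32-weight blocks, one FP16 scale per block.
def q8_0_quantize(block: np.ndarray):
    """block: 32 float32 weights -> (fp16 scale, int8 quants)."""
    scale = np.abs(block).max() / 127.0
    q = np.zeros(32, np.int8) if scale == 0 else np.round(block / scale).astype(np.int8)
    return np.float16(scale), q

def q8_0_dequantize(scale, q):
    return np.float32(scale) * q.astype(np.float32)

w = np.random.randn(32).astype(np.float32)
s, q = q8_0_quantize(w)
err = np.abs(w - q8_0_dequantize(s, q)).max()
print(f"max abs error: {err:.5f}")  # at most ~scale/2, i.e. tiny relative to the block's largest weight
```

With 8 bits per weight the worst-case rounding error per block is about half the scale, which is why the output is hard to tell apart from FP16, unlike the coarser FP8 formats.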
-
Updating from PyTorch 2.4.0+cu121 to the latest 2.5.0.dev+cu124 boosted generation speed by about 10%. 1024x1024, 20 steps, now (results in screenshots):
- 2.52 GHz at 875 mV, drawing ~290 W
- 2.8 GHz at 1000 mV, drawing ~400 W
With PyTorch 2.4 I was at about 1.9 t/s and 260-270 W at the heavily undervolted 2.52 GHz clock speed I normally use. All of the above results are with FP8 --fast, but GGUF Q8 seems to have gotten a similar ~10% speed bump with 2.5.0.dev+cu124.
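For anyone reproducing this, it is worth confirming which build is actually active before comparing numbers. A quick sanity check, assuming the standard PyTorch nightly index (adjust the CUDA suffix to your setup):

```python
# Verify the expected PyTorch build is active before benchmarking.
# Install the cu124 nightly first, e.g.:
#   pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu124
import torch

print(torch.__version__)              # e.g. 2.5.0.dev20240821+cu124
print(torch.version.cuda)             # e.g. 12.4
print(torch.cuda.get_device_name(0))  # e.g. NVIDIA GeForce RTX 4090
```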
-
3090 using 400W :(
-
The problem is that everyone has different configurations, and my ComfyUI setup was a mess. The FLUX model took a long time to load, but I was able to fix it.
My PC Specifications:
Processor: Intel i9-12900K @ 3.20 GHz
Memory: 64.0 GB (63.7 GB usable)
GPU: NVIDIA RTX 4090
Comfy log: (attached)
Goal:
I want to see if my setup is one of the fastest in the community, using the same workflows and models.
Workflow: (attached)
Workflow screenshot: (attached)
My RTX 4090 Results:
I ran each test twice. The first run took longer because of model loading and other initialization:
Run 1: Prompt executed in 41.14 seconds
Runs 2 and later: Prompt executed in 18.42 seconds
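Rough arithmetic on the warm run: 20 steps in 18.42 seconds is about 1.09 it/s as a lower bound, since that time still includes text-encode and VAE overhead. If anyone wants to time just the GPU work outside ComfyUI, here is a minimal timing pattern; the workload below is a stand-in matmul, not ComfyUI internals:

```python
import time
import torch

def time_gpu(fn, warmup=1, iters=5):
    """Time a CUDA workload correctly: synchronize before reading the
    clock, and discard warm-up iterations that include model load/compile."""
    for _ in range(warmup):
        fn()
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(iters):
        fn()
    torch.cuda.synchronize()
    return (time.perf_counter() - t0) / iters

# Stand-in workload; replace with your actual sampling step.
x = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
print(f"{time_gpu(lambda: x @ x):.4f} s per iteration")
```

Skipping the synchronize calls is the classic mistake here: CUDA launches are asynchronous, so without them you mostly measure kernel launch time rather than execution time.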