Quantized Flux not working #2511

Open
donkey-donkey opened this issue Sep 28, 2024 · 8 comments
Comments

@donkey-donkey

Hi, I'm getting an error on my RTX 4000 Ada machine, which supports BF16 and runs Stable Diffusion just fine. The error comes up with the quantized FLUX update.

It happens whether I run with no model specified, with --model dev, or with --model schnell:

cargo run --features cuda,cudnn --example flux -r -- --height 1024 --width 1024 --prompt "a rusty robot walking on a beach holding a small torch, the robot has the word \"rust\" written on it, high quality, 4k" --model dev

The error:

Tensor[[1, 256], u32, cuda:0]
Error: DriverError(CUDA_ERROR_NOT_FOUND, "named symbol not found") when loading is_u32_bf16

Any ideas?

@LaurentMazare
Collaborator

This error is most likely not due to the model itself but rather to the CUDA setup. The bf16 kernels are guarded by the following preprocessor check:

#if __CUDA_ARCH__ >= 800
...
#endif

This makes the kernels available only when the CUDA arch targeted by the nvcc compiler is at least 8.0 (i.e. __CUDA_ARCH__ >= 800), which is likely not the case in your setup. It would be interesting to see which value __CUDA_ARCH__ takes in your case, as well as the output of the nvidia-smi --query-gpu=compute_cap --format=csv command.
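
If the arch nvcc picks by default turns out to be too low, one thing that may be worth trying (assuming your candle build honors the CUDA_COMPUTE_CAP environment variable when compiling the kernels, which is worth double-checking for your candle version) is forcing it to the card's compute capability and rebuilding, e.g.:

CUDA_COMPUTE_CAP=89 cargo run --features cuda,cudnn --example flux -r -- --height 1024 --width 1024 --prompt "a rusty robot walking on a beach" --model dev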

@donkey-donkey
Author

This machine has 2 GPUs. When I run the Stable Diffusion examples it uses the RTX 4000 Ada, which has an 8.9 compute cap.

$ nvidia-smi --query-gpu=compute_cap --format=csv
compute_cap
6.1
8.9

How do I see the value of __CUDA_ARCH__?

@LaurentMazare
Collaborator

That first GPU is most likely causing the issue. Did you try using CUDA_VISIBLE_DEVICES so that candle can only see the second GPU? (If you're not familiar with it, it's not a candle-specific thing, so you can just google how to use it.)
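
For example, assuming the Ada card is the second device (index 1), as in the nvidia-smi output above:

CUDA_VISIBLE_DEVICES=1 cargo run --features cuda,cudnn --example flux -r -- --height 1024 --width 1024 --prompt "a rusty robot walking on a beach" --model dev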

@super-fun-surf

super-fun-surf commented Sep 30, 2024

When CUDA_VISIBLE_DEVICES is set to the correct device, and I can see in nvidia-smi -l 1 realtime monitoring that the correct GPU is being used, it gets up to about 9GB of memory used and then the same error happens in the middle of the image process:

    Running `target/release/examples/flux --height 1024 --width 1024 --prompt 'a rusty robot walking on a beach holding a small torch, the robot has the word rust written on it, high quality, 4k' --quantized`
[[    3,     9,     3,  9277,    63,  7567,  3214,    30,     3,     9,  2608,
   3609,     3,     9,   422, 26037,     6,     8,  7567,    65,     8,  1448,
      3,  9277,  1545,    30,    34,     6,   306,   463,     6,   314,   157,
      1,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0,     0,     0,     0,     0,     0,     0,     0,     0,
      0,     0,     0]]
Tensor[[1, 256], u32, cuda:0]
Error: DriverError(CUDA_ERROR_NOT_FOUND, "named symbol not found") when loading is_u32_bf16

@LaurentMazare
Collaborator

It's probably good to clean your target directory in case some cached PTX files didn't get rebuilt after setting CUDA_VISIBLE_DEVICES.
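
Something like this removes the whole target directory, including any cached kernels, before the next build:

cargo clean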

@super-fun-surf

I did a clean and it's the same error.
I can monitor in nvidia-smi which card is being used; the older card simply runs out of memory right away.

Not sure where to look next.

@LaurentMazare
Collaborator

Hmm, it seems weird that candle could use the older card if CUDA_VISIBLE_DEVICES points only at the new one; device visibility is handled by the CUDA framework itself, so it's not something candle could bypass. Maybe you're pointing at the wrong device somehow?
Another option would be to point at CUDA device 1 rather than CUDA device 0 in the code.
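
For reference, a minimal standalone sketch of what that could look like with candle_core (a hypothetical snippet, not the flux example itself; the device ordinal and the small bf16 index_select are just there as a sanity check):

use candle_core::{DType, Device, Result, Tensor};

fn main() -> Result<()> {
    // Ordinal 1 should be the Ada card here, assuming the driver enumerates
    // the GPUs in the same order as nvidia-smi; adjust if it doesn't.
    let device = Device::new_cuda(1)?;
    // A small bf16 index_select, i.e. the kind of op that needs the
    // is_u32_bf16 kernel mentioned in the error message.
    let t = Tensor::zeros((4, 8), DType::BF16, &device)?;
    let ids = Tensor::new(&[0u32, 2], &device)?;
    let picked = t.index_select(&ids, 0)?;
    println!("{picked}");
    Ok(())
}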

@super-fun-surf

Actually it is pointing to the right card: CUDA_VISIBLE_DEVICES works as it should, and as I said above I can monitor it in nvidia-smi and see that the correct GPU is being used.
On the correct card it still crashes with the error:

Error: DriverError(CUDA_ERROR_NOT_FOUND, "named symbol not found") when loading is_u32_bf16
