XLA run out of memory with Bumblebee example #1093

IleanaAldama · 2023-02-11T19:23:06Z

Hi, I'm running the default smart cell that uses Stable Diffusion, but it runs out of memory.

Error messages:

13:16:24.741 [info] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

13:16:24.743 [info] XLA service 0x7fa98c295b50 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:

13:16:24.743 [info]   StreamExecutor device (0): NVIDIA GeForce RTX 2080 Ti, Compute Capability 7.5

13:16:24.743 [info] Using BFC allocator.

13:16:24.743 [info] XLA backend allocating 10251200102 bytes on device 0 for BFCAllocator.

13:21:27.502 [warning] ****************************************************************************************************

13:21:27.502 [error] Execution of replica 0 failed: RESOURCE_EXHAUSTED: Out of memory while trying to allocate 117964800 bytes.
BufferAssignment OOM Debugging.
BufferAssignment stats:
             parameter allocation:  112.50MiB
              constant allocation:         0B
        maybe_live_out allocation:  112.50MiB
     preallocated temp allocation:         0B
                 total allocation:  225.00MiB
              total fragmentation:         0B (0.00%)
Peak buffers:
	Buffer 1:
		Size: 112.50MiB
		Entry Parameter Subshape: f32[29491200]
		==========================

	Buffer 2:
		Size: 112.50MiB
		XLA Label: copy
		Shape: f32[1280,2560,3,3]
		==========================

	Buffer 3:
		Size: 8B
		XLA Label: tuple
		Shape: (f32[1280,2560,3,3])
		==========================
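As a sanity check on the figures in the log above, the failed allocation can be reproduced from the reported buffer shape. This is only a rough sketch: it assumes 4 bytes per `f32` element and takes the 11264 MiB card total from the `nvidia-smi` output in this report; the variable names are made up for illustration.

```python
import math

MIB = 2**20  # bytes per MiB

# Shape of "Buffer 2" in the log: f32[1280,2560,3,3]
shape = (1280, 2560, 3, 3)
elements = math.prod(shape)      # matches f32[29491200] in "Buffer 1"
bytes_needed = elements * 4      # assumption: f32 takes 4 bytes per element

# Size of the pool XLA reserved for the BFC allocator at startup (from the log)
pool_bytes = 10251200102
pool_mib = pool_bytes / MIB

# Total memory reported by nvidia-smi for this card, in MiB
gpu_total_mib = 11264
pool_fraction = pool_mib / gpu_total_mib

print(f"elements:       {elements}")
print(f"failed request: {bytes_needed} bytes = {bytes_needed / MIB} MiB")
print(f"reserved pool:  {pool_mib:.1f} MiB ({pool_fraction:.1%} of the card)")
```

The computed request is exactly the 117964800 bytes (112.50 MiB) named in the error, so the failing step appears to be a copy of a single convolution weight tensor. That a 112.50 MiB request fails suggests the reserved pool was already nearly full at that point.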

System info:

ileana@Potato ~ $ tail ~/.zshrc

export XLA_TARGET=cuda118
export XLA_BUILD=true
export TF_CUDA_PATHS=/opt/cuda,/usr

. /opt/asdf-vm/asdf.sh
ileana@Potato ~ $ nvidia-smi 
Sat Feb 11 13:22:18 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.56.06    Driver Version: 520.56.06    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:3F:00.0 Off |                  N/A |
| 31%   41C    P8    12W / 250W |  10377MiB / 11264MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     81092      C   ...25/.mix/escripts/livebook    10374MiB |
+-----------------------------------------------------------------------------+
ileana@Potato ~ $ pacman -Qi cuda      
Name            : cuda
Version         : 11.8.0-1
Description     : NVIDIA's GPU programming toolkit
Architecture    : x86_64
URL             : https://developer.nvidia.com/cuda-zone
Licenses        : custom:NVIDIA
Groups          : None
Provides        : cuda-toolkit  cuda-sdk  libcudart.so=11.0-64  libcublas.so=11-64
                  libcublas.so=11-64  libcusolver.so=11-64  libcusolver.so=11-64
                  libcusparse.so=11-64  libcusparse.so=11-64
Depends On      : gcc11  opencl-nvidia  nvidia-utils  python
Optional Deps   : gdb: for cuda-gdb [installed]
                  glu: required for some profiling tools in CUPTI [installed]
Required By     : cuda-tools  cudnn
Optional For    : foldingathome  openmpi
Conflicts With  : None
Replaces        : cuda-toolkit  cuda-sdk  cuda-static
Installed Size  : 5.13 GiB
Packager        : Sven-Hendrik Haase <[email protected]>
Build Date      : Wed 05 Oct 2022 01:19:26 PM CDT
Install Date    : Sat 11 Feb 2023 11:59:55 AM CST
Install Reason  : Explicitly installed
Install Script  : Yes
Validated By    : Signature

josevalim · 2023-02-11T19:45:32Z

Yes, we are aiming to improve it here: elixir-nx/bumblebee#147 :)

josevalim closed this as completed Feb 11, 2023

wrgoldstein mentioned this issue Apr 24, 2023

Following fine_tuning.livemd results in OOM on decent hardware elixir-nx/bumblebee#203

Closed
