Unable to Run Genesis on AMD Device with GPU Backend #225

Open
Burnling-gx opened this issue Dec 22, 2024 · 4 comments

Comments

@Burnling-gx

I attempted to run the following code:

import os
os.environ['PYOPENGL_PLATFORM'] = 'glx'
 
import genesis as gs

print(gs.get_gpu_device())
gs.init(backend=gs.gpu)

scene = gs.Scene(show_viewer=True)
plane = scene.add_entity(gs.morphs.Plane())
franka = scene.add_entity(
    gs.morphs.MJCF(file='xml/franka_emika_panda/panda.xml'),
)

scene.build()

for i in range(1000):
    scene.step()

The output I received is as follows:

(device(type='cuda'), 'AMD Radeon Graphics', 7.464061737060547)
[Genesis] [10:19:01] [INFO] ╭─────────────────────────────────────────────────────────────────────────────────────╮
[Genesis] [10:19:01] [INFO] │┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉ Genesis ┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉│
[Genesis] [10:19:01] [INFO] ╰─────────────────────────────────────────────────────────────────────────────────────╯
[Genesis] [10:19:01] [INFO] Running on [AMD Radeon Graphics] with backend gs.cuda. Device memory: 7.46 GB.
[W 12/22/24 10:19:01.532 18257] [cuda_driver.cpp:load_lib@36] libcuda.so lib not found.
[W 12/22/24 10:19:01.532 18257] [misc.py:adaptive_arch_select@758] Arch=[<Arch.cuda: 3>] is not supported, falling back to CPU
[Genesis] [10:19:01] [INFO] 🚀 Genesis initialized. 🔖 version: 0.2.0, 🌱 seed: None, 📏 precision: '32', 🐛 debug: False, 🎨 theme: 'dark'.
[Genesis] [10:19:01] [INFO] Scene <679c1fa> created.
[Genesis] [10:19:01] [INFO] Adding <gs.RigidEntity>. idx: 0, uid: <95b9c77>, morph: <gs.morphs.Plane>, material: <gs.materials.Rigid>.
[Genesis] [10:19:01] [INFO] Adding <gs.RigidEntity>. idx: 1, uid: <6e9f60f>, morph: <gs.morphs.MJCF(file='/home/burnling/anaconda3/envs/genesis/lib/python3.10/site-packages/genesis/assets/xml/franka_emika_panda/panda.xml')>, material: <gs.materials.Rigid>.
[Genesis] [10:19:02] [INFO] Building scene <679c1fa>...
Traceback (most recent call last):
  File "/home/burnling/Code/Genesis Learning/hello_genesis.py", line 15, in <module>
    scene.build()
  File "/home/burnling/anaconda3/envs/genesis/lib/python3.10/site-packages/genesis/utils/misc.py", line 38, in wrapper
    return method(self, *args, **kwargs)
  File "/home/burnling/anaconda3/envs/genesis/lib/python3.10/site-packages/genesis/engine/scene.py", line 539, in build
    self._parallelize(n_envs, env_spacing, n_envs_per_row, center_envs_at_origin)
  File "/home/burnling/anaconda3/envs/genesis/lib/python3.10/site-packages/genesis/engine/scene.py", line 576, in _parallelize
    self._envs_idx = torch.arange(self._B, dtype=gs.tc_int, device=gs.device)
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.

[Genesis] [10:19:02] [INFO] 💤 Exiting Genesis and caching compiled kernels...

I noticed the device(type='cuda') in the output. I also tried switching the backend to gs.cuda explicitly, but the same issue occurred.
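For context, ROCm builds of PyTorch expose AMD GPUs through the torch.cuda API, so device(type='cuda') is expected on an AMD device and does not by itself mean that CUDA is in use. A minimal sketch for checking which stack the installed PyTorch targets follows; the helper name describe_torch_backend is hypothetical and not part of Genesis or PyTorch.

```python
def describe_torch_backend() -> str:
    """Report which accelerator stack the installed PyTorch build targets."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    # torch.version.hip is a version string on ROCm builds and None otherwise.
    if getattr(torch.version, "hip", None):
        return "rocm"
    # torch.version.cuda is a version string on CUDA builds and None otherwise.
    if getattr(torch.version, "cuda", None):
        return "cuda"
    return "cpu-only"


print(describe_torch_backend())
```

On a ROCm build this should report "rocm" even though tensors are still created with device='cuda'.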

Upon reviewing the constants.py source code, I found this mapping:

"Linux": {
        backend.cpu: backend.cpu,
        backend.gpu: backend.cuda,
        backend.cuda: backend.cuda,
        backend.vulkan: backend.vulkan,
    },

Based on this, it seems that backend.gpu defaults to backend.cuda, which might not be suitable for AMD GPUs.
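The lookup this implies can be sketched as follows. The Backend enum and the resolve_backend helper are hypothetical stand-ins written for illustration; only the Linux entries are copied from the mapping quoted above, and the real Genesis code may resolve the backend differently.

```python
from enum import Enum


class Backend(Enum):
    """Hypothetical stand-in for the backend enum used in constants.py."""
    cpu = "cpu"
    gpu = "gpu"
    cuda = "cuda"
    vulkan = "vulkan"


# Linux entries copied from the mapping quoted above.
LINUX_BACKENDS = {
    Backend.cpu: Backend.cpu,
    Backend.gpu: Backend.cuda,
    Backend.cuda: Backend.cuda,
    Backend.vulkan: Backend.vulkan,
}


def resolve_backend(requested: Backend) -> Backend:
    """Return the concrete backend the requested one maps to on Linux."""
    return LINUX_BACKENDS[requested]


print(resolve_backend(Backend.gpu).name)     # cuda
print(resolve_backend(Backend.vulkan).name)  # vulkan
```

Under this mapping, gs.gpu and gs.cuda resolve to the same backend on Linux, which matches the identical behavior seen when switching between them.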

How can I properly run Genesis on an AMD GPU? Is there a way to configure the backend for HIP or ROCm instead of CUDA?

@zhouxian
Collaborator

Could you try gs.vulkan?

@Burnling-gx
Author

> could you try gs.vulkan?

I tried gs.vulkan, but the same HIP error occurs:

(device(type='cuda'), 'AMD Radeon Graphics', 7.464061737060547)
[Genesis] [19:02:07] [INFO] ╭─────────────────────────────────────────────────────────────────────────────────────╮
[Genesis] [19:02:07] [INFO] │┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉ Genesis ┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉┈┉│
[Genesis] [19:02:07] [INFO] ╰─────────────────────────────────────────────────────────────────────────────────────╯
[Genesis] [19:02:07] [INFO] Running on [AMD Radeon Graphics] with backend gs.vulkan. Device memory: 7.46 GB.
[Genesis] [19:02:07] [INFO] 🚀 Genesis initialized. 🔖 version: 0.2.0, 🌱 seed: None, 📏 precision: '32', 🐛 debug: False, 🎨 theme: 'dark'.
[Genesis] [19:02:07] [INFO] Scene <86327d2> created.
[Genesis] [19:02:07] [INFO] Adding <gs.RigidEntity>. idx: 0, uid: <4978ca6>, morph: <gs.morphs.Plane>, material: <gs.materials.Rigid>.
[Genesis] [19:02:07] [INFO] Adding <gs.RigidEntity>. idx: 1, uid: <7bbec7f>, morph: <gs.morphs.MJCF(file='/home/burnling/anaconda3/envs/genesis/lib/python3.10/site-packages/genesis/assets/xml/franka_emika_panda/panda.xml')>, material: <gs.materials.Rigid>.
[Genesis] [19:02:08] [INFO] Building scene <86327d2>...
Traceback (most recent call last):
  File "/home/burnling/Code/Genesis Learning/hello_genesis.py", line 15, in <module>
    scene.build()
  File "/home/burnling/anaconda3/envs/genesis/lib/python3.10/site-packages/genesis/utils/misc.py", line 38, in wrapper
    return method(self, *args, **kwargs)
  File "/home/burnling/anaconda3/envs/genesis/lib/python3.10/site-packages/genesis/engine/scene.py", line 539, in build
    self._parallelize(n_envs, env_spacing, n_envs_per_row, center_envs_at_origin)
  File "/home/burnling/anaconda3/envs/genesis/lib/python3.10/site-packages/genesis/engine/scene.py", line 576, in _parallelize
    self._envs_idx = torch.arange(self._B, dtype=gs.tc_int, device=gs.device)
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.

@Burnling-gx
Author

After reviewing this issue, I was able to resolve the problem. Since I am using a gfx1103 GPU (which I confirmed by running rocminfo), I found that setting the environment variable HSA_OVERRIDE_GFX_VERSION to 11.0.0 worked for me.

To apply this fix, I used the following command in my terminal:

export HSA_OVERRIDE_GFX_VERSION=11.0.0

This resolved the HIP error: invalid device function issue, and Genesis now runs correctly with gs.vulkan on my AMD device.
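The same override can probably also be applied from inside the script instead of the shell. The sketch below assumes the ROCm runtime reads HSA_OVERRIDE_GFX_VERSION when PyTorch first initializes the GPU, so the variable has to be set before torch or genesis is imported; the helper name apply_gfx_override is hypothetical, and 11.0.0 is the value that worked for gfx1103 above, not a general recommendation.

```python
import os


def apply_gfx_override(version: str = "11.0.0") -> str:
    """Set HSA_OVERRIDE_GFX_VERSION unless the shell already set it.

    Returns the value that is in effect afterwards.
    """
    return os.environ.setdefault("HSA_OVERRIDE_GFX_VERSION", version)


# Call this before importing torch or genesis (assumption: the ROCm
# runtime only picks the variable up at initialization time).
effective = apply_gfx_override()
print(f"HSA_OVERRIDE_GFX_VERSION={effective}")

# The original script would then continue unchanged, for example:
# import genesis as gs
# gs.init(backend=gs.vulkan)
```

Using setdefault keeps any value already exported in the terminal, so the shell command above still takes precedence.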

@Burnling-gx
Author

By the way, I still get the following warnings when I work with gs.gpu:

[W 12/22/24 19:25:19.692 11951] [cuda_driver.cpp:load_lib@36] libcuda.so lib not found.
[W 12/22/24 19:25:19.692 11951] [misc.py:adaptive_arch_select@758] Arch=[<Arch.cuda: 3>] is not supported, falling back to CPU
