
AMD: parse the architecture as supplied by gcnArchName #11244

Open
@Haus1 wants to merge 1 commit into master from amd-rework-version
Conversation

@Haus1 commented Jan 14, 2025

The value provided by minor is truncated for AMD, so parse the value returned by gcnArchName instead for an accurate ID.

We can also use the common value for GCN4, gfx800, to avoid missing compatible devices.

This is a follow-up to #11209 and will change the behavior of CDNA3, CDNA, VEGA and GCN4 as they should now be recognized as expected. Of those I only have access to a GCN4 device for testing.
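
For context, a minimal sketch of the approach described above (not the actual patch): read gcnArchName from hipGetDeviceProperties() and parse the numeric part of the gfx prefix instead of relying on the truncated minor field. The helper name is made up, and a plain decimal parse like this misses letter steppings, which comes up later in the thread.

```cpp
// Minimal sketch, not the actual patch: read gcnArchName via the HIP runtime
// and parse the numeric part of the "gfx" prefix instead of props.minor,
// which is truncated on AMD. parse_gfx_arch is a hypothetical helper.
#include <hip/hip_runtime.h>
#include <cstdio>

static int parse_gfx_arch(int device) {
    hipDeviceProp_t props;
    if (hipGetDeviceProperties(&props, device) != hipSuccess) {
        return -1;
    }
    int arch = 0;
    // gcnArchName looks like "gfx906" or "gfx1030" (plus optional feature suffixes).
    if (sscanf(props.gcnArchName, "gfx%d", &arch) != 1) {
        return -1; // could not parse; caller can fall back to props.major/props.minor
    }
    return arch;   // e.g. 906 for gfx906
}
```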

@github-actions bot added the Nvidia GPU (Issues specific to Nvidia GPUs) and ggml (changes relating to the ggml tensor library for machine learning) labels on Jan 14, 2025
@JohannesGaessler (Collaborator) commented:

I don't know at all whether this is the correct way to do it. @IMbackK your input would be appreciated.

@IMbackK (Contributor) commented Jan 16, 2025

Yes, this is more correct; the current code misses the arch step part.
But I have never seen a device report gfx800. rocBLAS only supports and checks for gfx803, which is reported by Fiji and all Polaris variants, and the only other variant I am aware of is gfx802, which is not supported by rocBLAS (or any ROCm component).

Thus, NAK on the change to the gfx8 define.
I will try this PR out on my devices (I have access to gfx803, gfx900, gfx906, gfx908 and gfx1030).

@IMbackK (Contributor) commented Jan 16, 2025

There's also a snag in this PR regarding gfx90a: it reports 9.1 as major/minor, but its gcnArchName is gfx90a, which this PR won't parse correctly. The same goes for others like gfx90c.

So the current code is not correct, but this PR has too many issues to serve as an improvement as-is.
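
To illustrate the pitfall with a sketch (again, not the code in this PR): the trailing stepping character in IDs such as gfx90a or gfx90c is not a decimal digit, so a plain numeric parse stops early. Treating the last character as a hexadecimal digit handles those cases:

```cpp
#include <cctype>
#include <cstdlib>
#include <cstring>
#include <string>

// Sketch: parse "gfx<major><minor><stepping>" where the stepping may be a
// letter, e.g. "gfx90a" -> major 9, minor 0, stepping 10. Not the PR's code.
static bool parse_gcn_arch_name(const char * name, int & major, int & minor, int & stepping) {
    if (std::strncmp(name, "gfx", 3) != 0) {
        return false;
    }
    const char * p   = name + 3;
    const size_t len = std::strcspn(p, ":");   // ignore ":xnack+/-" and ":sramecc+/-" suffixes
    if (len < 3) {
        return false;
    }
    const char s = (char) std::tolower((unsigned char) p[len - 1]);  // stepping: 0-9 or a-f
    const char m = p[len - 2];                                       // minor: 0-9
    if (!std::isxdigit((unsigned char) s) || !std::isdigit((unsigned char) m)) {
        return false;
    }
    stepping = std::isdigit((unsigned char) s) ? s - '0' : s - 'a' + 10;
    minor    = m - '0';
    char * end = nullptr;
    const std::string maj(p, len - 2);                               // "9", "10", "11", ...
    major = (int) std::strtol(maj.c_str(), &end, 10);
    return end != maj.c_str() && *end == '\0' && major > 0;
}
```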

@Haus1 (Author) commented Jan 16, 2025

It appears this returns the full target ID as defined in https://github.com/ROCm/clr/blob/amd-staging/rocclr/device/device.cpp around line 125. This will need to be expanded upon in order to parse out the xnack status and to handle the addition of generics.

If it were possible to retrieve the version stepping directly, that would be preferable to parsing it out of a string. Would the xnack status be of any use here, or can it just be ignored?
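
For reference, a full target ID of that form looks like gfx90a:sramecc+:xnack-, so splitting off the base architecture and the xnack flag could be sketched as follows (the struct and function names are made up for illustration, not part of this PR):

```cpp
#include <string>

// Sketch: split a target ID such as "gfx90a:sramecc+:xnack-" into the base
// architecture and an optional xnack flag. Names are hypothetical.
struct amd_target_id {
    std::string base;       // e.g. "gfx90a"
    int         xnack = 0;  // +1 for xnack+, -1 for xnack-, 0 if unspecified
};

static amd_target_id split_target_id(const std::string & id) {
    amd_target_id t;
    t.base = id.substr(0, id.find(':'));  // substr(0, npos) keeps the whole string
    if (id.find(":xnack+") != std::string::npos) { t.xnack = +1; }
    if (id.find(":xnack-") != std::string::npos) { t.xnack = -1; }
    return t;
}
```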

@IMbackK (Contributor) commented Jan 16, 2025

xnack can be ignored since we don't use hipMallocManaged-allocated memory. Outside of the user recompiling the whole ROCm stack with non-default flags, only gfx942 and gfx90a can end up in xnack+ mode.

@Haus1 (Author) commented Jan 16, 2025

Yeah, they certainly don't make enabling xnack easy. On Linux, the kernel module also needs to be patched to prevent it from rejecting the device.

@Haus1 force-pushed the amd-rework-version branch from 7bd1195 to 468296f on January 17, 2025 at 22:47
@github-actions bot added the script (Script related), testing (Everything test related), and python (python script changes) labels on Jan 17, 2025
@Haus1 force-pushed the amd-rework-version branch from 468296f to 9620bce on January 17, 2025 at 22:54
@Haus1 (Author) commented Jan 17, 2025

This will now work with all the IDs AMD has in staging and will gracefully fall back to the old way if it fails. Please let me know if I've missed anything.

Would it be better to submit backend changes like this to ggml first?
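
A rough sketch of the fallback structure described above, reusing the parser sketched earlier in the thread; the exact encoding of the returned value is an assumption for illustration, not what the PR actually does:

```cpp
// Sketch: prefer the value parsed from gcnArchName and fall back to the
// (truncated) major/minor pair if parsing fails. The 0x100*major + 0x10*minor
// + stepping encoding is assumed for illustration only.
static int amd_arch_id(const hipDeviceProp_t & props) {
    int major = 0, minor = 0, stepping = 0;
    if (parse_gcn_arch_name(props.gcnArchName, major, minor, stepping)) {
        return major * 0x100 + minor * 0x10 + stepping;   // e.g. gfx90a -> 0x90a
    }
    return props.major * 0x100 + props.minor * 0x10;      // old behaviour, minor truncated
}
```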

The value provided by minor is truncated for AMD; parse the value returned by gcnArchName instead to retrieve an accurate ID.

We can also use the common value for GCN4, gfx800, to avoid missing compatible devices.
@Haus1 force-pushed the amd-rework-version branch from 9620bce to f77ea24 on January 18, 2025 at 21:05