Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't invoke builtin kernels with a period in their name #777

Closed
jansol opened this issue Aug 12, 2024 · 5 comments · Fixed by #779
Closed

Can't invoke builtin kernels with a period in their name #777

jansol opened this issue Aug 12, 2024 · 5 comments · Fixed by #779
Labels

Comments

@jansol
Copy link

jansol commented Aug 12, 2024

Describe the bug
It is possible to create a program with builtin kernels that have periods in their names, but it does not seem to be possible to actually invoke them.

To Reproduce
Steps to reproduce the behavior:

  1. Install PoCL from your package manager. (or you can build it from sources and point the ICD loader at it with OCL_ICD_VENDORS=/path/to/pocl/build/ocl-vendors/pocl-tests.icd and POCL_BUILDING=1 - the latter is needed so that PoCL searches for things in the build directory instead of the usual system paths)
  2. Verify with OCL_ICD_VENDORS=... POCL_BUILDING=1 clinfo that the Portable Computing Language platform shows up and lists at least one CPU device (if the platform shows but is empty and you are building from sources you probably forgot the POCL_BUILDING=1)
  3. Confirm that clinfo says there are builtin kernels in the CPU device and that pocl.add.i8 is one of them
  4. Run The vector addition example with PoCL to confirm it works as expected
  5. Change the numpy array data type from float32 to int8
  6. Replace the OpenCL C kernel with the PoCL builtin one:
bik_prg = cl.create_program_with_built_in_kernels(ctx, [devices[0]], ["pocl.add.i8"])
bik_prg.build()
print(bik_prg.all_kernels())
  1. Observe:
<snip>
  File "<pyopencl invoker for 'pocl.add.i8'>", line 6
    def enqueue_knl_pocl.add.i8(self, queue, global_size, local_size, arg0, arg1, arg2, global_offset=None, g_times_l=False, allow_empty_ndrange=False, wait_for=None):
                        ^
SyntaxError: expected '('

Expected behavior
The example should run identically regardless of whether the OpenCL C kernel or the builtin kernel is used

Environment (please complete the following information):

  • OS: ubuntu 24.04
  • ICD Loader and version: ocl-icd 2.3.2
  • ICD and version: PoCL 6.1-pre main-0-g381550927
  • CPU/GPU: AMD Ryzen 5 1600
  • Python version: 3.12.3
  • PyOpenCL version: 2024.2.7
@jansol jansol added the bug label Aug 12, 2024
@jansol
Copy link
Author

jansol commented Aug 21, 2024

This seems like it should be relatively straightforward to fix in the codegen by mangling the kernel name and e.g. converting all invalid characters to underscores. I can give it a shot, but have no idea where in the code to even begin looking for this.

And FWIW, dots in kernel names do have prior art from Khronos themselves e.g. in OpenVX: org.khronos.openvx.sobel_3x3 (although that is not OpenCL-specific, but an OpenCL implementation should stick to the official names of the kernels as defined in the spec)

@inducer
Copy link
Owner

inducer commented Aug 23, 2024

Thanks for the report. Could you take a look at #779 and let me know if this addresses the issue?

inducer added a commit that referenced this issue Aug 23, 2024
@inducer
Copy link
Owner

inducer commented Aug 23, 2024

While you're at it, could you put together a short test? pocl-specific is OK, just make sure to skip if the ICD is not pocl.

@jansol
Copy link
Author

jansol commented Aug 26, 2024

Here's the snippet I've adapted from the example in the documentation:

#!/usr/bin/env python3

#from dotenv import load_dotenv
import numpy as np
import pyopencl as cl

#load_dotenv() # Set up PoCL environment variables from .env file

a_np = np.array([1,2,3,4], dtype=np.int8)
b_np = np.array([1,2,3,4], dtype=np.int8)

platforms = cl.get_platforms()
platform_idx = -1
for i,p in enumerate(platforms):
    if p.get_info(cl.platform_info.NAME) == "Portable Computing Language":
        platform_idx = i
if platform_idx < 0:
    print("No PoCL platform found, skipping")
    exit()
devices = platforms[platform_idx].get_devices()
ctx = cl.Context([devices[0]])
queue = cl.CommandQueue(ctx)

mf = cl.mem_flags
a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)


#%% OpenCL C kernel

prg = cl.Program(ctx, """
__kernel void sum(
    __global const char *a_g, __global const char *b_g, __global char *res_g)
{
  int gid = get_global_id(0);
  res_g[gid] = a_g[gid] + b_g[gid];
}
""").build()

res_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)
knl = prg.sum  # Use this kernel object for repeated calls
knl(queue, a_np.shape, None, a_g, b_g, res_g)


#%% Built-in kernel

bikres_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)

bik_prg = cl.create_program_with_built_in_kernels(ctx, [devices[0]], ["pocl.add.i8"])
bik_prg.build()
bik_prg = cl.create_program_with_built_in_kernels(ctx, [devices[0]], ["pocl.add.i8"])
bik_prg.build()
for k in bik_prg.all_kernels():
    name = k.get_info(cl.kernel_info.FUNCTION_NAME)
    if name == "pocl.add.i8":
        bik_knl = k

bik_knl(queue, a_np.shape, None, a_g, b_g, bikres_g)

#%% Fetch results

res_np = np.empty_like(a_np)
cl.enqueue_copy(queue, res_np, res_g, wait_for=[], is_blocking=False)

bikres_np = np.empty_like(a_np)
cl.enqueue_copy(queue, bikres_np, res_g, wait_for=[], is_blocking=False)
queue.finish()

print(f"Values: {a_np} + {b_np} = {res_np} = {bikres_np}")
assert np.allclose(res_np, a_np + b_np)
assert np.allclose(bikres_np, a_np + b_np)

@jansol
Copy link
Author

jansol commented Aug 27, 2024

Okay, finally found a good time to check this locally. #779 does indeed make it possible to access the problematic kernels at all, although it is a bit on the awkward side:

bik_prg = cl.create_program_with_built_in_kernels(ctx, [devices[0]], ["pocl.add.i8"])
bik_prg.build()
for k in bik_prg.all_kernels():
    name = k.get_info(cl.kernel_info.FUNCTION_NAME)
    if name == "pocl.add.i8":
        bik_knl = k
bik_knl(queue, a_np.shape, None, a_g, b_g, bikres_g)

Fetching the kernel via the kernel_name attribute is not possible since it gets internally mangled to pocladdi8 but it seems kernel_name gets passed as is to clCreateKernel which of course fails in this case. Something like an __getitem__(self, kernel_name) or a helper function to fetch kernels by (string) name would be nice.

But for the time being this already unblocks me, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants