Can't invoke builtin kernels with a period in their name #777
Comments
This seems like it should be relatively straightforward to fix in the codegen by mangling the kernel name, e.g. by converting all invalid characters to underscores. I can give it a shot, but I have no idea where in the code to even begin looking. And FWIW, dots in kernel names do have prior art from Khronos themselves, e.g. in OpenVX.
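The mangling idea above can be sketched as follows. This is only an illustration of the approach, not PyOpenCL's actual codegen; `mangle_kernel_name` is a hypothetical helper name, and the rule of prefixing an underscore for leading digits and reserved words is an assumption.

```python
import keyword
import re


def mangle_kernel_name(name: str) -> str:
    """Map a kernel name to a valid Python identifier (hypothetical helper)."""
    # Replace every character that cannot appear in an ASCII identifier with "_".
    mangled = re.sub(r"\W", "_", name, flags=re.ASCII)
    # Identifiers cannot be empty, start with a digit, or be a reserved word.
    if not mangled or mangled[0].isdigit() or keyword.iskeyword(mangled):
        mangled = "_" + mangled
    return mangled


print(mangle_kernel_name("pocl.add.i8"))  # pocl_add_i8
```

One caveat with this scheme: distinct kernel names such as "pocl.add.i8" and "pocl_add_i8" mangle to the same identifier, so a real implementation would need some way to detect or resolve collisions.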
Thanks for the report. Could you take a look at #779 and let me know if this addresses the issue?
While you're at it, could you put together a short test? pocl-specific is OK, just make sure to skip if the ICD is not pocl.
Here's the snippet I've adapted from the example in the documentation:
#!/usr/bin/env python3
#from dotenv import load_dotenv
import numpy as np
import pyopencl as cl
#load_dotenv() # Set up PoCL environment variables from .env file
a_np = np.array([1,2,3,4], dtype=np.int8)
b_np = np.array([1,2,3,4], dtype=np.int8)
platforms = cl.get_platforms()
platform_idx = -1
for i, p in enumerate(platforms):
    if p.get_info(cl.platform_info.NAME) == "Portable Computing Language":
        platform_idx = i
if platform_idx < 0:
    print("No PoCL platform found, skipping")
    exit()
devices = platforms[platform_idx].get_devices()
ctx = cl.Context([devices[0]])
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
b_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)
#%% OpenCL C kernel
prg = cl.Program(ctx, """
__kernel void sum(
    __global const char *a_g, __global const char *b_g, __global char *res_g)
{
  int gid = get_global_id(0);
  res_g[gid] = a_g[gid] + b_g[gid];
}
""").build()
res_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)
knl = prg.sum # Use this kernel object for repeated calls
knl(queue, a_np.shape, None, a_g, b_g, res_g)
#%% Built-in kernel
bikres_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)
bik_prg = cl.create_program_with_built_in_kernels(ctx, [devices[0]], ["pocl.add.i8"])
bik_prg.build()
for k in bik_prg.all_kernels():
    name = k.get_info(cl.kernel_info.FUNCTION_NAME)
    if name == "pocl.add.i8":
        bik_knl = k
bik_knl(queue, a_np.shape, None, a_g, b_g, bikres_g)
#%% Fetch results
res_np = np.empty_like(a_np)
cl.enqueue_copy(queue, res_np, res_g, wait_for=[], is_blocking=False)
bikres_np = np.empty_like(a_np)
cl.enqueue_copy(queue, bikres_np, bikres_g, wait_for=[], is_blocking=False)
queue.finish()
print(f"Values: {a_np} + {b_np} = {res_np} = {bikres_np}")
assert np.allclose(res_np, a_np + b_np)
assert np.allclose(bikres_np, a_np + b_np)
Okay, finally found a good time to check this locally. #779 does indeed make it possible to access the problematic kernels at all, although it is a bit on the awkward side:
bik_prg = cl.create_program_with_built_in_kernels(ctx, [devices[0]], ["pocl.add.i8"])
bik_prg.build()
for k in bik_prg.all_kernels():
    name = k.get_info(cl.kernel_info.FUNCTION_NAME)
    if name == "pocl.add.i8":
        bik_knl = k
bik_knl(queue, a_np.shape, None, a_g, b_g, bikres_g)
Fetching the kernel via the [...] But for the time being this already unblocks me, thanks!
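The lookup loop in the comment above could be wrapped in a small helper so that a missing kernel fails loudly instead of leaving `bik_knl` undefined. This is a minimal sketch: `find_kernel` is a hypothetical name, and it assumes the kernel objects expose the `function_name` attribute that PyOpenCL provides as a shortcut for `get_info(cl.kernel_info.FUNCTION_NAME)`.

```python
def find_kernel(program, name):
    """Return the kernel in `program` whose function name equals `name`.

    Hypothetical helper: `program` is expected to behave like a built
    pyopencl.Program, i.e. to provide all_kernels().
    """
    for kernel in program.all_kernels():
        if kernel.function_name == name:
            return kernel
    # No match: raise instead of silently continuing with no kernel.
    raise KeyError(f"no kernel named {name!r} in program")
```

With the objects from the snippet, usage would look like `bik_knl = find_kernel(bik_prg, "pocl.add.i8")` followed by the same `bik_knl(queue, a_np.shape, None, a_g, b_g, bikres_g)` call.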
Describe the bug
It is possible to create a program with builtin kernels that have periods in their names, but it does not seem to be possible to actually invoke them.
To Reproduce
Steps to reproduce the behavior:
- Set OCL_ICD_VENDORS=/path/to/pocl/build/ocl-vendors/pocl-tests.icd and POCL_BUILDING=1 (the latter is needed so that PoCL searches for things in the build directory instead of the usual system paths)
- Check with OCL_ICD_VENDORS=... POCL_BUILDING=1 clinfo that the Portable Computing Language platform shows up and lists at least one CPU device (if the platform shows but is empty and you are building from sources, you probably forgot the POCL_BUILDING=1)
- Check that pocl.add.i8 is one of the listed built-in kernels
Expected behavior
The example should run identically regardless of whether the OpenCL C kernel or the builtin kernel is used
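The clinfo checks from the reproduction steps can also be done from Python. The sketch below is an assumption-laden illustration, not part of the original report: `has_builtin_kernel` and `pocl_devices_with_kernel` are hypothetical helper names, and the code relies on OpenCL reporting a device's built-in kernels as a semicolon-separated string, which PyOpenCL exposes as `device.built_in_kernels`.

```python
def has_builtin_kernel(built_in_kernels: str, name: str) -> bool:
    """Check a semicolon-separated built-in kernel list for an exact name."""
    names = [k.strip() for k in built_in_kernels.split(";") if k.strip()]
    return name in names


def pocl_devices_with_kernel(name="pocl.add.i8"):
    """Return all PoCL devices that advertise the given built-in kernel."""
    import pyopencl as cl  # imported lazily so the helper above works without it

    found = []
    for platform in cl.get_platforms():
        if platform.name != "Portable Computing Language":
            continue
        for device in platform.get_devices():
            if has_builtin_kernel(device.built_in_kernels, name):
                found.append(device)
    return found
```

An empty result from `pocl_devices_with_kernel()` would suggest that either the PoCL platform is not visible (for example because the environment variables are not set) or the device does not offer the kernel.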
Environment (please complete the following information):