Newcomer questions about the project #397

lmtss · 2025-01-06T07:25:05Z

Hi! I think this project is amazing and really cool. I've made an effort to go through the code and examples, but there are still some parts I don't fully understand.

How to pass constant variables at runtime instead of compile-time?

For example, in Vulkan, I would use push constants. However, the CubeCL wgpu backend does not seem to use push constants (it appears to pass empty data by default).
But in the reduce_kernel example, it seems that scalar parameters can be passed to the kernel. How is this achieved in the backend?

reduce_kernel::launch_unchecked::<In, Out, Rd, Run>(
    client,
    config.cube_count,
    config.cube_dim,
    input.as_tensor_arg(config.line_size as u8),
    output.as_tensor_arg(1),
    ScalarArg::new(axis),
    settings,
);

How to save the compilation result of a kernel?

I think CubeCL's compilation approach is really cool, but runtime compilation might cause stuttering. Is it possible to cache the compilation results to a file, similar to how PSO (Pipeline State Object) caching is commonly done in rendering?

nathanielsimard · 2025-01-07T20:46:23Z

How to pass constant variables at runtime instead of compile-time?

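// A plain argument is a runtime value: it is supplied at launch (e.g. via ScalarArg).
#[cube(launch)]
fn arg_runtime(arg: u32) {
}

// An argument marked #[comptime] is fixed when the kernel is compiled.
#[cube(launch)]
fn arg_comptime(#[comptime] arg: u32) {
}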

How to save the compilation result of a kernel?

Yes, it is planned but not yet a priority. Feel free to submit a PR if you want to work on this.
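
For anyone picking this up, here is a minimal sketch of what a file-backed compilation cache could look like. It is not CubeCL code: `KernelDiskCache`, `get_or_compile`, and the `compile` callback are hypothetical names, and the example assumes the compiled kernel can be stored as plain bytes keyed by a hash of the kernel id and its source.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::PathBuf;

/// Hypothetical on-disk cache for compiled kernels (not part of CubeCL).
pub struct KernelDiskCache {
    dir: PathBuf,
}

impl KernelDiskCache {
    /// Creates the cache directory if it does not exist yet.
    pub fn new(dir: impl Into<PathBuf>) -> io::Result<Self> {
        let dir = dir.into();
        fs::create_dir_all(&dir)?;
        Ok(Self { dir })
    }

    /// Derives the cache file path from the kernel id and its source.
    /// Assumption: `DefaultHasher` is used only to keep the sketch short.
    /// Its output is not guaranteed stable across Rust releases, so a real
    /// cache would want a stable hash plus compiler and driver versions.
    fn path_for(&self, kernel_id: &str, source: &str) -> PathBuf {
        let mut hasher = DefaultHasher::new();
        kernel_id.hash(&mut hasher);
        source.hash(&mut hasher);
        self.dir.join(format!("{:016x}.bin", hasher.finish()))
    }

    /// Returns the cached bytes if present; otherwise runs `compile`,
    /// writes the result to disk, and returns it.
    pub fn get_or_compile<F>(
        &self,
        kernel_id: &str,
        source: &str,
        compile: F,
    ) -> io::Result<Vec<u8>>
    where
        F: FnOnce(&str) -> Vec<u8>,
    {
        let path = self.path_for(kernel_id, source);
        if let Ok(bytes) = fs::read(&path) {
            return Ok(bytes); // cache hit: skip compilation
        }
        let bytes = compile(source);
        fs::write(&path, &bytes)?;
        Ok(bytes)
    }
}
```

With this shape, the first launch of a kernel pays the compilation cost and later runs read the stored bytes back, which is roughly the role a PSO cache plays in a renderer.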

lmtss · 2025-01-09T16:33:03Z

Thank you for your reply.
I also wanted to mention that, based on my recent experience, I find it a bit inconvenient to specify the Runtime type when launching a kernel. My intuition is that the Runtime type should ideally only be specified when creating the Device.
Is this design intentional for a specific reason, or might there be plans to optimize this in the future?

nathanielsimard · 2025-01-11T17:07:32Z

@lmtss The device doesn't work like in wgpu; it doesn't contain the state or anything like that. It's simply an identifier for where you want to execute your kernels. The client is where you can actually call functions, but the client has a type dependency on the runtime.

We could extract the client methods into a trait, passing around something like Box<dyn CubeClient> instead. That would actually remove the Runtime trait bound from a lot of places, so yeah that could be something to work on.
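
A rough sketch of that idea in plain Rust, to show how the generic bound would disappear from call sites. Only the name `CubeClient` comes from the comment above; the trait methods, the two stand-in client types, `make_client`, and `run_reduce` are all hypothetical and do not reflect the actual CubeCL API.

```rust
/// Hypothetical object-safe client interface. The method set is invented
/// for illustration; a real trait would expose the actual client methods.
pub trait CubeClient {
    fn backend_name(&self) -> &'static str;
    fn launch(&self, kernel_name: &str, cube_count: u32) -> String;
}

/// Stand-in clients for two runtimes (placeholders, not real CubeCL types).
pub struct FakeWgpuClient;
pub struct FakeCudaClient;

impl CubeClient for FakeWgpuClient {
    fn backend_name(&self) -> &'static str {
        "wgpu"
    }
    fn launch(&self, kernel_name: &str, cube_count: u32) -> String {
        format!("{}: {} x{}", self.backend_name(), kernel_name, cube_count)
    }
}

impl CubeClient for FakeCudaClient {
    fn backend_name(&self) -> &'static str {
        "cuda"
    }
    fn launch(&self, kernel_name: &str, cube_count: u32) -> String {
        format!("{}: {} x{}", self.backend_name(), kernel_name, cube_count)
    }
}

/// The runtime is chosen once, here; callers only see the trait object.
pub fn make_client(name: &str) -> Option<Box<dyn CubeClient>> {
    match name {
        "wgpu" => Some(Box::new(FakeWgpuClient)),
        "cuda" => Some(Box::new(FakeCudaClient)),
        _ => None,
    }
}

/// No `R: Runtime` type parameter is needed at the call site.
pub fn run_reduce(client: &dyn CubeClient) -> String {
    client.launch("reduce_kernel", 4)
}
```

One trade-off to keep in mind: trait objects cannot have generic methods, so any client method that is generic over kernel argument types would need some form of type erasure, and every call goes through dynamic dispatch.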
