Newcomer questions about the project #397

Open
lmtss opened this issue Jan 6, 2025 · 3 comments

@lmtss

lmtss commented Jan 6, 2025

Hi! I think this project is amazing and really cool. I've made an effort to go through the code and examples, but there are still some parts I don't fully understand.

How to pass constant variables at runtime instead of compile-time?

For example, in Vulkan I would use push constants. However, the CubeCL wgpu backend does not seem to use push constants (it looks like it defaults to passing empty data).
But in the reduce_kernel example, scalar parameters do seem to be passed to the kernel. How is this achieved in the backend?

reduce_kernel::launch_unchecked::<In, Out, Rd, Run>(
    client,
    config.cube_count,
    config.cube_dim,
    input.as_tensor_arg(config.line_size as u8),
    output.as_tensor_arg(1),
    ScalarArg::new(axis),
    settings,
);

How to save the compilation result of a kernel?

I think CubeCL's compilation approach is really cool, but runtime compilation might cause stuttering. Is it possible to cache the compilation results to a file, similar to how PSO (Pipeline State Object) caching is commonly done in rendering?

@nathanielsimard
Member

How to pass constant variables at runtime instead of compile-time?

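use cubecl::prelude::*;

// Runtime argument: the value is supplied at launch time (e.g. `ScalarArg::new(value)`).
#[cube(launch)]
fn arg_runtime(arg: u32) {
}

// Comptime argument: the value is known when the kernel is compiled.
#[cube(launch)]
fn arg_comptime(#[comptime] arg: u32) {
}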
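
As a rough analogy only, the difference between the two kinds of argument resembles the difference between an ordinary function parameter and a const generic in plain Rust. The sketch below is not CubeCL code; `scale_runtime` and `scale_comptime` are hypothetical names chosen for illustration.

```rust
// Plain-Rust analogy only; this is NOT CubeCL code.

/// "Runtime" flavour: one compiled function, `factor` is supplied on every call.
fn scale_runtime(x: u32, factor: u32) -> u32 {
    x * factor
}

/// "Comptime" flavour: `FACTOR` is fixed when the function is instantiated,
/// so each distinct value yields its own specialized copy of the function.
fn scale_comptime<const FACTOR: u32>(x: u32) -> u32 {
    x * FACTOR
}

fn main() {
    println!("{}", scale_runtime(3, 4));
    println!("{}", scale_comptime::<4>(3));
}
```

The trade-off in the analogy is the usual one: the specialized version can be optimized for its fixed value, but every new value means another instantiation.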

How to save the compilation result of a kernel?

Yes, it is planned, but it is not yet a priority. Feel free to submit a PR if you want to work on this.
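
For anyone picking this up, the general shape of an on-disk compilation cache might look like the sketch below. It uses only the Rust standard library and does not reflect any existing CubeCL API: `compile_kernel`, `cache_path`, and `load_or_compile` are hypothetical names, and the "compilation" is a placeholder.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for an expensive kernel compilation step.
fn compile_kernel(source: &str) -> Vec<u8> {
    source.bytes().rev().collect()
}

/// Derives the cache file path from a hash of the kernel source.
/// Assumption: a real cache would use a hash that is stable across Rust
/// releases and would also key on the compiler version and target device.
fn cache_path(dir: &Path, source: &str) -> PathBuf {
    let mut hasher = DefaultHasher::new();
    source.hash(&mut hasher);
    dir.join(format!("{:016x}.bin", hasher.finish()))
}

/// Returns the compiled bytes and whether they came from the cache.
fn load_or_compile(dir: &Path, source: &str) -> io::Result<(Vec<u8>, bool)> {
    let path = cache_path(dir, source);
    if let Ok(bytes) = fs::read(&path) {
        return Ok((bytes, true)); // cache hit: skip compilation
    }
    let bytes = compile_kernel(source);
    fs::create_dir_all(dir)?;
    fs::write(&path, &bytes)?;
    Ok((bytes, false))
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join("kernel_cache_sketch");
    let (_, hit) = load_or_compile(&dir, "fn my_kernel() {}")?;
    println!("cache hit: {}", hit);
    Ok(())
}
```

The first call for a given source compiles and writes the file; later calls, including those in a new process, read it back instead.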

@lmtss
Author

lmtss commented Jan 9, 2025

Thank you for your reply.
I also wanted to mention that, based on my recent experience, I find it a bit inconvenient to specify the Runtime type when launching a kernel. My intuition is that the Runtime type should ideally only be specified when creating the Device.
Is this design intentional for a specific reason, or might there be plans to optimize this in the future?

@nathanielsimard
Member

@lmtss The device doesn't work like in wgpu; it doesn't contain the state or anything like that. It's simply an identifier for where you want to execute your kernels. The client is where you can actually call functions, but the client has a type dependency on the runtime.

We could extract the client methods into a trait, passing around something like Box<dyn CubeClient> instead. That would actually remove the Runtime trait bound from a lot of places, so yeah that could be something to work on.
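
A minimal sketch of that idea, written with stand-in types rather than CubeCL's actual ones: the `Runtime` and `CubeClient` traits here are simplified hypothetical versions, and `RuntimeA`, `RuntimeB`, `ClientA`, `ClientB`, `create_client`, and `launch` are invented for illustration.

```rust
/// Hypothetical object-safe trait holding the client methods.
trait CubeClient {
    fn backend_name(&self) -> &'static str;
    fn execute(&self, kernel: &str) -> String {
        format!("{} ran {}", self.backend_name(), kernel)
    }
}

/// Simplified stand-in for a runtime trait: each runtime has its own client type.
trait Runtime {
    type Client: CubeClient + 'static;
    fn client() -> Self::Client;
}

struct ClientA;
struct ClientB;

impl CubeClient for ClientA {
    fn backend_name(&self) -> &'static str { "backend-a" }
}
impl CubeClient for ClientB {
    fn backend_name(&self) -> &'static str { "backend-b" }
}

struct RuntimeA;
struct RuntimeB;

impl Runtime for RuntimeA {
    type Client = ClientA;
    fn client() -> ClientA { ClientA }
}
impl Runtime for RuntimeB {
    type Client = ClientB;
    fn client() -> ClientB { ClientB }
}

/// The runtime type is named once, when the client is created.
fn create_client<R: Runtime>() -> Box<dyn CubeClient> {
    Box::new(R::client())
}

/// Downstream code takes the trait object and needs no `R: Runtime` bound.
fn launch(client: &dyn CubeClient, kernel: &str) -> String {
    client.execute(kernel)
}

fn main() {
    let client = create_client::<RuntimeA>();
    println!("{}", launch(&*client, "reduce_kernel"));
}
```

The cost of this design is dynamic dispatch on each client call, and any client method that is generic over types would need reworking to stay object-safe.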
