
Support for CUDA Streams #1751

Open
michaeleisel opened this issue Feb 23, 2024 · 4 comments

Comments

@michaeleisel

I'm looking to make better use of my GPU when running multiple models in parallel. It'd be great if candle had some support for running multiple concurrent CUDA streams, whether by switching the stream used internally to CUDA's per-thread default stream, by letting the user run closures on different streams (e.g. with_stream(|| { })), or something else.

@michaeleisel
Author

Here's a discussion I've opened for it on cudarc: coreylowman/cudarc#209

@xnorpx
Contributor

xnorpx commented Aug 11, 2024

I was also looking into this. It looks like cudarc now supports creating a device with its own stream (device_with_stream). Have you tested this yet, @michaeleisel?

@michaeleisel
Author

I haven't, but it appears sufficient.

@LaurentMazare
Collaborator

Indeed, this seems to be sufficient: all cudarc operations now use the appropriate stream based on the cudarc::driver::CudaDevice. I've just merged #2532, which adds a Device::new_cuda_with_stream based on this.
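A minimal sketch of how this might be used, assuming the candle-core crate with CUDA support and that Device::new_cuda_with_stream(ordinal) binds a device handle to its own CUDA stream (as added in #2532). The per-thread workload and tensor shapes below are illustrative, not from the thread, and this requires a CUDA-capable machine to run:

```rust
// Sketch only: requires candle-core built with the CUDA feature and a GPU.
use candle_core::{Device, Tensor};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Two handles to GPU 0, each bound to its own stream, so kernels
    // launched through them can overlap on the device.
    let dev_a = Device::new_cuda_with_stream(0)?;
    let dev_b = Device::new_cuda_with_stream(0)?;

    let handles: Vec<_> = [dev_a, dev_b]
        .into_iter()
        .map(|dev| {
            std::thread::spawn(move || -> Result<Tensor, candle_core::Error> {
                // Stand-in for running a model: each thread's work is
                // queued on its device handle's stream.
                let x = Tensor::randn(0f32, 1f32, (1024, 1024), &dev)?;
                x.matmul(&x)
            })
        })
        .collect();

    for h in handles {
        h.join().expect("worker thread panicked")?;
    }
    Ok(())
}
```

Whether the two streams actually overlap depends on the kernels and the GPU's available resources; profiling with Nsight Systems would confirm the concurrency.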
