Can the underlying memory structure be in GPU device memory so that packets of data residing in GPU memory can be passed around like a Thrust vector?
Some quick research shows that FlatBuffers allows custom memory allocators, so we could allocate a flatbuffer in CUDA unified memory (see the sketch below). Since we run in threads, once the memory is on the GPU, any block in the same flowgraph would be able to access it there, which makes for some really cool GPU processing flows.

Just note that if we were to transfer the data between processes, we would have to pull the data off the GPU, serialize it, deserialize it, and then push it back to the GPU. This could lead to confusingly slow processing when multiple flowgraphs are connected over ZMQ.
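For illustration, here is a minimal sketch of that allocator idea, assuming the FlatBuffers C++ API (`flatbuffers::Allocator`, `FlatBufferBuilder`) and the CUDA runtime (`cudaMallocManaged`); the class name `CudaUnifiedAllocator` is hypothetical, not anything in the pmt codebase:

```cpp
#include <cstddef>
#include <cstdint>
#include <new>  // std::bad_alloc

#include <cuda_runtime.h>
#include "flatbuffers/flatbuffers.h"

// Hypothetical allocator that backs FlatBufferBuilder's internal buffer
// with CUDA unified (managed) memory, so the finished flatbuffer is
// addressable from both host and device code in the same process.
class CudaUnifiedAllocator : public flatbuffers::Allocator {
public:
    uint8_t* allocate(size_t size) override {
        void* ptr = nullptr;
        // Unified memory migrates between host and GPU on demand.
        if (cudaMallocManaged(&ptr, size) != cudaSuccess) {
            throw std::bad_alloc();
        }
        return static_cast<uint8_t*>(ptr);
    }

    void deallocate(uint8_t* p, size_t /*size*/) override {
        cudaFree(p);
    }
};

int main() {
    CudaUnifiedAllocator alloc;
    // Hand the allocator to the builder; everything it serializes
    // now lands in unified memory.
    flatbuffers::FlatBufferBuilder builder(/*initial_size=*/1024, &alloc);
    // ... build the PMT/flatbuffer as usual; builder.GetBufferPointer()
    // can then be dereferenced from a CUDA kernel in this process.
    return 0;
}
```

One caveat with this approach: the builder grows its buffer by reallocating, and each growth would trigger a `cudaMallocManaged`/`cudaFree` pair, so choosing a generous `initial_size` avoids repeated device allocations.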
mormj pushed a commit to mormj/pmt that referenced this issue on Nov 17, 2022.