wip: work so far on memory and module api #41
base: main
Conversation
Force-pushed from a1e3cce to 6947802
Left a few thoughts, haven't read the whole thing though :) Hope they are helpful.
@@ -6,10 +6,14 @@ license = "MIT"
description = "Software Defined Acclerated Compute"
homepage = "https://github.com/xertai/sdac"
edition = "2021"
build = "build-cuda-types.rs"
Hi, hope you don't mind some random comments since I've landed here with curiosity from Steve's latest post...
This to me is a sign that you probably want a sub-crate that can form a hermetic boundary around your code generation needs. Build files can be a bit tricky: one common failure mode I see is people ending up rebuilding everything on every build because they haven't annotated things correctly, and it helps a lot to have that isolated to a small crate.
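A minimal sketch of the kind of build.rs annotation that avoids those spurious rebuilds; the header path and env var here are placeholders, not taken from the PR. The point is to emit precise `cargo:rerun-if-*` directives so cargo only reruns the script when its actual inputs change:

```rust
// Hypothetical build.rs fragment: declare the script's real inputs so
// cargo can skip rerunning it when nothing relevant changed.
fn rerun_directives() -> Vec<String> {
    vec![
        // Rebuild only when the bindgen input header changes...
        "cargo:rerun-if-changed=wrapper.h".to_string(),
        // ...or when the CUDA install location is pointed elsewhere.
        "cargo:rerun-if-env-changed=CUDA_DIR".to_string(),
    ]
}

fn main() {
    for d in rerun_directives() {
        println!("{d}");
    }
}
```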
// Install NVIDIA CUDA prior to building the bindings with `cargo build`.
// https://docs.rs/bindgen/latest/bindgen/struct.Builder.html
fn main() {
    let cdir = std::env::var("CUDA_DIR").unwrap_or("/usr/local/cuda-11.8".to_string());
/usr/local/cuda-11.8 is going to change over time and over OS; see for instance https://askubuntu.com/questions/1375718/no-usr-local-cuda-directory-after-cuda-installation
I think you're going to want a helper function that probes some N places to find the actual path. Some folk working on build systems are also trying to make sure all contributing state to a build can be materialized to disk, so perhaps having it in a config file or some such would help too.
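One possible shape for that probing helper, as a sketch: `find_cuda_root` is a hypothetical name and the candidate paths are illustrative only; real installs vary by distro and CUDA version. An explicit `CUDA_DIR` override is checked first, then the well-known locations:

```rust
use std::env;
use std::path::PathBuf;

/// Probe a few well-known locations for a CUDA install, preferring an
/// explicit CUDA_DIR override when it points at an existing directory.
fn find_cuda_root(candidates: &[&str]) -> Option<PathBuf> {
    if let Ok(dir) = env::var("CUDA_DIR") {
        let p = PathBuf::from(dir);
        if p.is_dir() {
            return Some(p);
        }
    }
    candidates
        .iter()
        .map(|c| PathBuf::from(c))
        .find(|p| p.is_dir())
}

fn main() {
    let cuda = find_cuda_root(&[
        "/usr/local/cuda",      // common symlink to the default toolkit
        "/usr/local/cuda-11.8", // version-pinned install
        "/opt/cuda",            // e.g. Arch Linux
    ]);
    println!("cuda root: {cuda:?}");
}
```

Feeding the candidate list from a config file, as suggested above, would keep all contributing build state materialized on disk.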
.derive_eq(true)
.array_pointers_in_arguments(true)
.generate()
.unwrap();
You're going to want to add CargoCallbacks here, otherwise cargo won't know when to rebuild your bindings. https://docs.rs/bindgen/latest/bindgen/struct.CargoCallbacks.html
use std::sync::{Mutex};
use tarpc::{context};

pub fn cuGetErrorString(
Are these non-[idiomatic Rust names](https://rust-lang.github.io/api-guidelines/naming.html) required by some external standard? If not, I really encourage not using them.
Can I read mixedCase? Yes; but everywhere you interoperate with idiomatic code it will add friction and visual dissonance.
If they are from an external standard, I suggest a trivial trait + impl that thunks all the mixedCase names to snake_case, isolating it in one place in your codebase; then use snake_case everywhere else.
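A minimal sketch of that thunking trait, with a stand-in function in place of a real CUDA binding; all names here are hypothetical, not taken from the PR:

```rust
// Stand-in for an externally-mandated mixedCase API whose names we
// don't control (a real binding would be an extern "C" fn).
#[allow(non_snake_case)]
fn cuDeviceGetCount() -> i32 {
    2
}

/// One thin layer owns all the mixedCase names; the rest of the
/// codebase only ever sees the snake_case methods.
trait CudaApi {
    fn device_get_count(&self) -> i32;
}

struct Driver;

impl CudaApi for Driver {
    fn device_get_count(&self) -> i32 {
        cuDeviceGetCount()
    }
}

fn main() {
    println!("devices: {}", Driver.device_get_count());
}
```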
use std::mem::size_of;
use std::ffi::CString;
use std::sync::{Mutex};
use tarpc::{context};
there's a very nice import ordering that isn't quite popular enough to be put into rustfmt; but rust-analyzer supports: grouping by
- standard
- external crates
- in-repo crates
- this crate
Here this would look like

use std::ffi::CString;
use std::mem::size_of;
use std::sync::Mutex;

use futures::executor::block_on;
use tarpc::context;

use service::*;
error: CUresult,
pStr: *mut ::std::os::raw::c_char,
) -> CUresult {
let (strName, res) = block_on(
For this case there's no need for block_on: remove it and use .await at the end of .cuGetErrorString. tarpc should be giving you thread-safe behaviour rather than pointing into a transient memory space, though the use of a pointer across an RPC is really hard to reason about.
Concrete suggestion for improvement: don't do this across the RPC barrier. Construct the error object within the RPC server, not on the calling side.
Can I commend to you https://github.com/Rust-GPU/Rust-CUDA/blob/8a6cb734d21d5582052fa5b38089d1aa0f4d582f/crates/cust/src/error.rs#L98 as an alternative? That is, use the existing crate. I'm sure there's some reason you're not, but it's not obvious to me.
client
    .lock()
    .unwrap()
    .cuGetErrorName(context::current(), error),
Same comments here; but also: combining both these RPCs into one slightly fatter call is probably an optimisation. In the success case you can have a Result<...>::Ok(), and in the error case grab the string at the same time.
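A rough sketch of that combined call, using a plain function in place of the tarpc service so it stays self-contained; `CudaErrorInfo` and `cu_mem_alloc` are hypothetical names, not code from the PR. The error payload is assembled on the server side, so name, description, and code travel together in one reply and the client never dereferences a server pointer:

```rust
/// Fully-described error, built where the CUDA call happens.
#[derive(Debug, Clone, PartialEq)]
struct CudaErrorInfo {
    code: u32,
    name: String,
    description: String,
}

/// Stand-in for a combined RPC: on success return the value; on
/// failure return the error strings in the same round trip.
fn cu_mem_alloc(bytes: usize) -> Result<u64, CudaErrorInfo> {
    if bytes == 0 {
        Err(CudaErrorInfo {
            code: 1,
            name: "CUDA_ERROR_INVALID_VALUE".into(),
            description: "invalid argument".into(),
        })
    } else {
        Ok(0xdead_beef) // fake device pointer for the sketch
    }
}

fn main() {
    match cu_mem_alloc(0) {
        Ok(ptr) => println!("allocated at {ptr:#x}"),
        Err(e) => println!("{} ({}): {}", e.name, e.code, e.description),
    }
}
```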
unsafe {
    libc::memcpy(dstHost, data.as_ptr() as *const libc::c_void, ByteCount as usize);
}
I worry about the amount of unsafe - another plug for both doing the reification to safe constructs close to the actual CUDA calls, and for reusing the existing rust cuda bindings; I think they'll be very helpful
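One possible shape for that reification, as a sketch only: a small wrapper that owns the single unsafe block right next to the raw copy and checks its precondition first. `copy_to_host` is a hypothetical helper, not code from the PR:

```rust
/// Safe wrapper around the raw-copy pattern: callers hand us a mutable
/// slice, so the bounds check and the unsafe block live in one place.
fn copy_to_host(dst: &mut [u8], src: &[u8]) -> Result<(), String> {
    if dst.len() < src.len() {
        return Err(format!(
            "destination too small: {} < {}",
            dst.len(),
            src.len()
        ));
    }
    // SAFETY: both pointers are valid for src.len() bytes, and the
    // regions come from distinct slices, so they cannot overlap.
    unsafe {
        std::ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), src.len());
    }
    Ok(())
}

fn main() {
    let data = vec![1u8, 2, 3, 4];
    let mut host = [0u8; 8];
    copy_to_host(&mut host, &data).unwrap();
    println!("{:?}", &host[..4]);
}
```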
// giving out a read-only borrow here is safe because it is guaranteed no more mutable
// references will exist at this point or in the future.
unsafe { CUDA_CLIENT.as_ref().unwrap() }
}
I commend to you the once_cell crate. It will make this safer (literally, allowing removal of unsafe{}) and a bit smaller too.
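The same pattern is also available in the standard library as std::sync::OnceLock (stable since Rust 1.70). A sketch, with a String standing in for the real client type:

```rust
use std::sync::OnceLock;

// Replaces the `static mut` + unsafe accessor with a safe
// get-or-init; the String here is a stand-in for the RPC client.
static CUDA_CLIENT: OnceLock<String> = OnceLock::new();

fn client() -> &'static String {
    CUDA_CLIENT.get_or_init(|| {
        // Real code would connect to the RPC server here.
        "connected-client".to_string()
    })
}

fn main() {
    println!("client: {}", client());
}
```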
let str = CString::from_raw(name)
    .into_string()
    .expect("failed to convert name");
If you start using Result<> you'll be able to avoid having latent panics in your program.
If you define some sugar types you can also make things feel much more idiomatic; I don't know if that's compatible with your broader goals though.
But wouldn't it be nice to have something like
impl Display for CudaError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut name: MaybeUninit<*const c_char> = MaybeUninit::uninit();
        unsafe {
            // A sys-style binding that returns a Result<>
            cuGetErrorName(self.error, name.as_mut_ptr()).map_err(|_| fmt::Error)?;
        }
        // Borrowed, not owned: CStr::from_ptr does not take ownership
        let str = unsafe { CStr::from_ptr(name.assume_init()) };
        write!(f, "{}", str.to_str().map_err(|_| fmt::Error)?)
    }
}
And then you can use Result<T, CudaError> rather than having any awareness of the CUDA error system