A CUDA backend for Petalisp. The project is in an early experimental phase.
(quicklisp:quickload :petalisp-cuda)
;; use petalisp-cuda as petalisp:*backend*
(petalisp-cuda:use-cuda-backend)
;; when you want to the petalisp-cuda for a specific code section
;; with potential re-use of already allocated CUDA resources
(petalisp-cuda:with-cuda-backend
...)
;; with-cuda-backend-raii will free all CUDA resources
;; and destroy the backend after the calculations
(petalisp-cuda:with-cuda-backend-raii
...)
- CUDA toolkit
- CUDNN (optional)
With quicklisp installed, clone this repository to your local projects folder:
git clone [email protected]:theHamsta/petalisp-cuda.git
It is recommended to use my fork of cl-cuda that performs disk caching to ensure that the same kernel is not compiled multiple times.
git clone [email protected]:theHamsta/cl-cuda.git
- [x] sort indices for fasted dimensions
- [ ] hash CUDA array strides to recompile correctly with arrays not allocated in C-layout with alternative memory layouts
- [x] fix super-slow lisp->native, native->lisp calls (for Lisp arrays of element-type single-float/double-float)
- [x] compile kernels not only for fixed iteration spaces
- [x] implement reductions with CUDNN (TODO: automatic integration into compute graph)
- [x] implement convolutions with CUDNN (TODO: automatic integration into compute graph)
- [ ] __restrict__ kernel parameters
- [x] infer function parameters for generated __device__ functions
- [x] Load scalars over __constant__ memory when loading from GPU RAM instead of Host RAM (transfered as kernel arguments)
- [ ] fast interpolation using textures (?)
- [ ] half float, bfloat support
- [ ] optimized transposed memory accesses via cub.h
- [ ] optimized memory accesses via stride tricks
- [ ] static scheduler
- [x] use cudartc library for
fasterJIT compilation
You can activate logging in CUDNN by setting
export CUDNN_LOGDEST_DBG=stdout
export CUDNN_LOGINFO_DBG=1
or stderr
or filename.txt
cl-cuda will print a lot of stuff to stdout.
This behavior can be controlled with cl-cuda:*show-messages*
.
- Stephan Seitz ([email protected])
Copyright (c) 2020 Stephan Seitz ([email protected])
Licensed under the GPLv3 License.