petalisp-cuda

A CUDA backend for Petalisp. The project is in an early experimental phase.

Usage

(quicklisp:quickload :petalisp-cuda)

;; use petalisp-cuda as petalisp:*backend*
(petalisp-cuda:use-cuda-backend)

;; use petalisp-cuda only for a specific code section;
;; already allocated CUDA resources may be re-used
(petalisp-cuda:with-cuda-backend
    ...)

;; with-cuda-backend-raii will free all CUDA resources
;; and destroy the backend after the calculations
(petalisp-cuda:with-cuda-backend-raii
    ...)

Requirements

CUDA toolkit
CUDNN (optional)

Installation

With quicklisp installed, clone this repository to your local projects folder:

git clone git@github.com:theHamsta/petalisp-cuda.git

It is recommended to use my fork of cl-cuda that performs disk caching to ensure that the same kernel is not compiled multiple times.

git clone git@github.com:theHamsta/cl-cuda.git
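
The two clone steps above can be sketched as a small shell helper. It assumes Quicklisp is installed in its default location, ~/quicklisp, so that the local projects folder is ~/quicklisp/local-projects. QUICKLISP_HOME and clone_petalisp_cuda are hypothetical names used only for this illustration, and the git@github.com URLs are assumed from the repository's location on GitHub.

```shell
# Assumption: Quicklisp uses its default install directory (~/quicklisp).
# QUICKLISP_HOME is a hypothetical override, not a variable Quicklisp reads.
LOCAL_PROJECTS="${QUICKLISP_HOME:-$HOME/quicklisp}/local-projects"

# Hypothetical helper: clone both repositories into the local projects folder.
clone_petalisp_cuda() {
    mkdir -p "$LOCAL_PROJECTS" || return 1
    (
        # The subshell keeps the caller's working directory unchanged.
        cd "$LOCAL_PROJECTS" || exit 1
        git clone git@github.com:theHamsta/petalisp-cuda.git &&
        git clone git@github.com:theHamsta/cl-cuda.git
    )
}
```

After running clone_petalisp_cuda, (quicklisp:quickload :petalisp-cuda) in a fresh Lisp image should find both checkouts, since Quicklisp looks in local-projects before its own distribution.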

TODOs

[x] sort indices for fastest dimensions
[ ] hash CUDA array strides so that kernels are recompiled correctly for arrays whose memory layout is not C-layout
[x] fix super-slow lisp->native, native->lisp calls (for Lisp arrays of element-type single-float/double-float)
[x] compile kernels not only for fixed iteration spaces
[x] implement reductions with CUDNN (TODO: automatic integration into compute graph)
[x] implement convolutions with CUDNN (TODO: automatic integration into compute graph)
[ ] __restrict__ kernel parameters
[x] infer function parameters for generated __device__ functions
[x] load scalars via __constant__ memory when loading from GPU RAM instead of host RAM (transferred as kernel arguments)
[ ] fast interpolation using textures (?)
[ ] half float, bfloat support
[ ] optimized transposed memory accesses via cub.h
[ ] optimized memory accesses via stride tricks
[ ] static scheduler
[x] use the NVRTC library (CUDA runtime compilation) for ~~faster~~ JIT compilation

Debugging

You can activate logging in CUDNN by setting the following environment variables:

export CUDNN_LOGDEST_DBG=stdout
export CUDNN_LOGINFO_DBG=1

Instead of stdout, CUDNN_LOGDEST_DBG can also be set to stderr or to a file name such as filename.txt.
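
As a sketch of the file-based variant, the snippet below sends the CUDNN log to a file for the current shell session. The file name cudnn.log is only an example; any writable path should work.

```shell
# Write CUDNN API logging to a file instead of the terminal.
# cudnn.log is a hypothetical file name chosen for this example.
export CUDNN_LOGDEST_DBG=cudnn.log
export CUDNN_LOGINFO_DBG=1

# To switch logging off again in this shell, remove both variables:
# unset CUDNN_LOGDEST_DBG CUDNN_LOGINFO_DBG
```

Logging to a file keeps the CUDNN output separate from the messages that cl-cuda prints to stdout.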

cl-cuda prints many messages to stdout. This behavior can be controlled with cl-cuda:*show-messages* (set it to nil to silence them).
