cuTimeWarp

CUDA C++ implementations of Dynamic Time Warping and SoftDTW loss function for time series machine learning.

Based on algorithms described in:

Building

This project uses a Makefile to coordinate separate compilation of the CUDA kernels and the C++ code, and is tested on Ubuntu Linux. Running make with no arguments lists the available rules:

$ make

Available rules:

build               Build binaries
clean               Delete binaries
fmt                 Format the code with clang-format
plot                Run python script to generate plots
report              Compile the PDF report
run                 Run experiments
run_multi           Run multi-distance experiments
test                Build and run unit tests

To compile the kernels and the test programs, use the make build command.

All C++ / CUDA source code is found in the src/ folder.

Library Dependencies

In addition to the CUDA runtime and cuBLAS (tested with CUDA 11.2), the programs link against BLAS for the CPU implementations, so a BLAS implementation such as OpenBLAS must be available on the machine.

Running

The three programs to use for running comparative performance experiments are:

  • bin/soft_dtw_perf_cpu for timing CPU performance
  • bin/soft_dtw_perf_multi for timing GPU performance
  • bin/soft_dtw_perf_tiled for timing the tiled kernel on GPU (for long time series, length > 1024)

Each program accepts as arguments either the name of a file containing space-delimited data (see data/ECG200/ECG200_ALL.txt) or the word random followed by a time series length and count. It computes the Soft-DTW dissimilarity between all pairs of time series in the batch and then prints one line per kernel, in four columns:

  • Kernel function name
  • The input time series length (number of columns per row)
  • The input time series count (number of rows)
  • The execution time in microseconds
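
As a rough illustration of the quantity being timed, the following is a minimal CPU sketch of the Soft-DTW recurrence for a single pair of univariate series. It assumes a squared-difference local cost and a smoothing parameter gamma > 0. The names softmin3 and soft_dtw are hypothetical and are not taken from src/; the project's kernels operate on whole batches and may be organized quite differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Smoothed minimum of three values: -gamma * log(sum_k exp(-v_k / gamma)).
// The hard minimum is subtracted first so the exponentials stay in range.
double softmin3(double a, double b, double c, double gamma) {
    const double m = std::min({a, b, c});
    if (std::isinf(m)) return m;  // all three inputs are +infinity
    const double s = std::exp(-(a - m) / gamma) +
                     std::exp(-(b - m) / gamma) +
                     std::exp(-(c - m) / gamma);
    return m - gamma * std::log(s);
}

// Soft-DTW dissimilarity between two univariate series (illustrative only).
// R is an (m + 1) x (n + 1) row-major table with R[0][0] = 0 and every other
// border cell set to +infinity.
double soft_dtw(const std::vector<double>& x, const std::vector<double>& y,
                double gamma) {
    assert(gamma > 0.0 && !x.empty() && !y.empty());
    const std::size_t m = x.size();
    const std::size_t n = y.size();
    const std::size_t w = n + 1;  // row stride of R
    const double inf = std::numeric_limits<double>::infinity();

    std::vector<double> R((m + 1) * w, inf);
    R[0] = 0.0;
    for (std::size_t i = 1; i <= m; ++i) {
        for (std::size_t j = 1; j <= n; ++j) {
            const double d = x[i - 1] - y[j - 1];  // assumed squared-difference cost
            R[i * w + j] = d * d + softmin3(R[(i - 1) * w + j],
                                            R[(i - 1) * w + (j - 1)],
                                            R[i * w + (j - 1)], gamma);
        }
    }
    return R[m * w + n];
}
```

With a small gamma the soft minimum approaches the hard minimum, so soft_dtw({0.0, 1.0, 2.0}, {0.0, 2.0}, 1e-3) is close to the classic DTW cost of 1 for these inputs. Unlike DTW, the result can be negative: two identical series {0.0, 0.0} with gamma = 1 give -log(3).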

Example:

$ ./bin/soft_dtw_perf_multi
Usage: ./bin/soft_dtw_perf_multi [INPUT_FILENAME] | random [length] [count]

$ ./bin/soft_dtw_perf_multi ./data/ECG200/ECG200_ALL.txt
Data file ./data/ECG200/ECG200_ALL.txt contains 200 time series of length 96
sq_euclid_dist_multi 96 200 515037
softdtw_cuda_naive_multi 96 200 264987
softdtw_cuda_naive_multi_bw_80 96 200 235089
softdtw_cuda_naive_multi_bw_60 96 200 168621
softdtw_cuda_naive_multi_bw_40 96 200 83501
softdtw_cuda_naive_multi_bw_20 96 200 51338
softdtw_cuda_stencil_multi 96 200 100990
softdtw_cuda_stencil_multi_80 96 200 100408
softdtw_cuda_stencil_multi_60 96 200 100844
softdtw_cuda_stencil_multi_40 96 200 101215
softdtw_cuda_stencil_multi_40 96 200 100436
softdtw_cuda_stencil_multi_20 96 200 100647
convert_diagonal_multi 96 200 332664
softdtw_cuda_diagonal_multi 96 200 149158

$ ./bin/soft_dtw_perf_multi random 100 100
sq_euclid_dist_multi 100 100 335883
softdtw_cuda_naive_multi 100 100 61576
softdtw_cuda_naive_multi_bw_80 100 100 52272
softdtw_cuda_naive_multi_bw_60 100 100 32211
softdtw_cuda_naive_multi_bw_40 100 100 18919
softdtw_cuda_naive_multi_bw_20 100 100 18725
softdtw_cuda_stencil_multi 100 100 26558
softdtw_cuda_stencil_multi_80 100 100 25803
softdtw_cuda_stencil_multi_60 100 100 31000
softdtw_cuda_stencil_multi_40 100 100 26120
softdtw_cuda_stencil_multi_40 100 100 25804
softdtw_cuda_stencil_multi_20 100 100 30992
convert_diagonal_multi 100 100 87427
softdtw_cuda_diagonal_multi 100 100 43893
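
For post-processing, each result line shown above can be read as four whitespace-separated fields. The sketch below is one possible way to parse a line in C++. TimingRecord and parse_timing_line are hypothetical names introduced here for illustration and are not part of the project. Lines that do not match the four-column shape, such as the "Data file ..." banner, are rejected.

```cpp
#include <sstream>
#include <string>

// One row of benchmark output: kernel name, series length, series count,
// and execution time in microseconds.
struct TimingRecord {
    std::string kernel;
    long length = 0;
    long count = 0;
    long long microseconds = 0;
};

// Parses one output line into `out`. Returns false, leaving `out` untouched,
// when the line does not consist of exactly one name and three integers.
bool parse_timing_line(const std::string& line, TimingRecord& out) {
    std::istringstream in(line);
    TimingRecord tmp;
    if (!(in >> tmp.kernel >> tmp.length >> tmp.count >> tmp.microseconds)) {
        return false;
    }
    std::string extra;
    if (in >> extra) {
        return false;  // more than four fields, so not a result row
    }
    out = tmp;
    return true;
}
```

Reading the program's output line by line with std::getline and keeping only the lines for which parse_timing_line returns true gives one record per kernel.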

TODO List

  • Implement naive DTW on CPU
  • Implement soft DTW on CPU
  • Choose benchmarking datasets
  • Implement pairwise squared Euclidean distance on CPU
  • Implement soft DTW gradient on CPU
  • Implement soft DTW barycenter estimation on CPU
  • Implement naive soft DTW in CUDA
  • Implement pairwise squared Euclidean distance in CUDA
  • Implement soft DTW gradient in CUDA
  • Implement soft DTW barycenter estimation in CUDA
  • Tiling
  • Shared memory stencil
  • Sakoe-Chiba bands
  • Contiguous diagonal-major array storage layout
  • Run benchmark experiments
  • Analysis of experiment results
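
The Sakoe-Chiba item in the list above refers to restricting the warping path to a band around the main diagonal of the cost matrix. As a minimal sketch of the textbook constraint |i - j| <= bandwidth for equal-length series, the following helpers show how a band reduces the number of cells to fill. The function names are hypothetical, and this is not a description of how the project's kernels (or the numbers in their _bw_ suffixes) define the bandwidth.

```cpp
#include <cstddef>

// True when cell (i, j) lies within `bandwidth` cells of the main diagonal.
bool in_sakoe_chiba_band(std::size_t i, std::size_t j, std::size_t bandwidth) {
    const std::size_t diff = (i > j) ? (i - j) : (j - i);
    return diff <= bandwidth;
}

// Counts the cells of an m x n cost matrix that fall inside the band, i.e.
// the cells a banded DTW or Soft-DTW pass would need to compute.
std::size_t count_band_cells(std::size_t m, std::size_t n,
                             std::size_t bandwidth) {
    std::size_t cells = 0;
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            if (in_sakoe_chiba_band(i, j, bandwidth)) {
                ++cells;
            }
        }
    }
    return cells;
}
```

For a 4 x 4 matrix, a bandwidth of 1 keeps 10 of the 16 cells, and a bandwidth of 3 keeps all of them.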