-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Why is clang(CUDA) + OpenMP not supported? #639
Comments
It was already disabled before this PR. I do not remember all the details but as far as I remember, it is simply not supported by clang to do both at once. |
If you want to find out the reason, you can simply remove those lines from the CMakeLists and create a PR. The CI will tell you the truth. Maybe it is supported now. |
it seems, clang disables OpenMP, when clang(cuda) is involved. So no |
From a quick test,
My suggestion as well. |
I will test with the new versions next week, when our Intel/Nvidia developer system is up. |
Using clang7 and CUDA10.0 I got a known issue:
libdevice is where clang expects it.
However, as you can see, there is only that 1.0 thing. As of CUDA 9 it no more ships different libdevices. See also this forum thread on NVidia. So I installed CUDA9.2 via spack (spack silently has overwritten external system libraries again^^). Then it takes the libdevice, however, the OpenMP and CUDA(clang) back-end still does not play together. OpenMP is somehow disabled by clang when CUDA(clang) is used in conjunction. |
Ok, CUDA 10 is not yet in clang trunk since Friday ;-) |
12 hours ago they pushed the CUDA 10 commit :) |
Yep, I mailed the author :) |
There has been an answer on the linked issue. We may want to reevaluate the CUDA (clang) + OpenMP support for clang 8.0. |
I just tried with clang 8.0 and I get: |
Sorry for not replying earlier. Just can report the last state. clang compilationspack load [email protected]
SOURCE_ROOT=$HOME/sources
LLVM_INSTALL_DIR=$HOME/software/llvm-dev-gcc5/
# get llvm, use different clang source
git clone --single-branch -b patched-upstream https://github.com/clang-ykt/llvm
cd llvm/tools
# rm -r clang # cleanup
# get clang dev
git clone --single-branch -b patched-upstream https://github.com/clang-ykt/clang
cd ..
mkdir -p build
cd build
cmake -DLLVM_BUILD_TESTS=OFF -DLLVM_PARALLEL_COMPILE_JOBS=8 -DLLVM_PARALLEL_LINK_JOBS=8 -DLLVM_BUILD_UTILS=OFF -DCMAKE_INSTALL_PREFIX=$LLVM_INSTALL_DIR -DCMAKE_BUILD_TYPE=RelWithDebInfo ../
make -j 8
make install Alpaka compilation module load [email protected]
module load [email protected]
cmake -DCMAKE_CXX_COMPILER=$HOME/software/llvm-dev-gcc5/bin/clang-9 -DALPAKA_CUDA_COMPILER=clang ..
make axpy Errorsmake axpy develop ⬆ ✭ ◼
[ 0%] Building CXX object test/CMakeFiles/CatchMain.dir/CatchMain.cpp.o
[ 0%] Linking CXX static library libCatchMain.a
[ 0%] Built target CatchMain
[ 50%] Building CXX object test/common/CMakeFiles/common.dir/src/Dummy.cpp.o
[ 50%] Linking CXX static library libcommon.a
[ 50%] Built target common
[100%] Building CXX object test/integ/axpy/CMakeFiles/axpy.dir/src/axpy.cpp.o
In file included from <built-in>:1:
In file included from /home/mwerner/software/llvm-dev-gcc5/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:327:
/opt/spack-modules/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-10.0.130-2ncant54wqot3prhetq4yyqybnynppi4/include/crt/math_functions.hpp:671:10: error: no matching function for call to 'lgammaf'
return lgammaf(a);
^~~~~~~
/usr/include/x86_64-linux-gnu/bits/math-finite.h:319:37: note: candidate function not viable: call to __host__ function from __device__ function
__extern_always_inline float __NTH (lgammaf (float __d))
^
In file included from <built-in>:1:
In file included from /home/mwerner/software/llvm-dev-gcc5/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:327:
/opt/spack-modules/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-10.0.130-2ncant54wqot3prhetq4yyqybnynppi4/include/crt/math_functions.hpp:676:10: error: no matching function for call to 'tgammaf'
return tgammaf(a);
^~~~~~~
/usr/include/x86_64-linux-gnu/bits/math-finite.h:487:37: note: candidate function not viable: call to __host__ function from __device__ function
__extern_always_inline float __NTH (tgammaf (float __d))
^
In file included from /home/mwerner/cuda-workspace/alpaka/test/integ/axpy/src/axpy.cpp:15:
In file included from /home/mwerner/cuda-workspace/alpaka/include/alpaka/alpaka.hpp:22:
In file included from /home/mwerner/cuda-workspace/alpaka/include/alpaka/acc/AccCpuSerial.hpp:38:
/home/mwerner/cuda-workspace/alpaka/include/alpaka/dev/DevCpu.hpp:68:21: error: explicitly defaulted copy constructor is implicitly deleted [-Werror,-Wdefaulted-function-deleted]
DevCpuImpl(DevCpuImpl const &) = default; [...] In file included from /home/mwerner/cuda-workspace/alpaka/test/integ/axpy/src/axpy.cpp:15:
In file included from /home/mwerner/cuda-workspace/alpaka/include/alpaka/alpaka.hpp:29:
In file included from /home/mwerner/cuda-workspace/alpaka/include/alpaka/acc/AccGpuCudaRt.hpp:31:
In file included from /home/mwerner/cuda-workspace/alpaka/include/alpaka/rand/RandCuRand.hpp:27:
In file included from /opt/spack-modules/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-10.0.130-2ncant54wqot3prhetq4yyqybnynppi4/include/curand_kernel.h:70:
/opt/spack-modules/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-10.0.130-2ncant54wqot3prhetq4yyqybnynppi4/include/curand_philox4x32_x.h:81:9: error: macro name is a reserved identifier
[-Werror,-Wreserved-id-macro]
#define CURAND_PHILOX4X32_X__H_
^
/opt/spack-modules/linux-ubuntu16.04-x86_64/gcc-5.4.0/cuda-10.0.130-2ncant54wqot3prhetq4yyqybnynppi4/include/curand_philox4x32_x.h:108:23: error: use of old-style cast
[-Werror,-Wold-style-cast]
unsigned int nlo = (unsigned int)(n); @SimeonEhrig reported that clang-9 with a simple CUDA program was working. I'll talk with him about this issue... |
Thank you for trying this!
|
thanks for the quick response. I will try the new version of clang. |
ok, new version of alpaka and clang helped except for the math stuff, but adding missing functions to the aforementioned header works ... now I just get a glibc linker issue, our Ubuntu 16.04 probably has too old glibc, where
Well, let's call it a day... |
@tdd11235813 |
Thanks for digging into this everyone! @SimeonEhrig can you pls post your @tdd11235813 can you pls link the upstream LLVM bug report regarding the missing define in the |
Here is my source-code and the test-environment: source-code: #include <iostream>
#include <omp.h>
__global__ void foo(float * v, unsigned int start, unsigned int size, float id){
for(unsigned int i = start; i < start+size; ++i){
v[i] = lgammaf(id);
}
}
int main(int argc, char const *argv[])
{
const unsigned int number_of_threads = 14;
const unsigned int vector_size = 4;
float v_host[number_of_threads*vector_size];
float * v_device;
cudaMalloc( (void**) &v_device, sizeof(float)*number_of_threads*vector_size);
// try to generate a race condition to show that OpenMP is works
#pragma omp parallel for
for(int i = 0; i < number_of_threads; ++i)
{
std::cout << "Hello my id is: " << omp_get_thread_num() << std::endl;
}
#pragma omp parallel for
for(unsigned int i = 0; i < number_of_threads; ++i){
foo<<<1, 1>>>(v_device, i*vector_size, vector_size, static_cast<float>(omp_get_thread_num()+1) );
}
cudaMemcpy(v_host, v_device, sizeof(float)*number_of_threads*vector_size, cudaMemcpyDeviceToHost);
for(unsigned int i = 0; i < number_of_threads; ++i){
for(unsigned int j = 0; j < vector_size; ++j){
std::cout << v_host[i*vector_size+j] << " ";
}
std::cout << std::endl;
}
return 0;
} makefile: id_vector : id_vector_omp.cu
clang++ -std=c++11 id_vector_omp.cu -o id_vector --cuda-gpu-arch=sm_30 -L/usr/local/cuda/lib64 -lcudart_static -ldl -lrt -pthread -fopenmp singularity container: Bootstrap: docker
From: nvidia/cuda:10.0-devel-ubuntu18.04
%help
This container is based on the CUDA 10.0 docker container with Ubuntu 18.04 environment: https://hub.docker.com/r/nvidia/cuda/
%setup
%files
%labels
Maintainer Simeon Ehrig
Version 1.0
%environment
export PATH=$PATH:/usr/local/cuda/bin/
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
%post
apt update
apt install -y git nano wget gdb cmake make python2.7 zlib1g-dev libomp-dev I tested it with the current dev branch of clang/llvm (9.0-dev c64d73cce240148ea5f38d70f1851373036e716d - https://clang.llvm.org/get_started.html) and clang-ykt (Matthias his instruction). It works with the vanila Clang. The clang-ykt throws following error: clang++ -std=c++11 id_vector_omp.cu -o id_vector --cuda-gpu-arch=sm_30 -L/usr/local/cuda/lib64 -lcudart_static -ldl -lrt -pthread -fopenmp
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:61:59: error: no
matching function for call to '__nv_brev'
__DEVICE__ unsigned int __brev(unsigned int __a) { return __nv_brev(__a); }
^~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:52:16: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ int __nv_brev(int __a);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:63:10: error: no
matching function for call to '__nv_brevll'
return __nv_brevll(__a);
^~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:53:22: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ long long __nv_brevll(long long __a);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:71:10: error: no
matching function for call to '__nv_byte_perm'
return __nv_byte_perm(__a, __b, __c);
^~~~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:54:16: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ int __nv_byte_perm(int __a, int __b, int __c);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:73:40: error: no
matching function for call to '__nv_clz'
__DEVICE__ int __clz(int __a) { return __nv_clz(__a); }
^~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:59:16: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ int __nv_clz(int __a);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:74:48: error: no
matching function for call to '__nv_clzll'
__DEVICE__ int __clzll(long long __a) { return __nv_clzll(__a); }
^~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:60:16: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ int __nv_clzll(long long __a);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:75:45: error: no
matching function for call to '__nv_fast_cosf'
__DEVICE__ float __cosf(float __a) { return __nv_fast_cosf(__a); }
^~~~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:144:18: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ float __nv_fast_cosf(float __a);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:86:10: error: no
matching function for call to '__nv_dadd_rd'
return __nv_dadd_rd(__a, __b);
^~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:73:19: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ double __nv_dadd_rd(double __a, double __b);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:89:10: error: no
matching function for call to '__nv_dadd_rn'
return __nv_dadd_rn(__a, __b);
^~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:74:19: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ double __nv_dadd_rn(double __a, double __b);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:92:10: error: no
matching function for call to '__nv_dadd_ru'
return __nv_dadd_ru(__a, __b);
^~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:75:19: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ double __nv_dadd_ru(double __a, double __b);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:95:10: error: no
matching function for call to '__nv_dadd_rz'
return __nv_dadd_rz(__a, __b);
^~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:76:19: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ double __nv_dadd_rz(double __a, double __b);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:98:10: error: no
matching function for call to '__nv_ddiv_rd'
return __nv_ddiv_rd(__a, __b);
^~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:77:19: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ double __nv_ddiv_rd(double __a, double __b);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:101:10: error: no
matching function for call to '__nv_ddiv_rn'
return __nv_ddiv_rn(__a, __b);
^~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:78:19: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ double __nv_ddiv_rn(double __a, double __b);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:104:10: error: no
matching function for call to '__nv_ddiv_ru'
return __nv_ddiv_ru(__a, __b);
^~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:79:19: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ double __nv_ddiv_ru(double __a, double __b);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:107:10: error: no
matching function for call to '__nv_ddiv_rz'
return __nv_ddiv_rz(__a, __b);
^~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:80:19: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ double __nv_ddiv_rz(double __a, double __b);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:110:10: error: no
matching function for call to '__nv_dmul_rd'
return __nv_dmul_rd(__a, __b);
^~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:81:19: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ double __nv_dmul_rd(double __a, double __b);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:113:10: error: no
matching function for call to '__nv_dmul_rn'
return __nv_dmul_rn(__a, __b);
^~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:82:19: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ double __nv_dmul_rn(double __a, double __b);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:116:10: error: no
matching function for call to '__nv_dmul_ru'
return __nv_dmul_ru(__a, __b);
^~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:83:19: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ double __nv_dmul_ru(double __a, double __b);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:119:10: error: no
matching function for call to '__nv_dmul_rz'
return __nv_dmul_rz(__a, __b);
^~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:84:19: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ double __nv_dmul_rz(double __a, double __b);
^
In file included from <built-in>:1:
In file included from /home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_runtime_wrapper.h:162:
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_device_functions.h:122:10: error: no
matching function for call to '__nv_double2float_rd'
return __nv_double2float_rd(__a);
^~~~~~~~~~~~~~~~~~~~
/home/sehrig/projects/clang-ykt/install/lib/clang/9.0.0/include/__clang_cuda_libdevice_declares.h:85:18: note:
candidate function not viable: call to __host__ function from __device__ function
__DEVICE__ float __nv_double2float_rd(double __a);
^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated when compiling for host.
makefile:2: recipe for target 'id_vector' failed
make: *** [id_vector] Error 1 |
so you did not get the lgammaf error or did you fix the clang cuda header before as well? |
No, I didn't fix the header before. I built the clang/llvm from the original Github repo and compiled the source code with the Makefile you see in the previous comment. What is special about clang-ykt? I know the project as fork of cling, which implemented the OpenMP 4.0 GPU offloading for CUDA. But this functionality was merged in the upstream. So, what is the special purpose now? |
I can't remember, and if upstream is working, then upstream is fine. I'll test the whole thing with upstream's version again. |
current WIP: diff --git a/alpakaConfig.cmake b/alpakaConfig.cmake
index 9bd4732..0d2d0dc 100644
--- a/alpakaConfig.cmake
+++ b/alpakaConfig.cmake
@@ -864,6 +864,12 @@ ELSE()
ELSEIF(CMAKE_CXX_COMPILER_ID MATCHES "Clang")
LIST(APPEND _ALPAKA_COMPILE_OPTIONS_PUBLIC "-ftemplate-depth=512")
ENDIF()
+
+ # clang 9.0 fails at linker level due to missing libs
+ IF(CMAKE_CXX_COMPILER_ID MATCHES "Clang" AND NOT CMAKE_CXX_COMPILER_VERSION VERSION_LESS 9.0)
+ LIST(APPEND _ALPAKA_LINK_LIBRARIES_PUBLIC "-lstdc++;m")
+ ENDIF()
+
ENDIF()
#------------------------------------------------------------------------------- However, runtime GPU tests fail now. /home/mwerner/cuda-workspace/alpaka/test/integ/axpy/src/axpy.cpp:221: FAILED:
{Unknown expression after the reported line}
due to unexpected exception with message:
/home/mwerner/cuda-workspace/alpaka/include/alpaka/queue/QueueCudaRtAsync.hpp
(340) 'cudaStreamSynchronize_ptsz( queue.m_spQueueImpl->m_CudaQueue)' A
previous CUDA call (not this one) set the error :
'cudaErrorLaunchOutOfResources': 'too many resources requested for launch'!
===============================================================================
test cases: 1 | 1 failed
assertions: 3 | 2 passed | 1 failed Tested with CUDA 10.0.130 and CUDA 9.2.88 (requires Werror to be disabled due to philox/curand conversion warnings) (with and without CPU OpenMP backends). PS: spack load [email protected]
SOURCE_ROOT=$HOME/sources
LLVM_INSTALL_DIR=$HOME/software/llvm-dev-gcc5/
cd $SOURCE_ROOT
git clone https://github.com/llvm/llvm-project.git llvm
cd llvm
mkdir -p build
cd build
cmake -DLLVM_ENABLE_PROJECTS=clang -G "Unix Makefiles" -DLLVM_BUILD_TESTS=OFF -DLLVM_PARALLEL_COMPILE_JOBS=12 -DLLVM_PARALLEL_LINK_JOBS=12 -DLLVM_BUILD_UTILS=OFF -DCMAKE_INSTALL_PREFIX=$LLVM_INSTALL_DIR -DCMAKE_BUILD_TYPE=RelWithDebInfo ../llvm
make -j 12
make install |
current WIP (clang-cuda with upstream clang):
Resource UsageResource usage of axpy Alpaka test. clang-9, Undefined
|
The bug still exists with |
@SimeonEhrig Do you know whether this is still a problem? |
My example is compiling and running with |
PR #2099 allows to create the test cases. But it needs some extra work to enable the combination, such like removing this lines of code in the CMakeLists.txt: alpaka/cmake/alpakaCommon.cmake Line 389 in 5a889b5
|
what was the reason for disabling CUDA(clang) + OpenMP? Does not clang support this or was this alpaka related?
( corresponding PR )
The text was updated successfully, but these errors were encountered: