Compiling on K80, executing on P100 #233
Comments
You cannot compile code for one GPU architecture and run it on a different GPU architecture.
I don't think I understand. I'm working on a cluster where several GPU machines are available. Some of the machines have K80 GPUs, others P100,
It should work if you set
According to the log, I already built with sm_70, which the V100 needs. Why can't I run on the V100?
Could you show which log you mean?
arch=compute_70,code=sm_70;-gencode
As the log says, this is what PyTorch is built with. This may not be what detectron2 is built with.
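The distinction in the reply above can be checked mechanically. The following is a minimal sketch, not part of detectron2 or PyTorch: it parses `-gencode` flags like the fragment quoted above and applies CUDA's general compatibility rules (an `sm_XY` binary runs on devices with the same major version and an equal or higher minor version; `compute_XY` PTX can be JIT-compiled for any equal or higher capability). The helper names `parse_gencode` and `can_run` are hypothetical, the `k80_only` flag string is an assumed example of what an extension built on a K80 machine might contain, and the bracketed form `code=[sm_70,compute_70]` is not handled.

```python
import re


def parse_gencode(flags):
    """Return (kind, major, minor) for every code= target in an nvcc flag string."""
    targets = []
    for kind, digits in re.findall(r"code=(sm|compute)_(\d+)", flags):
        # "70" -> (7, 0), "37" -> (3, 7): the last digit is the minor version.
        targets.append((kind, int(digits[:-1]), int(digits[-1])))
    return targets


def can_run(flags, device):
    """Check whether code built with `flags` has a usable kernel image for `device`."""
    major, minor = device
    for kind, t_major, t_minor in parse_gencode(flags):
        # Compiled binary (cubin): same major version, device minor >= target minor.
        if kind == "sm" and t_major == major and t_minor <= minor:
            return True
        # Embedded PTX: can be JIT-compiled for any equal or newer capability.
        if kind == "compute" and (t_major, t_minor) <= (major, minor):
            return True
    return False


pytorch_flags = "arch=compute_70,code=sm_70;-gencode"  # fragment quoted in the thread
k80_only = "-gencode arch=compute_37,code=sm_37"       # hypothetical K80-only build

print(can_run(pytorch_flags, (7, 0)))  # V100 is compute capability 7.0
print(can_run(k80_only, (6, 0)))       # P100 is compute capability 6.0
```

Under these assumptions the first check passes and the second fails, which matches the "no kernel image is available" symptom: the flags PyTorch was built with say nothing about the flags used for the detectron2 extension.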
Sorry, my fault. That should be solved by
Hi,
I installed and compiled detectron2 on a K80 GPU in a conda environment.
Training on K80 GPUs works fine on several machines.
When training on a P100 GPU, I got the following error:
File "/net/mraid11/export/data/Projects/detectron2/detectron2_repo/detectron2/layers/roi_align.py", line 95, in forward
input, rois, self.output_size, self.spatial_scale, self.sampling_ratio, self.aligned
File "/net/mraid11/export/data/Projects/detectron2/detectron2_repo/detectron2/layers/roi_align.py", line 20, in forward
input, roi, spatial_scale, output_size[0], output_size[1], sampling_ratio, aligned
RuntimeError: CUDA error: no kernel image is available for execution on the device (ROIAlign_forward_cuda at /net/mraid11/export/data/Projects/detectron2/detectron2_repo/detectron2/layers/csrc/ROIAlign/ROIAlign_cuda.cu:361)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x2af4cfc99687 in /home/data/Software/Anaconda3_2018/envs/detectron2/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: detectron2::ROIAlign_forward_cuda(at::Tensor const&, at::Tensor const&, float, int, int, int, bool) + 0x9f7 (0x2af4eab00151 in /net/mraid11/export/data/Projects/detectron2/detectron2_repo/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
frame #2: detectron2::ROIAlign_forward(at::Tensor const&, at::Tensor const&, float, int, int, int, bool) + 0x9c (0x2af4eaacb1bc in /net/mraid11/export/data/Projects/detectron2/detectron2_repo/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
frame #3: + 0x5a00f (0x2af4eaadc00f in /net/mraid11/export/data/Projects/detectron2/detectron2_repo/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
frame #4: + 0x541ef (0x2af4eaad61ef in /net/mraid11/export/data/Projects/detectron2/detectron2_repo/detectron2/_C.cpython-37m-x86_64-linux-gnu.so)
frame #9: THPFunction_apply(_object*, _object*) + 0x8d6 (0x2af4a1ff9e96 in /home/data/Software/Anaconda3_2018/envs/detectron2/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
To Reproduce
Install and run detectron2 on a K80 GPU, then run it on a P100 GPU.
My installation process:
conda create --name detectron2
conda activate detectron2
conda install ipython
pip install ninja yacs cython matplotlib tqdm opencv-python
conda install pytorch torchvision cudatoolkit=9.2 -c pytorch
cd ~/data/Projects/detectron2/
pip install git+https://github.com/facebookresearch/fvcore.git
pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
git clone https://github.com/facebookresearch/detectron2 detectron2_repo
pip install -e detectron2_repo
cd detectron2_repo/datasets/
mkdir -p coco
ln -s /home/data/Datasets/coco/annotations coco/annotations
ln -s /home/data/Datasets/coco/train2017 coco/train2017
ln -s /home/data/Datasets/coco/val2017 coco/val2017
ln -s /home/data/Datasets/coco/test2017 coco/test2017
python tools/train_net.py --config-file configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025
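For a build that covers both GPU types in the cluster, one option is to list every target architecture at compile time. The sketch below is an assumption-laden illustration rather than a confirmed fix for this issue: it assumes the extension build honors the `TORCH_CUDA_ARCH_LIST` environment variable (PyTorch's extension builder reads it in recent versions; this is not verified for the PyTorch 1.3.0 used here). The `COMPUTE_CAPABILITY` table and the `arch_list` helper are hypothetical names; the capability values are NVIDIA's published ones for these GPUs.

```python
# Compute capabilities of the GPUs mentioned in this thread.
COMPUTE_CAPABILITY = {
    "K80": (3, 7),
    "P100": (6, 0),
    "V100": (7, 0),
}


def arch_list(gpu_names):
    """Build a semicolon-separated architecture list, deduplicated and sorted."""
    caps = sorted({COMPUTE_CAPABILITY[name] for name in gpu_names})
    return ";".join(f"{major}.{minor}" for major, minor in caps)


value = arch_list(["K80", "P100"])
# Print the command to run; any previous build output may need to be removed first.
print(f'TORCH_CUDA_ARCH_LIST="{value}" pip install -e detectron2_repo')
```

Printing the command instead of running it keeps the sketch free of side effects; the resulting line would replace the plain `pip install -e detectron2_repo` step above.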
Expected behavior
If there is no obvious error in "what you observed" provided above,
please tell us the expected behavior.
Environment
sys.platform linux
Python 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]
Numpy 1.17.2
Detectron2 Compiler GCC 4.9
Detectron2 CUDA Compiler 9.2
DETECTRON2_ENV_MODULE
PyTorch 1.3.0
PyTorch Debug Build False
torchvision 0.4.1a0+d94043a
CUDA available True
GPU 0,1,2,3 Tesla P100-PCIE-16GB
CUDA_HOME /usr/local/cuda-9.2
NVCC Cuda compilation tools, release 9.2, V9.2.88
Pillow 6.2.0
cv2 4.1.1
PyTorch built with: