Docker: gpu is not detected #20

Open
jalpianissimo opened this issue Jul 1, 2024 · 1 comment


@jalpianissimo

Hi, I wanted to use your program with Docker on Linux, but I ran into the following problems:

Issue 1:

Following the PaddleOCR environment setup instructions, I used the paddlepaddle/paddle:2.1.3-gpu-cuda10.2-cudnn7 image, then installed paddlepaddle-gpu, paddleocr and videocr-PaddleOCR. When I tried to run videocr as in the Colab example, I got the following error:

ImportError: cannot import name 'shadow_var_between_sub_programs' from 'paddle.distributed.passes.pass_utils'

I fixed it by downloading the latest pass_utils.py from the PaddleOCR repo; after that, everything worked.

Issue 2:

Afterward I noticed that the OCR was running on the CPU: nvidia-smi showed no active GPU usage, and the following snippet returned False:

import paddle

# True only if the installed paddle wheel was built with CUDA support
gpu_available = paddle.device.is_compiled_with_cuda()
print("GPU available:", gpu_available)

Since I had already installed the NVIDIA Container Toolkit, I started a new container from the same image to find out whether the problem was in the image or somewhere else:

docker stop ppocr
docker container remove ppocr
docker images -a
docker image rm [ID]
sudo docker run --gpus all --name ppocr -v $PWD:/paddle --shm-size=64G --network=host -it paddlepaddle/paddle:2.1.3-gpu-cuda10.2-cudnn7 /bin/bash

I immediately ran the Python snippet above and it returned True. Then I installed videocr-PaddleOCR and checked again; this time it returned False.
I then repeated the steps on a newer Docker image pulled from Docker Hub (paddlepaddle/paddle:2.6.1-gpu-cuda12.0-cudnn8.9-trt8.6), checking the GPU after starting the container, after installing paddlepaddle-gpu, and after installing videocr. The results were the same as before (but this time without the ImportError): paddleocr alone works on the GPU, but after installing videocr it no longer does.
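For anyone repeating the check after each install step, here is a minimal sketch of a slightly more detailed report. The helper name paddle_gpu_report is made up for illustration; it relies only on paddle.device.is_compiled_with_cuda() (used above) and, where available, paddle.device.cuda.device_count(), which may be missing on older releases. Note that is_compiled_with_cuda() describes the installed wheel rather than the hardware, so a switch from True to False after a pip install suggests the GPU build was replaced by a CPU build.

```python
import importlib


def paddle_gpu_report():
    """Describe what the currently installed paddle build reports about CUDA."""
    report = {
        "paddle_installed": False,
        "version": None,
        "compiled_with_cuda": False,
        "cuda_device_count": 0,
    }
    try:
        paddle = importlib.import_module("paddle")
    except ImportError:
        # paddle is not installed in this environment; return the defaults.
        return report

    report["paddle_installed"] = True
    report["version"] = getattr(paddle, "__version__", None)
    # True only if the installed wheel was built with CUDA support.
    report["compiled_with_cuda"] = bool(paddle.device.is_compiled_with_cuda())
    if report["compiled_with_cuda"]:
        try:
            # Assumption: this call exists in the installed release.
            report["cuda_device_count"] = int(paddle.device.cuda.device_count())
        except Exception:
            pass  # leave the count at 0 if the call is unavailable
    return report


if __name__ == "__main__":
    for key, value in paddle_gpu_report().items():
        print(f"{key}: {value}")
```

Running this once in the fresh container and once after each pip install shows at which step the report changes.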

As I understand very little about programming, my solution (to get videocr running on the GPU) is as follows:

  1. Start a Docker container from the paddlepaddle/paddle:2.6.1-gpu-cuda12.0-cudnn8.9-trt8.6 image.
  2. Clone this repo (git clone https://github.com/devmaxxing/videocr-PaddleOCR) and edit the requirements.txt file so that it includes only:
paddlepaddle-gpu
paddleocr==2.7.0.2
charset-normalizer==3.2.0
colorama==0.4.6
Levenshtein==0.21.1
paddle-bfloat==0.1.7
python-Levenshtein==0.21.1
PyWavelets==1.4.1
thefuzz==0.19.0
  3. Install with python -m pip install . and run! It now uses the GPU (the first snippet returns True even after installing videocr).
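The manual edit in step 2 could also be scripted. The sketch below is only an illustration: the function name gpu_requirements and the sample input are hypothetical, and the real contents of the original requirements.txt are not reproduced here. It drops any line that pins the CPU-only paddlepaddle package and makes sure paddlepaddle-gpu is listed.

```python
import re


def gpu_requirements(lines):
    """Return requirement lines without the CPU-only `paddlepaddle` entry."""
    kept = []
    has_gpu_build = False
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blanks and comments
        # Package name is everything before a version specifier, extra, or marker.
        name = re.split(r"[=<>!~\[; ]", stripped, maxsplit=1)[0]
        name = name.lower().replace("_", "-")
        if name == "paddlepaddle":
            continue  # the CPU build is the one to leave out
        if name == "paddlepaddle-gpu":
            has_gpu_build = True
        kept.append(stripped)
    if not has_gpu_build:
        kept.insert(0, "paddlepaddle-gpu")
    return kept


if __name__ == "__main__":
    # Hypothetical input, not the project's actual requirements.txt.
    sample = ["paddlepaddle==2.5.1", "paddleocr==2.7.0.2", "thefuzz==0.19.0"]
    print("\n".join(gpu_requirements(sample)))
```

The result can be written back to requirements.txt before running python -m pip install . in the cloned repo.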

Note:

I reached this conclusion by chance, but if I had to give a reason, it would be the presence of paddlepaddle in the original requirements.txt file: I noticed that installing paddlepaddle-gpu and paddlepaddle together makes the GPU check return False regardless of the Docker image.
To be safe, I included only the missing dependencies (i.e. on the clean Docker image I installed paddlepaddle-gpu, ran pip freeze, and cross-checked to find everything from the original requirements.txt that was missing), and added paddleocr==2.7.0.2 because of #16, since I encountered the same issue with the latest paddleocr.
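A quick way to test this hypothesis in any environment is to check whether both distributions are installed at once. This is a minimal sketch using only the standard library; the function name conflicting_paddle_builds is made up for illustration. The likely explanation, stated as an assumption, is that both wheels install into the same paddle package directory, so whichever is installed last wins.

```python
from importlib import metadata

PADDLE_BUILDS = {"paddlepaddle", "paddlepaddle-gpu"}


def conflicting_paddle_builds(installed_names=None):
    """Return both build names if the CPU and GPU distributions coexist, else []."""
    if installed_names is None:
        # Collect the names of every distribution visible to this interpreter.
        installed_names = [d.metadata["Name"] for d in metadata.distributions()]
    normalized = {n.lower().replace("_", "-") for n in installed_names if n}
    found = normalized & PADDLE_BUILDS
    return sorted(found) if found == PADDLE_BUILDS else []


if __name__ == "__main__":
    conflict = conflicting_paddle_builds()
    if conflict:
        print("Both builds are installed:", ", ".join(conflict))
    else:
        print("No CPU/GPU paddle conflict detected.")
```

If both names are reported, uninstalling both and reinstalling only paddlepaddle-gpu is one way to get back to a clean state.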

Note-bis:

Here is the code I used (taken from the Colab example) to test videocr:

from videocr import save_subtitles_to_file

#@title OCR parameters
input_file_path = "/home/test.mp4" 
output_file_path = "/home/out.srt" 
language_code = "ch" 
use_gpu = True 
start_time = "0:00" 
end_time = "" 
confidence_threshold = 75 
similarity_threshold = 80 
frames_to_skip = 0 
crop_x = None 
crop_y = None 
crop_width = None 
crop_height = None 

save_subtitles_to_file(input_file_path, output_file_path, lang=language_code,
                       time_start=start_time, time_end=end_time,
                       conf_threshold=confidence_threshold, sim_threshold=similarity_threshold,
                       use_gpu=use_gpu,
                       frames_to_skip=frames_to_skip,
                       crop_x=crop_x, crop_y=crop_y, crop_width=crop_width, crop_height=crop_height)

Note-last:

I wanted to thank you for this program, as it has helped me a lot. I wanted to share my experience because I lost a good few hours on this, but it now seems to be fixed.

@devmaxxing
Owner

Thanks for the heads up. For issue 1, I tested the GPU setup in Google Colab, which uses paddlepaddle-gpu 2.5.1 from https://mirror.baidu.com/pypi/simple, and did not have this issue. Do you have the exact paddlepaddle-gpu and paddleocr versions that were used when you encountered the pass_utils issue?

For issue 2, I have removed the paddlepaddle requirement from the setup.py script and renamed the requirements.txt file to requirements_cpu.txt to make it clearer.
