Cuda is slower than cpu #17
Comments
@dhkdnduq, can you check here? 0. Be sure that the model and the images are on the GPU.
@dhkdnduq, try to reduce the forward pass by cutting out the particular layers you don't need.
@DeepKnowledge1 Thanks for replying. I think it's because of OpenMP; the GPU adds little on top of that parallelism.
(Python and C++ code snippets omitted)
Hello, could I borrow your LibTorch code for PaDiM? Thank you very much indeed.
gpu: RTX 3090
cpu: i5-10400F (6 cores)
with OpenMP
Most of the remaining code is the same; only the Mahalanobis code differs.
This started because the C++ port of the project to LibTorch (CUDA) turned out slower than Python's NumPy.
Timings from image preprocessing through the Mahalanobis loop:
1. OpenCV (CPU): 1.5~2 sec
*GpuMat is not yet supported.
2. LibTorch (CUDA): 0.4~0.45 sec
3. LibTorch (CPU): 0.25~0.35 sec
4. Eigen (CPU): 0.2~0.25 sec
I hope this helps.