Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider oneDNN instead of MKL for SGEMM #706

Open
kpu opened this issue Aug 31, 2020 · 15 comments · May be fixed by #967
Open

Consider oneDNN instead of MKL for SGEMM #706

kpu opened this issue Aug 31, 2020 · 15 comments · May be fixed by #967

Comments

@kpu
Copy link
Member

kpu commented Aug 31, 2020

https://github.com/oneapi-src/oneDNN/ aka MKLDNN aka DNNL now has better performance for MT-size matrices: apache/mxnet#17980 . And it's open source. The same teams write the GEMM for MKL and oneDNN.

Would be worth benchmarking.

@emjotde
Copy link
Member

emjotde commented Aug 31, 2020

Cool, should be easy to check?

@emjotde
Copy link
Member

emjotde commented Aug 31, 2020

Is the expectation of better performance for any arch or AVX512 specific?

@kpu
Copy link
Member Author

kpu commented Aug 31, 2020

I've only bothered to measure AVX512 but we should check. Paging @sidkashyap.

@XapaJIaMnu
Copy link
Contributor

@sidkashyap-at-Intel
Copy link

sidkashyap-at-Intel commented Nov 18, 2020

OneDNN v1.7 improves the performance for older architectures too, including SSE4.1 for int8
https://github.com/oneapi-src/oneDNN/releases/tag/v1.7

We provided the Matrix Multiplications ranks from Marian� Inference for oneDNN to be optimized, the latest version includes those optimizations.

@XapaJIaMnu
Copy link
Contributor

XapaJIaMnu commented Nov 18, 2020

I have a branch with oneDNN. (You also need to disable cblas_sgemm_batched, which i forgot to do) https://github.com/XapaJIaMnu/marian-dev/tree/oneDNN

We need banchmarks to show that it's not slow. Unfortunately, there isn't much incentive to switch to oneDNN completely, as we still need MKL (or some sort of BLAS), because of FAISS requiring things like undefined reference to sorgqr_` @sidkashyap-at-Intel can we get a word to intel people to include some of those basic BLAS routines inside oneDNN?

@sidkashyap-at-Intel
Copy link

Hey @XapaJIaMnu, do we have a priority list of functions in MKL that need to be oneDNN? I will work with the oneDNN team to sort that out if possible.

@XapaJIaMnu
Copy link
Contributor

@sidkashyap-at-Intel

../libmarian.a(VectorTransform.cpp.o): In function `(anonymous namespace)::eig(unsigned long, double*, double*, int)':
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:428: undefined reference to `dsyev_'
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:433: undefined reference to `dsyev_'
../libmarian.a(VectorTransform.cpp.o): In function `faiss::LinearTransform::transform_transpose(long, float const*, float*) const':
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:291: undefined reference to `sgemm_'
../libmarian.a(VectorTransform.cpp.o): In function `matrix_qr(int, int, float*)':
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:98: undefined reference to `sgeqrf_'
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:103: undefined reference to `sgeqrf_'
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:106: undefined reference to `sorgqr_'
../libmarian.a(VectorTransform.cpp.o): In function `faiss::LinearTransform::apply_noalloc(long, float const*, float*) const':
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:266: undefined reference to `sgemm_'
../libmarian.a(VectorTransform.cpp.o): In function `faiss::LinearTransform::set_is_orthonormal()':
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:317: undefined reference to `sgemm_'
../libmarian.a(VectorTransform.cpp.o): In function `faiss::PCAMatrix::prepare_Ab()':
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:710: undefined reference to `sgemm_'
../libmarian.a(VectorTransform.cpp.o): In function `faiss::PCAMatrix::train(long, float const*)':
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:559: undefined reference to `ssyrk_'
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:597: undefined reference to `sgemm_'
/home/nbogoych/marian-dev-tst/src/3rd_party/faiss/VectorTransform.cpp:518: undefined reference to `ssyrk_'
collect2: error: ld returned 1 exit status

Basically, FAISS dependencies.

Cheers,

Nick

@sidkashyap-at-Intel
Copy link

Thank you, this will help in getting the request quantified. Will update on the progress soon.

@sidkashyap-at-Intel
Copy link

Had an internal discussion with @vpirogov from the oneDNN team, unfortunately the support for FAISS MKL dependencies cannot be addressed in oneDNN as it is outside the Deep Learning remit that the library focuses on.

@emjotde
Copy link
Member

emjotde commented Jan 12, 2021

Hi, we need the FAISS support internally, but we can make it depend on finding MKL only?

@ykim362
Copy link
Member

ykim362 commented Jan 12, 2021

Had an internal discussion with @vpirogov from the oneDNN team, unfortunately the support for FAISS MKL dependencies cannot be addressed in oneDNN as it is outside the Deep Learning remit that the library focuses on.

It's used in k-NN MT (https://arxiv.org/pdf/2010.00710.pdf) as well. I see several of those hash, search based methods in DL these days.

@emjotde
Copy link
Member

emjotde commented Jan 12, 2021

@ykim362 very good point! All DNN with retrieval methods would rely on it.

@vpirogov
Copy link

@ykim362, @emjotde,

oneDNN is focused on deep learning algorithms. oneAPI has specialized data analytics library, oneDAL, that supports kNN and other machine learning algorithms.

@kpu
Copy link
Member Author

kpu commented Feb 16, 2022

MKL is blocking Wikipedia from deploying Marian because it is closed source.

@graemenail graemenail mentioned this issue Jun 3, 2022
4 tasks
@graemenail graemenail linked a pull request Sep 14, 2022 that will close this issue
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants