Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Topk refactor #770

Open
wants to merge 28 commits into
base: master
Choose a base branch
from
Open

Topk refactor #770

wants to merge 28 commits into from

Conversation

rhenry-nv
Copy link
Contributor

@rhenry-nv rhenry-nv commented Dec 4, 2020

Description

This PR rewrites the topk implementation. It is an optimization to benefit large batch and does not hinder small batch performance. This PR also changes the topk operator in topk.cu to use the new implementation.

There is a temporary template parameter in the topk launcher to enable/disable indexing relative to a given row. This can be removed if nth_element.cu is removed.

Performance improvements relative to PR #768

Times with 1 stream

Batch Initial Time (s) Current Time(s) % Runtime reduction Speedup factor
1 166.74 166.392 0.002087082 1.002091447
2 114.46 113.005 0.012711864 1.012875536
4 69.39 69.5436 -0.002213575 0.997791314
8 41.52 40.7636 0.018217726 1.01855577
16 25.07 23.9823 0.043386518 1.045354282
32 15.98 14.7954 0.074130163 1.080065426
64 10.57 9.6 0.091769158 1.101041667
128 7.15 6.48 0.093706294 1.103395062
256 5.39 4.65 0.13729128 1.159139785

Times with 2 streams

Batch Initial Time (s) Current Time(s) % Runtime reduction Speedup factor
1 119.03 116.7 0.019574897 1.019965724
2 79.63 78.46 0.014692955 1.014912057
4 48.64 47.83 0.016652961 1.016934978
8 29.07 28.107 0.033126935 1.034261928
16 17.29 16.69 0.03470214 1.03594967
32 10.98 10.24 0.067395264 1.072265625
64 7.19 6.57 0.086230876 1.094368341
128 5.18 4.65 0.102316602 1.113978495
256 4.06 3.47 0.145320197 1.170028818

List of changes:

  • Introduces cub as a dependency
  • Reimplements the topK kernel and replaces the call in nth_element.cu and topk.cu

Added dependencies: cub

How to test

Run the unit tests. There are unit tests that cover the topk operator and they all still passed.

CMake command: cmake .. -DCOMPILE_CPU=on -DCOMPILE_CUDA=on -DUSE_SENTENCEPIECE=on -DUSE_STATIC_LIBS=off -DCOMPILE_SERVER=off -DUSE_FBGEMM=on -DCOMPILE_CUDA_SM35=off -DCOMPILE_CUDA_SM50=off -DCOMPILE_CUDA_SM60=off -DCOMPILE_CUDA_SM70=on -DCOMPILE_CUDA_SM75=off -DCOMPILE_TESTS=on

Ubuntu - 18.04.3 LTS
nvcc - 10.1.243
gcc - 7.5.0

Checklist

  • I have tested the code manually
  • I have run regression tests
  • I have read and followed CONTRIBUTING.md
  • I have updated CHANGELOG.md

Copy link
Contributor

@frankseide frankseide left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this! I like this work very much.

My main general comment is that it would be very helpful to have more high-level comments, for all functions and data structures, as well as high-level blocks inside functions, to state what it does and what it is for.

I understand that some people do not like comments, as it makes the code look less clean. However, high-level comments like this are a service to the reader of the code, since reading a 1 or 2 line comment will be much faster than having to parse and understand the logic of the code itself. Consider that most of the time, readers of the code do not actually care or need to understand the detail implementation of everything. Often, readers need to quickly orient themselves to find the one place that they need to look at.

Thanks!

src/3rd_party/topk.cuh Outdated Show resolved Hide resolved
src/3rd_party/topk.cuh Show resolved Hide resolved
src/3rd_party/topk.cuh Outdated Show resolved Hide resolved
src/3rd_party/topk.cuh Outdated Show resolved Hide resolved
src/3rd_party/topk.cuh Outdated Show resolved Hide resolved
src/3rd_party/topk.cuh Outdated Show resolved Hide resolved
src/3rd_party/topk.cuh Outdated Show resolved Hide resolved
src/3rd_party/topk.cuh Show resolved Hide resolved
src/translator/nth_element.cu Show resolved Hide resolved
src/translator/nth_element.cu Show resolved Hide resolved
@rhenry-nv
Copy link
Contributor Author

I think I have addressed all of your feedback @frankseide.

I added detailed comments about the functions and made the stylistic changes you suggested. Let me if there's anything else that you think needs to be changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants