[3rdparty, document] Updated documentation for triton fused_moe kernel tuning for AMD Instinct GPUs #2191

Merged: 9 commits, Nov 27, 2024

Conversation

kkHuang-amd
Contributor

Motivation

Updated Documentation for triton fused_moe kernel tuning for AMD Instinct GPUs.

Modifications

  • Added a tuning script file
  • Documented the tuning parameter settings
  • Provided example bash commands for running the tuning script

Checklist

  • [O] Update documentation as needed, including docstrings or example tutorials.

@ispobock
Collaborator

Thanks for contributing this great tuning script! Could we use this script to tune on other devices? Any general steps for adaptation?

@kkHuang-amd
Contributor Author

> Thanks for contributing this great tuning script! Could we use this script to tune on other devices? Any general steps for adaptation?

Yes, it can be used on other devices, but some parameters are specific to the ROCm platform and need to be removed when tuning on other platforms.

These two parameters exist only on ROCm:

  1. matrix_instr_nonkdim_range
  2. kpack_range
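As a rough illustration of the adaptation described above, the sketch below builds a tuning search space and includes the two ROCm-only knobs only when targeting ROCm. It is not the code from `benchmark_moe_rocm.py`: the function name `build_search_space`, the dictionary names, and all candidate values are hypothetical, and the keys `matrix_instr_nonkdim` and `kpack` are assumed to be the kernel arguments that `matrix_instr_nonkdim_range` and `kpack_range` feed.

```python
from itertools import product

# Hypothetical candidate values, chosen only for illustration.
COMMON_RANGES = {
    "BLOCK_SIZE_M": [16, 32, 64],
    "BLOCK_SIZE_N": [32, 64],
    "num_warps": [4, 8],
}

# The two ROCm-only knobs named in the comment above; values are assumed.
ROCM_ONLY_RANGES = {
    "matrix_instr_nonkdim": [16, 32],
    "kpack": [1, 2],
}


def build_search_space(is_rocm: bool) -> list[dict]:
    """Return every candidate config, adding ROCm-only knobs only on ROCm."""
    ranges = dict(COMMON_RANGES)
    if is_rocm:
        ranges.update(ROCM_ONLY_RANGES)
    keys = list(ranges)
    # Cartesian product of all candidate values, one dict per combination.
    return [dict(zip(keys, values)) for values in product(*ranges.values())]


if __name__ == "__main__":
    # In a real script the flag could come from the installed PyTorch build,
    # e.g. `torch.version.hip is not None` (an assumption, not taken from the PR).
    print(len(build_search_space(is_rocm=False)), "configs without ROCm knobs")
    print(len(build_search_space(is_rocm=True)), "configs with ROCm knobs")
```

Keeping the platform-specific ranges in a separate dictionary means that porting to another device amounts to dropping or swapping one dictionary rather than editing the sweep loop.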

@HaiShaw
Collaborator

HaiShaw commented Nov 26, 2024

@ispobock Later we may come up with a generic script for more devices, after we collect device-specific triton kargs extensions.
If this one is merged first, we can later update the script to point to the newer one.

from tqdm import tqdm
from transformers import AutoConfig

from sglang.srt.layers.fused_moe_grok.fused_moe import fused_moe, get_config_file_name
Member

hold on
fused_moe_grok will be removed soon

Member

ref #2223

@zhyncs
Member

zhyncs commented Nov 27, 2024

ref #2225

Collaborator

@HaiShaw HaiShaw left a comment

Thanks!

@HaiShaw HaiShaw enabled auto-merge (squash) November 27, 2024 18:00
@zhyncs zhyncs disabled auto-merge November 27, 2024 18:16
@zhyncs zhyncs merged commit a9ca297 into sgl-project:main Nov 27, 2024
1 check passed
@zhyncs
Member

zhyncs commented Nov 27, 2024

Thanks @kkHuang-amd @HaiShaw. Could you help update the doc from fused_moe_grok to fused_moe_triton in a follow-up PR? Thanks!

@HaiShaw
Collaborator

HaiShaw commented Nov 27, 2024

@zhyncs certainly, also notified @merrymercy for #2223
