- Adaptation to Various Lengths of Group M: We have enhanced the repository to support groups with different M lengths, providing more flexibility for diverse use cases.
- Support for Group K GEMM: We have also added support for Group K GEMM operations, expanding the functionality of the original DeepGEMM. NOTE: the K dimension of each group must be padded to a multiple of 128 (see the padding sketch below).
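As a minimal illustration of the padding requirement above (this is not part of the library API; the tensor shapes and helper name are assumptions for the sketch), the per-group K dimension can be rounded up to the next multiple of 128 before calling the grouped kernel:

```python
import torch

def pad_k_to_128(x: torch.Tensor, alignment: int = 128) -> torch.Tensor:
    """Zero-pad the last (K) dimension of `x` to the next multiple of `alignment`."""
    k = x.shape[-1]
    padded_k = (k + alignment - 1) // alignment * alignment
    if padded_k == k:
        return x
    # F.pad takes (left, right) padding amounts for the last dimension
    return torch.nn.functional.pad(x, (0, padded_k - k))

# Three groups with K = 200, 384 and 513 are padded to 256, 384 and 640.
# Zero-padding K is numerically safe as long as both operands of a group are
# padded consistently: the extra columns contribute 0 to each dot product.
groups = [torch.randn(64, k) for k in (200, 384, 513)]
padded = [pad_k_to_128(g) for g in groups]
print([g.shape[-1] for g in padded])  # [256, 384, 640]
```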
- Hopper architecture GPUs, `sm_90a` must be supported
- Python 3.8 or above
- CUDA 12.3 or above
  - But we highly recommend 12.8 or above for the best performance
- PyTorch 2.1 or above
- CUTLASS 3.6 or above (can be cloned as a Git submodule)
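A quick way to verify the Python-side requirements is the sketch below; it only checks the interpreter, PyTorch, and CUDA versions, not the GPU architecture or the CUTLASS checkout.

```python
import sys
import torch

assert sys.version_info >= (3, 8), "Python 3.8 or above is required"

torch_major, torch_minor = (int(v) for v in torch.__version__.split(".")[:2])
assert (torch_major, torch_minor) >= (2, 1), "PyTorch 2.1 or above is required"

cuda = torch.version.cuda  # e.g. "12.4"; None for CPU-only builds
assert cuda is not None and tuple(int(v) for v in cuda.split(".")[:2]) >= (12, 3), \
    "CUDA 12.3 or above is required (12.8+ recommended)"

if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # should report a Hopper GPU (e.g. H100/H800)
```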
```bash
# Submodule must be cloned
git clone --recursive [email protected]:InternLM/AdaptiveGemm.git

# Make symbolic links for third-party (CUTLASS and CuTe) include directories
python setup.py develop

# Test JIT compilation
python tests/test_jit.py

# Test all GEMM implementations (normal, contiguous-grouped and masked-grouped)
python tests/test_varlen_groupm.py
```
```bash
python setup.py install
```
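After installation, a quick import check confirms the package is visible. Note that the module name `deep_gemm` here is an assumption carried over from the upstream DeepGEMM project; the scripts under `tests/` show the imports and kernel entry points actually used by this repository.

```python
# Optional post-install sanity check.
# NOTE: `deep_gemm` is assumed from upstream DeepGEMM; see tests/test_varlen_groupm.py
# for the exact module name and API.
import deep_gemm
print(deep_gemm.__file__)
```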
If you find our work useful, please cite:
```bibtex
@misc{su2025tmaadaptivefp8groupedgemm,
      title={TMA-Adaptive FP8 Grouped GEMM: Eliminating Padding Requirements in Low-Precision Training and Inference on Hopper},
      author={Zhongling Su and Rong Fu and Weihan Cao and Jianfei Gao and Minxi Jin and Zhilin Pei and Hui Wang},
      year={2025},
      eprint={2508.16584},
      archivePrefix={arXiv},
      primaryClass={cs.AR},
      url={https://arxiv.org/abs/2508.16584},
}
```
For detailed information about the original DeepGEMM, please refer to https://github.com/deepseek-ai/DeepGEMM