Remove uses of AITER_ASM_DIR #1900
base: main
Force-pushed from 458e295 to 83c8f2d
@valarLip How much memory does your MI350 node have? It seems that op_tests/test_mla_persistent.py was already broken on MI300 before my change, and I cannot test it on MI350 because 36 GiB is not enough.
@yuguo68 This PR from our XLA team should completely remove the AITER_ASM_DIR environment variable.
Thanks. #1862 gets merged over the weekend, and I am going to use it for the OSS PyTorch aiter update. @draganmladjenovic could this PR build on top of #1862?
#include "asm_fmoe_configs.hpp"
#include "asm_fmoe_code_objects.hpp"
I'm wondering when we need both codegen files.
There are still a few standalone kernels that are not in the tables and are handled manually.
It should supersede it. It has a completely different table format that lives almost entirely in rodata and does not rely on file paths at all.
Force-pushed from 5b96bd0 to cf385ec
Embed code objects into the binary. Use hipRegisterFatBinary so that it works seamlessly on multiple GPUs. Make the CFG tables read-only and AiterAsmKernels statically allocated.
Force-pushed from cf385ec to 424e9fa
@valarLip Your CI is broken. It only appears to pass because you call exit(0) when the kernel is not present: https://github.com/ROCm/aiter/blame/main/csrc/include/aiter_hip_common.h#L39 I changed that to an abort during refactoring, and now I cannot pass CI.
Motivation
Makes sure that the user doesn't have to distribute kernels or set up AITER_ASM_DIR.
Technical Details
Embed code objects into the binary. Use hipRegisterFatBinary so that it works seamlessly on multiple GPUs. Make the CFG tables read-only and AiterAsmKernels statically allocated.
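As a rough illustration of the idea only (not the code in this PR), a read-only table of embedded code objects might look like the sketch below. Every name in it (EmbeddedKernel, kKernelTable, find_kernel, the demo blobs) is hypothetical, the blobs are placeholder bytes rather than real GPU code objects, and no HIP API is called. The point is that the table and the bytes it refers to are constants inside the binary, so a lookup needs neither a file path nor an environment variable.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical placeholder bytes standing in for code objects that a
// build step would embed; real code objects are full GPU binaries.
inline constexpr unsigned char kDemoBlobA[] = {0x7f, 'E', 'L', 'F'};
inline constexpr unsigned char kDemoBlobB[] = {0x7f, 'E', 'L', 'F', 0x02};

// One table row: a kernel name plus a pointer to bytes that live inside
// the binary itself (no path on disk is involved).
struct EmbeddedKernel {
    std::string_view name;
    const unsigned char* data;
    std::size_t size;
};

// constexpr table: constant-initialized, so a toolchain can place it in
// a read-only section and no dynamic allocation happens at startup.
inline constexpr std::array<EmbeddedKernel, 2> kKernelTable{{
    {"demo_kernel_a", kDemoBlobA, sizeof(kDemoBlobA)},
    {"demo_kernel_b", kDemoBlobB, sizeof(kDemoBlobB)},
}};

// Linear lookup by name; returns nullptr when the kernel is not embedded,
// leaving the caller to decide how to report the failure.
constexpr const EmbeddedKernel* find_kernel(std::string_view name) {
    for (const auto& entry : kKernelTable) {
        if (entry.name == name) {
            return &entry;
        }
    }
    return nullptr;
}
```

A caller would use something like `find_kernel("demo_kernel_a")` and treat a null result as a hard error, so that a missing kernel is not silently reported as success.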
Test Plan
Selected tests from op_tests on gfx942