Test horizontal matmul fusion in Llama2FFN test #3610

Merged: 4 commits into main from horizontal_matmul_fusion on Dec 19, 2024

Conversation

@jacobhinkle (Collaborator) commented on Dec 18, 2024:

This removes some barriers to horizontal fusion and updates the test which is currently Ampere-only.

Note that most of the horizontal fusion code hasn't been exercised much, so we might continue hitting small snags as we start using it more. My intention with this PR is to exercise that code automatically by updating the test. Likewise, we will need changes to the canSchedule checks and default heuristics to ensure sane behavior when doing horizontal fusions, so there will likely be more PRs of this flavor soon.
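For readers unfamiliar with the term, "horizontal" fusion here means computing multiple matmuls that share an operand in a single kernel. In a Llama2-style FFN the gate and up projections both consume the same input activations, so a fused kernel can load that shared operand once. Below is a minimal plain-C++ sketch of the idea; it is illustrative only (names like w_gate and w_up are hypothetical), not nvFuser code.

```cpp
#include <vector>

// Naive "horizontally fused" pair of matmuls: gate = x @ w_gate and
// up = x @ w_up share the operand x, so each element of x is read once
// and feeds both accumulators.
void fusedFfnMatmuls(
    const std::vector<float>& x,      // [m, k], row-major
    const std::vector<float>& w_gate, // [k, n], row-major
    const std::vector<float>& w_up,   // [k, n], row-major
    std::vector<float>& gate,         // [m, n]
    std::vector<float>& up,           // [m, n]
    int m, int n, int k) {
  for (int i = 0; i < m; ++i) {
    for (int j = 0; j < n; ++j) {
      float acc_gate = 0.0f;
      float acc_up = 0.0f;
      for (int p = 0; p < k; ++p) {
        const float a = x[i * k + p]; // shared A operand, loaded once
        acc_gate += a * w_gate[p * n + j];
        acc_up += a * w_up[p * n + j];
      }
      gate[i * n + j] = acc_gate;
      up[i * n + j] = acc_up;
    }
  }
}
```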

@jacobhinkle (Collaborator, Author) commented:

!test

@@ -750,14 +760,21 @@ std::unique_ptr<MatmulParams> getMatmulHeuristics(
problem_shape[(size_t)MatmulDimRole::Batch],
inner_dims,
tensor_roles);
// TODO: more sophisticated handling of multiple matmuls when using plugin
mparams->tile_sizes.cta_tile.m /= patterns.size();
@jacobhinkle (Collaborator, Author) commented:

Ideally the heuristic would understand that we might have multiple matmuls being computed in a single main loop, and that the operands for that main loop can be loaded simultaneously. For example, if the A operand is used in two matmuls then we will have 3 operands loaded instead of 2, meaning we should make the CTA tile at most 2/3 as large as it would be for a single matmul. The change here is more conservative for now, to ensure we don't run out of smem.
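As an illustration of that operand-counting idea, here is a standalone sketch, assuming a hypothetical MatmulPattern record that only carries operand identities; this is plain C++, not nvFuser's actual heuristics code. With no sharing between two matmuls it reduces to the division by patterns.size() applied in this PR.

```cpp
#include <algorithm>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for a matmul pattern; only operand identity matters.
struct MatmulPattern {
  const void* a; // identity of the A operand tensor
  const void* b; // identity of the B operand tensor
};

// Count distinct operand tensors across all fused matmuls. Two matmuls
// sharing A contribute 3 unique operands rather than 4.
int64_t countUniqueOperands(const std::vector<MatmulPattern>& patterns) {
  std::set<const void*> operands;
  for (const MatmulPattern& p : patterns) {
    operands.insert(p.a);
    operands.insert(p.b);
  }
  return static_cast<int64_t>(operands.size());
}

// A single matmul keeps 2 operand tiles in shared memory, so scale the tile
// by 2 / unique_operands to hold the total operand footprint roughly
// constant: 2/3 when A is shared between two matmuls, 1/2 when two matmuls
// share nothing (matching the patterns.size() division).
int64_t scaledCtaTileM(
    int64_t single_matmul_tile_m,
    const std::vector<MatmulPattern>& patterns) {
  const int64_t unique = countUniqueOperands(patterns);
  return single_matmul_tile_m * 2 / std::max<int64_t>(unique, 2);
}
```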

@jacobhinkle marked this pull request as ready for review on December 19, 2024 at 15:38.
@jacobhinkle (Collaborator, Author) commented:

!build

@rdspring1 (Collaborator) left a comment:

LGTM to my naive eyes.

Comment: We have many options for matmuls, like FuseMultipleMatmuls and FuseMatmul. Is that because nvfuser accepts matmuls but routes them to aten by default? I wonder if online tuning would be useful to pick between aten and nvfuser.

csrc/scheduler/matmul_utils.cpp (review thread, resolved)
@jacobhinkle (Collaborator, Author) replied:

> Comment: We have many options for matmuls, like FuseMultipleMatmuls and FuseMatmul. Is that because nvfuser accepts matmuls but routes them to aten by default? I wonder if online tuning would be useful to pick between aten and nvfuser.

No, I don't think it should be a scheduler decision. The intention is for us to trust our matmul fusion enough to enable both of these by default and convert them to DisableOptions. It is true that we might have more matmul options related to fusion, though, so maybe we could make multiple_matmuls an argument to the fuse_matmul option, so we could use it like NVFUSER_ENABLE=fuse_matmul(multiple_matmuls) instead.

@jacobhinkle (Collaborator, Author) commented:

!build

@jacobhinkle merged commit 962f002 into main on Dec 19, 2024. 17 checks passed.
@jacobhinkle deleted the horizontal_matmul_fusion branch on December 19, 2024 at 19:59.