[GPU] Use scf.if for forall overhangs #19125

Open · krzysz00 wants to merge 1 commit into main from users/krzysz00/use-if-for-overhang

Conversation

krzysz00 (Contributor)

In cases where we can't determine whether the number of workitems per workgroup evenly divides the number of items an scf.forall needs to process, the current code uses
`scf.for %i = %id to %upperBound step %numWorkitems` so that the last loop iteration only runs on the expected fraction of workitems.

This commit instead emits a post-loop if statement, which lets the for loop that the forall is lowered to use linearize (and a step-1 loop) in its main body.
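For illustration, the two lowering shapes look roughly like this; the value names (%tid, %total, %numThreads, %fullIters) and exact op spellings are placeholders rather than code from this patch:

```mlir
// Current shape when divisibility is unknown: a strided loop, so the
// final (partial) pass only runs on the leading workitems.
scf.for %i = %tid to %total step %numThreads {
  // ... forall body, indexed by %i ...
}

// Shape with this change: a step-1 loop over the evenly divisible
// portion using a linearized index, plus a post-loop guard for the
// overhang. Note that the body is emitted twice.
scf.for %j = %c0 to %fullIters step %c1 {
  %idx = affine.linearize_index disjoint [%j, %tid] by (%fullIters, %numThreads) : index
  // ... forall body, indexed by %idx ...
}
%tail = affine.linearize_index [%fullIters, %tid] by (%fullIters, %numThreads) : index
%inBounds = arith.cmpi slt, %tail, %total : index
scf.if %inBounds {
  // ... forall body, indexed by %tail ...
}
```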

@MaheshRavishankar (Contributor) left a comment

I see what you are doing here... it sort of makes sense... but there is some nuance here that @qedawkins was explaining to me some time ago. I'd wait for his review.

@qedawkins (Contributor) left a comment

I see what you're going for but this feels like it might be a premature optimization to me. This means every time we resolve a forall we're duplicating the full body of the loop. This makes sense for simple cases like copies, but is probably overkill for larger loops. In other words, I think the decision to do this is tied to whether we want to unroll the loop.

Before landing this change, I'd like at least some signal that it's better. We can try using the ONNX model suite for this; there are some instructions for how to run it here.

This will give us a report comparing performance for a few models, plus overall model support numbers, which should hopefully give some kind of signal.

@krzysz00 (Contributor, Author)

Yeah, agreed that this is
a) not on the critical path for the other changes I'm making, and
b) something that needs to "get perf'd".

... I could probably skip the if thing and move to linearize disjoint in the perfect-division case as well, which I'd expect to be roughly NFC.
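(For concreteness, a minimal sketch of that perfect-division variant with made-up names; no trailing scf.if is needed because every workitem runs the same number of passes:)

```mlir
// Perfect division: each workitem runs exactly %itersPerThread passes.
scf.for %j = %c0 to %itersPerThread step %c1 {
  %idx = affine.linearize_index disjoint [%j, %tid] by (%itersPerThread, %numThreads) : index
  // ... forall body, indexed by %idx ...
}
```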

@qedawkins (Contributor)

Yeah, the use of linearize SGTM.

@krzysz00 (Contributor, Author)

@qedawkins That page you linked 404s for me. Do I need to get added to something?

@qedawkins (Contributor)

Oh shoot, yeah, let me remove that. Let's follow up offline because someone else will need to add you.

krzysz00 force-pushed the users/krzysz00/gpu-distribute-with-linearize branch 2 times, most recently from 5328767 to 5a8fa83, on November 21, 2024 at 20:54
krzysz00 force-pushed the users/krzysz00/gpu-distribute-with-linearize branch from 5a8fa83 to 291f570 on November 26, 2024 at 19:24
Base automatically changed from users/krzysz00/gpu-distribute-with-linearize to main on November 26, 2024 at 20:38
krzysz00 force-pushed the users/krzysz00/use-if-for-overhang branch from 94e406d to 9368746 on November 26, 2024 at 23:31