[mlir] computeSliceParameters: Fix offset when m(0) != 0 #122492

mgehre-amd · 2025-01-10T16:40:56Z

For affine maps where m(0) != 0,
like affine_map<(d0) -> (d0 + 3) in

  %generic = linalg.generic
    {indexing_maps = [affine_map<(d0) -> (d0 + 3)>,
                      affine_map<(d0) -> (d0)>],
     iterator_types = ["parallel"]} ins(%arg0: tensor<9xf32>) outs(%empty : tensor<6xf32>) {
    ^bb0(%in : f32, %out: f32):
      linalg.yield %in : f32
    } -> tensor<6xf32>

tiling currently computes the wrong slice offsets. When tiling above example with a size of 3, it would compute

scf.for %i = ...
  %slice = tensor.extract_slice %arg0[%i + 3] [6] [1]
  linalg.generic
    {indexing_maps = [affine_map<(d0) -> (d0 + 3)>,
                      affine_map<(d0) -> (d0)>],
     iterator_types = ["parallel"]} ins(%slice: tensor<6xf32>)

and thus apply the +3 twice (once in the extract slice and a second time in the linalg.generic).

This PR fixes this to yield an offset of
tensor.extract_slice %arg0[%i] [6] [1] instead.

like `affine_map<(d0) -> (d0 + 3)` in ``` %generic = linalg.generic {indexing_maps = [affine_map<(d0) -> (d0 + 3)>, affine_map<(d0) -> (d0)>], iterator_types = ["parallel"]} ins(%arg0: tensor<9xf32>) outs(%empty : tensor<6xf32>) { ^bb0(%in : f32, %out: f32): linalg.yield %in : f32 } -> tensor<6xf32> ``` tiling currently computes the wrong slice offsets. When tiling above example with a size of 3, it would compute ``` scf.for %i = ... %slice = tensor.extract_slice %arg0[%i + 3] [6] [1] linalg.generic {indexing_maps = [affine_map<(d0) -> (d0 + 3)>, affine_map<(d0) -> (d0)>], iterator_types = ["parallel"]} ins(%slice: tensor<6xf32>) ``` and thus apply the `+3` twice (once in the extract slice and a second time in the linalg.generic). This PR fixes this to yield an offset of `tensor.extract_slice %arg0[%i] [6] [1]` instead.

llvmbot · 2025-01-10T16:41:32Z

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-linalg

Author: Matthias Gehre (mgehre-amd)

Changes

For affine maps where m(0) != 0,
like affine_map<(d0) -> (d0 + 3) in

  %generic = linalg.generic
    {indexing_maps = [affine_map&lt;(d0) -&gt; (d0 + 3)&gt;,
                      affine_map&lt;(d0) -&gt; (d0)&gt;],
     iterator_types = ["parallel"]} ins(%arg0: tensor&lt;9xf32&gt;) outs(%empty : tensor&lt;6xf32&gt;) {
    ^bb0(%in : f32, %out: f32):
      linalg.yield %in : f32
    } -&gt; tensor&lt;6xf32&gt;

tiling currently computes the wrong slice offsets. When tiling above example with a size of 3, it would compute

scf.for %i = ...
  %slice = tensor.extract_slice %arg0[%i + 3] [6] [1]
  linalg.generic
    {indexing_maps = [affine_map&lt;(d0) -&gt; (d0 + 3)&gt;,
                      affine_map&lt;(d0) -&gt; (d0)&gt;],
     iterator_types = ["parallel"]} ins(%slice: tensor&lt;6xf32&gt;)

and thus apply the +3 twice (once in the extract slice and a second time in the linalg.generic).

This PR fixes this to yield an offset of
tensor.extract_slice %arg0[%i] [6] [1] instead.

Full diff: https://github.com/llvm/llvm-project/pull/122492.diff

2 Files Affected:

(modified) mlir/lib/Dialect/Linalg/Utils/Utils.cpp (+10-1)
(added) mlir/test/Dialect/Linalg/tile-offset.mlir (+25)

diff --git a/mlir/lib/Dialect/Linalg/Utils/Utils.cpp b/mlir/lib/Dialect/Linalg/Utils/Utils.cpp
index 38e427af1c4846..2ebeccb1505048 100644
--- a/mlir/lib/Dialect/Linalg/Utils/Utils.cpp
+++ b/mlir/lib/Dialect/Linalg/Utils/Utils.cpp
@@ -596,8 +596,17 @@ computeSliceParameters(OpBuilder &builder, Location loc, Value valueToTile,
     auto m = map.getSubMap({r});
     LLVM_DEBUG(llvm::dbgs() << "computeSliceParameters: submap: " << m << "\n");
     IRRewriter rewriter(builder);
-    OpFoldResult offset = makeComposedFoldedAffineApply(rewriter, loc, m, lbs);
+    // The offset of the slice is map(lbs) - map(0).
+    SmallVector<Attribute> zeros(lbs.size(), rewriter.getIndexAttr(0));
+    SmallVector<Attribute> mAtZero;
+    auto res = m.constantFold(zeros, mAtZero);
+    assert(succeeded(res));
+    auto atZeroInt = getConstantIntValue(mAtZero[0]);
+    assert(atZeroInt);
+    OpFoldResult offset = makeComposedFoldedAffineApply(
+        rewriter, loc, m.getResult(0) - *atZeroInt, lbs);
     sliceParams.offsets.push_back(offset);
+
     OpFoldResult closedIntSize =
         makeComposedFoldedAffineApply(rewriter, loc, m, subShapeSizes);
     // Resulting size needs to be made half open interval again.
diff --git a/mlir/test/Dialect/Linalg/tile-offset.mlir b/mlir/test/Dialect/Linalg/tile-offset.mlir
new file mode 100644
index 00000000000000..41763dd688c1e9
--- /dev/null
+++ b/mlir/test/Dialect/Linalg/tile-offset.mlir
@@ -0,0 +1,25 @@
+// RUN: mlir-opt %s -transform-interpreter -split-input-file | FileCheck %s
+
+// CHECK-LABEL: func @tile_offset
+//  CHECK-SAME:   %[[ARG0:[a-zA-Z0-9_]+]]:
+func.func @tile_offset(%arg0 : tensor<9xf32>) -> tensor<6xf32> {
+  %empty = tensor.empty() : tensor<6xf32>
+  // CHECK: scf.for %[[ITER:[a-zA-Z0-9_]+]] =
+  // CHECK: tensor.extract_slice %[[ARG0]][%[[ITER]]] [6] [1]
+  %generic = linalg.generic
+    {indexing_maps = [affine_map<(d0) -> (d0 + 3)>,
+                      affine_map<(d0) -> (d0)>],
+     iterator_types = ["parallel"]} ins(%arg0: tensor<9xf32>) outs(%empty : tensor<6xf32>) {
+    ^bb0(%in : f32, %out: f32):
+      linalg.yield %in : f32
+    } -> tensor<6xf32>
+  return %generic : tensor<6xf32>
+}
+
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
+    %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
+    %1, %loop = transform.structured.tile_using_for %0 tile_sizes [3] : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
+    transform.yield
+  }
+}

MaheshRavishankar · 2025-01-13T01:53:48Z

mlir/lib/Dialect/Linalg/Utils/Utils.cpp

+    auto res = m.constantFold(zeros, mAtZero);
+    assert(succeeded(res));


the res is not used anyway... So in non-debug builds this will show up as warning/error (if warnings are converted to error). What is the constantFold for?

Could you also add a message to the assert.

The m.constantFold is evaluating the affine map m with all inputs being zero.
The constantFold returns failure() when the affine map has symbols, but afaik linalg guarantees that the affine map has no symbols, thus the evaluation of m into a constant is always possible. (Thus the assert; I added a message to it).

I also added [[maybe_unused]] to res to avoid errors in non-debug configs.

mlir/lib/Dialect/Linalg/Utils/Utils.cpp

mgehre-amd · 2025-01-20T12:43:11Z

Friendly ping :-)

MaheshRavishankar

Thanks! Took me a while to work through if this is the easiest way of doing this. I cant think of anything better.

mgehre-amd requested review from dcaballe, nicolasvasilache and rengolin as code owners January 10, 2025 16:40

llvmbot added mlir:linalg mlir labels Jan 10, 2025

MaheshRavishankar reviewed Jan 13, 2025

View reviewed changes

Add assert message; make integer extraction more direct

a01658f

mgehre-amd requested a review from MaheshRavishankar January 13, 2025 08:59

Add [[maybe_unused]] to bool result

e0201e9

MaheshRavishankar approved these changes Jan 20, 2025

View reviewed changes

mgehre-amd merged commit 67b9d3f into llvm:main Jan 21, 2025
8 checks passed

mgehre-amd deleted the matthias.fix_tiling_slice_offset_upstream branch January 21, 2025 09:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[mlir] computeSliceParameters: Fix offset when m(0) != 0 #122492

[mlir] computeSliceParameters: Fix offset when m(0) != 0 #122492

mgehre-amd commented Jan 10, 2025

llvmbot commented Jan 10, 2025 •

edited

Loading

MaheshRavishankar Jan 13, 2025

mgehre-amd Jan 13, 2025 •

edited

Loading

mgehre-amd Jan 13, 2025

mgehre-amd commented Jan 20, 2025

MaheshRavishankar left a comment

		auto res = m.constantFold(zeros, mAtZero);
		assert(succeeded(res));

[mlir] computeSliceParameters: Fix offset when m(0) != 0 #122492

[mlir] computeSliceParameters: Fix offset when m(0) != 0 #122492

Conversation

mgehre-amd commented Jan 10, 2025

llvmbot commented Jan 10, 2025 • edited Loading

MaheshRavishankar Jan 13, 2025

Choose a reason for hiding this comment

mgehre-amd Jan 13, 2025 • edited Loading

Choose a reason for hiding this comment

mgehre-amd Jan 13, 2025

Choose a reason for hiding this comment

mgehre-amd commented Jan 20, 2025

MaheshRavishankar left a comment

Choose a reason for hiding this comment

llvmbot commented Jan 10, 2025 •

edited

Loading

mgehre-amd Jan 13, 2025 •

edited

Loading