
Add support for dynamic strides #891

Draft

suryajasper wants to merge 3 commits into iree-org:main from suryajasper:dynamic-strides

Conversation


@suryajasper (Contributor) commented on Feb 17, 2026

Problem

Wave treats all tensor arguments as if they were contiguous. For a non-contiguous input such as A = torch.randn((M, K * 4))[:, :K], the shape is (M, K) but the strides are (K * 4, 1). Because the Wave runtime passes only a pointer to the linearized tensor data, with no stride information, such buffers are addressed as if they were densely packed, producing numerically incorrect results.
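
A minimal reproduction of the layout mismatch (sizes chosen arbitrarily for illustration):

    import torch

    M, K = 64, 32
    A = torch.randn((M, K * 4))[:, :K]  # a slice of a wider buffer

    print(A.shape)            # torch.Size([64, 32])
    print(A.stride())         # (128, 1) -- row stride is K * 4, not K
    print(A.is_contiguous())  # False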

Solution

This PR adds universal support for non-contiguous input and output tensor layouts through the Wave runtime. Wave automatically threads one stride per dimension per buffer through the pipeline and uses them in the kernel IR, so loads and stores address the correct layout:

  • The kernel signature gains extra index arguments, one stride per dimension of each buffer.
  • The emitter builds a memref.reinterpret_cast with those stride values and a layout such as strided<[?, 1]> (dynamic leading strides, unit stride on the last dimension to satisfy vector.load).
  • During runtime invocation, Wave reads the tensor metadata and passes the appropriate stride for each dimension; the kernarg buffer in the C++ runtime is extended with a strides section after pointers, scalars, and dynamic dims (see the sketch below).
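
A rough sketch of the runtime-side stride collection (the function name and structure are illustrative assumptions, not the actual Wave implementation):

    import torch

    def collect_stride_args(buffers: list[torch.Tensor]) -> list[int]:
        # One stride per dimension per buffer, in binding order. These values
        # are appended to the kernarg buffer after pointers, scalars, and
        # dynamic dims.
        stride_args: list[int] = []
        for t in buffers:
            stride_args.extend(t.stride())  # element strides, one per dimension
        return stride_args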

Validation

Added new test: dynamic_strides_test.py

Signed-off-by: Surya Jasper <45545431+suryajasper@users.noreply.github.com>
- Removed compile option (enabled by default)
- Added lit test
- Added new pytests for non-contiguous output & strided w/ offsets

Signed-off-by: Surya Jasper <45545431+suryajasper@users.noreply.github.com>

@require_e2e
@require_cdna_3_or_4
@pytest.mark.xfail(reason="Dynamic strides are not supported in the ASM backend yet")
Can we set dynamic_stride=False instead of xfailing?

Comment on lines +156 to +162
if self.options.wave_runtime:
    stride_arg_count = sum(
        len(b.kernel_buffer_type.symbolic_shape)
        for b in self.root_sig.sig.kernel_buffer_bindings
    )
    if stride_arg_count > 0:
        arg_types += [IndexType.get()] * stride_arg_count
does dynamic stride not work with non wave_runtime?

Comment on lines +220 to +223
if rank == 1:
    stride_vals = [arith_d.constant(IndexType.get(), 1)]
    static_strides = [1]
    layout = StridedLayoutAttr.get(offset=dyn_val, strides=[1])
Does the latter case not work when rank==1?

Comment on lines +100 to +101
for d in range(arg_tensor.dim()):
    stride_values.append(arg_tensor.stride(d))
is it possible to do .stride() instead of doing for d in range(arg_tensor.dim())?
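
For reference, a one-line sketch of that simplification (assuming arg_tensor is a torch.Tensor, whose .stride() with no argument returns all strides at once as a tuple):

    stride_values.extend(arg_tensor.stride())  # one element stride per dimension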
