@ukkz ukkz commented Jan 12, 2026

Problem

When transposing a tensor with int32 input, the WebGPU backend emits a WGSL shader compilation error and produces incorrect output values.

```
Error: type mismatch for argument 2 in call to 'setOutputAtIndex', expected 'f32', got 'i32'
```

This occurs because A[...] returns i32 when the input is int32, while setOutputAtIndex expects f32.

Technical Background

The WebGPU backend stores all tensor data as float32 in GPU buffers by design. The output buffer is always declared as array<f32>, and setOutputAtIndex (as well as setOutputAtIndexI32) always writes f32 values. This means TransposeProgram must always output f32.

However, certain operations like OneHot produce int32 tensors at runtime, even when the model graph declares float types. When such an int32 tensor flows into TransposeProgram, the shader reads from A: array<i32> but attempts to pass the i32 value directly to setOutputAtIndex, which expects f32. Since WGSL is strictly typed and does not allow implicit type conversion, the shader compilation fails.
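A minimal repro sketch of that flow, assuming the standard tfjs API (the tensor values here are illustrative, not taken from the PR):

```ts
import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-backend-webgpu';

// oneHot produces an int32 tensor at runtime; feeding it into transpose
// on the WebGPU backend previously triggered the WGSL type-mismatch error
// quoted above.
async function repro() {
  await tf.setBackend('webgpu');
  const indices = tf.tensor1d([0, 2, 1], 'int32');
  const hot = tf.oneHot(indices, 3);   // shape [3, 3], dtype 'int32'
  const out = tf.transpose(hot);       // previously failed to compile
  console.log(out.dtype, out.shape);   // 'int32' [3, 3]
  out.print();
}
repro();
```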

Fix

Add an explicit f32() cast around the array access.
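In shader terms, the change is a one-token cast at the output write. A sketch, where `aIndex` stands in for whatever index expression the generated shader actually uses:

```wgsl
// Before: A is declared as array<i32>, so A[aIndex] is i32 and the call
// to setOutputAtIndex (which takes f32) fails to compile.
setOutputAtIndex(index, A[aIndex]);

// After: explicit conversion. f32() applied to an f32 value is a no-op,
// so float32 inputs are unaffected.
setOutputAtIndex(index, f32(A[aIndex]));
```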

Notes

  • The f32 cast is correct for all cases because the WebGPU backend always stores output as float32
  • This is a minimal fix for models (e.g., ONNX-converted) that have int32 intermediate tensors from operations like OneHot
  • Backward compatible: f32() on float32 is a no-op

ukkz added 2 commits January 12, 2026 18:11
When transposing an int32 tensor, the WGSL shader fails with a type mismatch error and produces incorrect output values.

Add an explicit f32() cast. This assumes float32 output (the standard case).
int32 output is not tested or expected in the current codebase.
Test cases added:
- int32 2D, 3D, and 5D transpose
- bool 2D transpose

These tests verify that non-float32 input tensors are transposed correctly, with shape and dtype preserved.
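An illustrative sketch of what one such test might look like in tfjs's jasmine style (the actual test code is in the PR diff and may differ in detail):

```ts
import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-backend-webgpu';

it('int32 2D transpose', async () => {
  const t = tf.tensor2d([1, 2, 3, 4, 5, 6], [2, 3], 'int32');
  const result = tf.transpose(t);
  expect(result.dtype).toBe('int32');     // dtype is preserved
  expect(result.shape).toEqual([3, 2]);   // shape is transposed
  const data = await result.data();       // Int32Array
  expect(Array.from(data)).toEqual([1, 4, 2, 5, 3, 6]);
});
```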