Difference in operation ordering between --EmitONNXIR and --EmitONNXBasic #3010

Open
flemairen6 opened this issue Nov 13, 2024 · 1 comment

Comments

@flemairen6
Collaborator

I am trying to understand one of the optimizations that seems to run when using --EmitONNXIR compared to --EmitONNXBasic.
If we take the following example

      <
          ir_version: 8,
          opset_import: ["" : 19]
      >
      main (float[1,64,8] input) => (float[1,64,4] out0) {
          splitCst = Constant <value: tensor = int64[2] {4, 4}> ()
          cst4 = Constant <value = float {1.0}> ()
          cst5 = Constant <value = float {0.5}> ()
          split1, split2 = Split <axis: int = 2> (input, splitCst)
          add = Add(split1, cst4)
          mul3 = Mul(split1, add)
          mul4 = Mul(mul3, cst5)
          out0 = Mul(mul4, split2)
      }

and run onnx-mlir --EmitONNXBasic MyModel.onnx, I get:

  func.func @main_graph(%arg0: tensor<1x64x8xf32> {onnx.name = "input"}) -> (tensor<1x64x4xf32> {onnx.name = "out0"}) {
    %0 = onnx.Constant dense<4> : tensor<2xi64>
    %1 = onnx.Constant dense<1.000000e+00> : tensor<f32>
    %2 = onnx.Constant dense<5.000000e-01> : tensor<f32>
    %3:2 = "onnx.Split"(%arg0, %0) {axis = 2 : si64, onnx_node_name = "Split3"} : (tensor<1x64x8xf32>, tensor<2xi64>) -> (tensor<1x64x4xf32>, tensor<1x64x4xf32>)
    %4 = "onnx.Add"(%3#0, %1) {onnx_node_name = "Add4"} : (tensor<1x64x4xf32>, tensor<f32>) -> tensor<1x64x4xf32>
    %5 = "onnx.Mul"(%3#0, %4) {onnx_node_name = "Mul5"} : (tensor<1x64x4xf32>, tensor<1x64x4xf32>) -> tensor<1x64x4xf32>
    %6 = "onnx.Mul"(%5, %2) {onnx_node_name = "Mul6"} : (tensor<1x64x4xf32>, tensor<f32>) -> tensor<1x64x4xf32>
    %7 = "onnx.Mul"(%6, %3#1) {onnx_node_name = "Mul7"} : (tensor<1x64x4xf32>, tensor<1x64x4xf32>) -> tensor<1x64x4xf32>
    onnx.Return %7 : tensor<1x64x4xf32>
  }

Here the order of the Mul ops is the same as in the textual (or binary) version of the model (notice that the outputs of Split are used in the first and last Mul).
Now, if I run onnx-mlir --EmitONNXIR MyModel.onnx, I have:

  func.func @main_graph(%arg0: tensor<1x64x8xf32> {onnx.name = "input"}) -> (tensor<1x64x4xf32> {onnx.name = "out0"}) {
    %0 = onnx.Constant dense<4> : tensor<2xi64>
    %1 = onnx.Constant dense<1.000000e+00> : tensor<f32>
    %2 = onnx.Constant dense<5.000000e-01> : tensor<f32>
    %3:2 = "onnx.Split"(%arg0, %0) {axis = 2 : si64, onnx_node_name = "Split3"} : (tensor<1x64x8xf32>, tensor<2xi64>) -> (tensor<1x64x4xf32>, tensor<1x64x4xf32>)
    %4 = "onnx.Add"(%3#0, %1) {onnx_node_name = "Add4"} : (tensor<1x64x4xf32>, tensor<f32>) -> tensor<1x64x4xf32>
    %5 = "onnx.Mul"(%3#0, %4) {onnx_node_name = "Mul5"} : (tensor<1x64x4xf32>, tensor<1x64x4xf32>) -> tensor<1x64x4xf32>
    %6 = "onnx.Mul"(%5, %3#1) {onnx_node_name = "Mul7-Constant2-Mul6_0"} : (tensor<1x64x4xf32>, tensor<1x64x4xf32>) -> tensor<1x64x4xf32>
    %7 = "onnx.Mul"(%6, %2) {onnx_node_name = "Mul7-Constant2-Mul6_1"} : (tensor<1x64x4xf32>, tensor<f32>) -> tensor<1x64x4xf32>
    return %7 : tensor<1x64x4xf32>
  }

In this case, the last Mul (the one using the second Split output) has been moved up compared to the previous IR, and the two final Muls now carry the fused onnx_node_name Mul7-Constant2-Mul6, which suggests a rewrite pattern recombined them.

I am trying to understand what pass or optimization could be causing this behaviour, and if there is a way to disable it without disabling other optimizations. Could someone help me with that?

Thanks a lot in advance!

@AlexandreEichenberger
Collaborator

That feels like it might be constant propagation. Have you tried running with -mlir-print-ir-after-all? It may list the offending optimization.
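For reference, a minimal invocation sketch, assuming onnx-mlir forwards the generic MLIR pass-manager flags and that, as with mlir-opt, the per-pass dumps go to stderr:

    # Dump the IR after every pass; capture stderr in a file.
    onnx-mlir --EmitONNXIR --mlir-print-ir-after-all MyModel.onnx 2> pass_dumps.txt
    # Each dump is preceded by an "IR Dump After <PassName>" header,
    # so this lists the passes in the order they ran.
    grep "IR Dump After" pass_dumps.txt

The first dump in which the original Mul7 disappears (or is renamed to Mul7-Constant2-Mul6_*) points to the pass responsible.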

Also, if it lists the hybrid pass, where multiple optimizations are mashed together, we used to have an alternative to that pass, but it might have been yanked.

My recollection is that basic was before anything was done, and the other one was after shape inference. So is there a reason you still want to run with some, but not all, of the optimizations present in EmitONNXIR?

We can probably adapt when the code is emitted for that later target, and/or add another target that emits after only the passes you want.
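In the meantime, one way to narrow it down is to replay individual passes on the --EmitONNXBasic output with onnx-mlir-opt. This is only a sketch: the .onnx.mlir output naming and the exact pass flags (shape-inference, constprop-onnx) are from memory, so check onnx-mlir-opt --help for the spellings in your build:

    # Start from the untouched ONNX dialect produced by --EmitONNXBasic.
    onnx-mlir --EmitONNXBasic MyModel.onnx -o MyModel_basic
    # Apply candidate passes one at a time and diff the results to see
    # which one reorders the Muls.
    onnx-mlir-opt --shape-inference MyModel_basic.onnx.mlir -o after_shape.mlir
    onnx-mlir-opt --canonicalize after_shape.mlir -o after_canon.mlir
    onnx-mlir-opt --constprop-onnx after_canon.mlir -o after_constprop.mlir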
