benchdnn: inputs: graph: add compressed sdpa with per-channel quant quant
wzt1997 committed Dec 26, 2024
1 parent a9372c9 commit 7d7be8a
Showing 2 changed files with 429 additions and 0 deletions.
1 change: 1 addition & 0 deletions tests/benchdnn/inputs/graph/complex_fusion/harness_mha_all
@@ -45,6 +45,7 @@
--reset --dt=f32,bf16,f16 --in-shapes=0:acbd+1:acbd+8:acbd --case=complex_fusion/mha/sdpa-plain-simplified-f16.json
--reset --dt=f32,bf16,f16 --in-shapes=3:384,3:384x384,3:1x16x384x384 --case=complex_fusion/mha/sdpa-plain-scale-by-mul-f16.json
--reset --op-attrs=34107656704:group_shape:1x1x1x32+34107654464:transpose_b:1 --in-shapes=0:1x32x32x128+1:1x32x32x4+2:1x32x32x4 --case=complex_fusion/mha/sdpa-compressed-k-int8-gs32.json
--reset --op-attrs=34107656704:axis:2+34107654464:transpose_b:1 --in-shapes=0:1x32x32x128 --case=complex_fusion/mha/sdpa-compressed-k-int8-per-channel.json

# Re-written int8 graphs
--reset --in-shapes=5:4x16x32x256+4:4x16x256x33+0:4x16x33x256+1:4x1x1x33+3:4x1x32x33 --case=complex_fusion/mha/MHA-GPT-inf-int8-bs1.json
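The two compressed-SDPA lines in the hunk above pack several settings into `--op-attrs` and `--in-shapes`. The sketch below is only a reading aid, not benchdnn code: the helper names (`parse_op_attrs`, `parse_in_shapes`, `grouped_shape`) are hypothetical, and the parsing covers just the `+`-separated `ID:...` form with `x`-separated dims seen on these lines (not the comma-separated lists or layout tags such as `acbd` used elsewhere in the file). It also assumes, without confirmation from the commit, that a `group_shape` divides a tensor's dims elementwise to give the per-group parameter shape.

```python
def parse_dims(text):
    """Turn '1x32x32x128' into (1, 32, 32, 128)."""
    return tuple(int(d) for d in text.split("x"))


def parse_op_attrs(value):
    """Split 'ID:name:value+ID:name:value' into {op_id: {name: value}}."""
    attrs = {}
    for item in value.split("+"):
        op_id, name, attr_value = item.split(":", 2)
        attrs.setdefault(int(op_id), {})[name] = attr_value
    return attrs


def parse_in_shapes(value):
    """Split 'ID:dims+ID:dims' into {tensor_id: dims_tuple}."""
    shapes = {}
    for item in value.split("+"):
        tensor_id, dims = item.split(":", 1)
        shapes[int(tensor_id)] = parse_dims(dims)
    return shapes


def grouped_shape(shape, group_shape):
    """Assumed rule: one parameter per group, so divide dims elementwise."""
    if len(shape) != len(group_shape):
        raise ValueError("rank mismatch between shape and group_shape")
    if any(s % g for s, g in zip(shape, group_shape)):
        raise ValueError("group_shape does not evenly divide shape")
    return tuple(s // g for s, g in zip(shape, group_shape))


# Values copied from the sdpa-compressed-k-int8-gs32.json line of the harness.
op_attrs = parse_op_attrs(
    "34107656704:group_shape:1x1x1x32+34107654464:transpose_b:1"
)
in_shapes = parse_in_shapes("0:1x32x32x128+1:1x32x32x4+2:1x32x32x4")

group = parse_dims(op_attrs[34107656704]["group_shape"])
print(grouped_shape(in_shapes[0], group))  # (1, 32, 32, 4)
```

Under that assumed rule, a `1x1x1x32` group over the `1x32x32x128` input gives `1x32x32x4`, which matches the shapes listed for inputs 1 and 2 on the gs32 line. The per-channel line passes `axis:2` instead of a `group_shape` and lists only input 0, so the same check does not apply to it.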
