Logcumsumexp has different results between CPU and XPU on BF16/Complex64/Complex128 #1012

LuFinch · 2024-10-23T02:07:22Z

🐛 Describe the bug

BF16:

PYTORCH_TEST_WITH_SLOW=1 python test/xpu/extended/test_ops_xpu.py TestCommonXPU.test_compare_cpu_logcumsumexp_xpu_bfloat16

Mismatched elements: 2 / 125 (1.6%)
Greatest absolute difference: 0.03125 at index (1, 4, 2) (up to 0.001 allowed)
Greatest relative difference: 0.006072998046875 at index (2, 3, 1) (up to 0.001 allowed)

cpu output at (1, 4, 2): tensor(6.1875, dtype=torch.bfloat16)
xpu output at (1, 4, 2): tensor(6.1562, device='xpu:0', dtype=torch.bfloat16)
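
As a possible aid for triage: the two values above differ by 0.03125, which matches the spacing between adjacent bfloat16 values in the range [4, 8). The sketch below computes that spacing in plain Python, assuming bfloat16 keeps 8 significant bits (1 implicit plus 7 stored). `bf16_ulp` is a hypothetical helper written for this note, not a PyTorch API, and the XPU value is taken to be 6.15625 (printed as 6.1562).

```python
import math

def bf16_ulp(x: float) -> float:
    """Spacing between adjacent bfloat16 values around a normal, nonzero x.

    Assumes bfloat16 keeps 8 significant bits (1 implicit + 7 stored).
    """
    _, exp = math.frexp(abs(x))   # abs(x) = m * 2**exp with 0.5 <= m < 1
    return 2.0 ** (exp - 8)

cpu_val = 6.1875    # CPU output at (1, 4, 2)
xpu_val = 6.15625   # XPU output at (1, 4, 2), printed as 6.1562

ulp = bf16_ulp(cpu_val)
print(ulp)                           # 0.03125
print(abs(cpu_val - xpu_val) / ulp)  # 1.0
```

If this reading is right, the two results are neighbouring bfloat16 values, and the 0.001 tolerance is tighter than bfloat16 can resolve at this magnitude, so any rounding difference at all fails the comparison.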

Complex128:

PYTORCH_TEST_WITH_SLOW=1 python test/xpu/extended/test_ops_xpu.py TestCommonXPU.test_compare_cpu_logcumsumexp_xpu_complex128

Mismatched elements: 2 / 125 (1.6%)
Greatest absolute difference: 12.566370614359174 at index (3, 3, 0) (up to 0.001 allowed)
Greatest relative difference: 1.5103243157406059 at index (3, 4, 0) (up to 0.001 allowed)

cpu output at (3, 3, 0): tensor(7.4356+3.7336j, dtype=torch.complex128)
xpu output at (3, 3, 0): tensor(7.4356-8.8328j, device='xpu:0', dtype=torch.complex128)
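
One observation that may help here: the real parts agree, and the reported greatest absolute difference, 12.566370614359174, is 4π to within rounding. Since logcumsumexp returns a complex logarithm, imaginary parts that differ by a multiple of 2π exponentiate to the same value. The sketch below checks this in plain Python using the values as printed above (4 decimal places, so the agreement is only approximate). It is an illustration, not a statement about which branch either kernel is expected to return.

```python
import cmath
import math

cpu_out = complex(7.4356, 3.7336)    # CPU output at (3, 3, 0), as printed
xpu_out = complex(7.4356, -8.8328)   # XPU output at (3, 3, 0), as printed

# Gap between the imaginary parts, measured in units of 2*pi
imag_gap = cpu_out.imag - xpu_out.imag
print(imag_gap / (2 * math.pi))      # close to 2, limited by the 4-digit printout

# exp() is 2*pi-periodic in the imaginary part, so both logs map back
# to nearly the same cumulative sum
rel_err = abs(cmath.exp(cpu_out) - cmath.exp(xpu_out)) / abs(cmath.exp(cpu_out))
print(rel_err)
```

Whether the test should treat such results as equal, or whether the XPU kernel should normalise the imaginary part to match the CPU, is not settled by this check.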

Complex64:

test_reductions_xpu.py::TestReductionsXPU::test_logcumsumexp_complex_xpu_complex64 

Mismatched elements: 1 / 3 (33.3%)
Greatest absolute difference: nan at index (2,) (up to 1e-05 allowed)
Greatest relative difference: nan at index (2,) (up to 1.3e-06 allowed)

input : [1e3 + 0j, 1e-18 + 1e4j, 1e2 + 1e-8j] 
cpu_output :  [1000.+0.j, 1000.+0.j, 1000.+0.j]
cuda_output : [1000.+0.j, 1000.+0.j, 1000.+0.j]
xpu_output : [1000.+0.j, 1000.+0.j, nan + nanj]

For complex64, I found that the NaN is caused by the accumulation order: in this case our XPU scan kernel first reduces input[1] and input[2], then reduces input[0] and input[2]. However, even the CPU kernel outputs nan + nanj when logcumsumexp is computed directly on input[1] and input[2].
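
To make the two accumulation orders concrete, here is a minimal sketch in plain Python. It uses double precision and a simple max-shifted formula, so it is not the PyTorch CPU kernel or the XPU scan kernel, and it does not reproduce the NaN. `logaddexp_complex` is a hypothetical helper, and reading "then reduce input[0], input[2]" as folding input[0] into the partial result held at index 2 is an assumption.

```python
import cmath

def logaddexp_complex(a: complex, b: complex) -> complex:
    """log(exp(a) + exp(b)), shifted by the term with the larger real part."""
    hi, lo = (a, b) if a.real >= b.real else (b, a)
    return hi + cmath.log(1 + cmath.exp(lo - hi))

x = [1e3 + 0j, 1e-18 + 1e4j, 1e2 + 1e-8j]   # input from the report

# Sequential order: combine x[0] and x[1], then fold in x[2]
sequential = logaddexp_complex(logaddexp_complex(x[0], x[1]), x[2])

# Order described for the XPU scan: x[1] and x[2] first, then x[0]
pair_first = logaddexp_complex(x[0], logaddexp_complex(x[1], x[2]))

print(sequential, pair_first)
```

In this double-precision sketch both orders give the last element that CPU and CUDA report, 1000+0j. That is consistent with the NaN coming from how the intermediate result for input[1] and input[2] is computed in complex64, rather than from the mathematics of reordering.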

Versions

Related PR: #931


LuFinch linked a pull request Oct 23, 2024 that will close this issue

Add aten::cummax, aten::cummin, aten::logcumsumexp #931

Open
