BFloat16
PYTORCH_TEST_WITH_SLOW=1 python test/xpu/extended/test_ops_xpu.py TestCommonXPU.test_compare_cpu_logcumsumexp_xpu_bfloat16
Mismatched elements: 2 / 125 (1.6%)
Greatest absolute difference: 0.03125 at index (1, 4, 2) (up to 0.001 allowed)
Greatest relative difference: 0.006072998046875 at index (2, 3, 1) (up to 0.001 allowed)
cpu output at (1, 4, 2): tensor(6.1875, dtype=torch.bfloat16)
xpu output at (1, 4, 2): tensor(6.1562, device='xpu:0', dtype=torch.bfloat16)
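A quick check on the bfloat16 numbers above, as a minimal sketch: the reported absolute difference of 0.03125 appears to equal one bfloat16 ULP for values in [4, 8). The helper `bf16_ulp` is hypothetical (not a PyTorch function), and the exact XPU value 6.15625 is an assumption inferred from the printed 6.1562 and the reported difference.

```python
import math

def bf16_ulp(x: float) -> float:
    """Spacing between adjacent bfloat16 values near x (normal range only)."""
    # bfloat16 stores 7 explicit mantissa bits, so the ULP is 2**(exponent - 7).
    exponent = math.floor(math.log2(abs(x)))
    return 2.0 ** (exponent - 7)

cpu_value = 6.1875    # CPU output from the failure log
xpu_value = 6.15625   # assumed exact value behind the printed 6.1562

abs_diff = cpu_value - xpu_value
print(abs_diff, bf16_ulp(cpu_value))
```

If that assumption holds, the two outputs at index (1, 4, 2) are adjacent bfloat16 values, which points to a last-bit rounding difference rather than a larger numerical error.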
Complex128
PYTORCH_TEST_WITH_SLOW=1 python test/xpu/extended/test_ops_xpu.py TestCommonXPU.test_compare_cpu_logcumsumexp_xpu_complex128
Mismatched elements: 2 / 125 (1.6%)
Greatest absolute difference: 12.566370614359174 at index (3, 3, 0) (up to 0.001 allowed)
Greatest relative difference: 1.5103243157406059 at index (3, 4, 0) (up to 0.001 allowed)
cpu output at (3, 3, 0): tensor(7.4356+3.7336j, dtype=torch.complex128)
xpu output at (3, 3, 0): tensor(7.4356-8.8328j, device='xpu:0', dtype=torch.complex128)
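One observation on the complex128 numbers above, offered as a sketch rather than a diagnosis: the real parts match, and the imaginary parts differ by roughly 4π, so the two results may represent the same value on different branches of the complex logarithm. The snippet only uses the 4-decimal values printed in the log, so the comparison is approximate.

```python
import cmath
import math

cpu_value = complex(7.4356, 3.7336)    # printed CPU result (4 decimals)
xpu_value = complex(7.4356, -8.8328)   # printed XPU result (4 decimals)

# Gap between the imaginary parts, expressed in full 2*pi turns.
imag_gap = cpu_value.imag - xpu_value.imag
turns = imag_gap / (2 * math.pi)

# exp() is 2*pi-periodic in the imaginary part, so compare in exp space.
exp_cpu = cmath.exp(cpu_value)
exp_xpu = cmath.exp(xpu_value)
rel_gap = abs(exp_cpu - exp_xpu) / abs(exp_cpu)

# The reported greatest absolute difference, compared against 4*pi.
reported_abs_diff = 12.566370614359174
print(turns, rel_gap, math.isclose(reported_abs_diff, 4 * math.pi, rel_tol=1e-12))
```

If this reading is right, an element-wise comparison of the raw outputs would flag the mismatch even though `exp` of both results agrees to within the printed precision.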
Complex64
test_reductions_xpu.py::TestReductionsXPU::test_logcumsumexp_complex_xpu_complex64
Mismatched elements: 1 / 3 (33.3%)
Greatest absolute difference: nan at index (2,) (up to 1e-05 allowed)
Greatest relative difference: nan at index (2,) (up to 1.3e-06 allowed)
input : [1e3 + 0j, 1e-18 + 1e4j, 1e2 + 1e-8j]
cpu_output : [1000.+0.j, 1000.+0.j, 1000.+0.j]
cuda_output : [1000.+0.j, 1000.+0.j, 1000.+0.j]
xpu_output : [1000.+0.j, 1000.+0.j, nan + nanj]
🐛 Describe the bug
For complex64, I found that the NaN is caused by the accumulation order: in this case our XPU scan kernel first reduces input[1] and input[2], then reduces input[0] and input[2]. However, even the CPU kernel outputs nan+nanj when logcumsumexp is computed directly on (input[1], input[2]).
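To make the two accumulation orders concrete, here is a minimal pure-Python sketch. `log_add_exp` is a hypothetical helper, not the PyTorch or XPU kernel code, and the second grouping reflects one reading of the description above (the partial result for input[1] and input[2] is then combined with input[0]). It runs in double precision via `cmath`, so it does not reproduce the complex64 NaN; it only shows which pairs are combined in each order.

```python
import cmath

def log_add_exp(a: complex, b: complex) -> complex:
    """log(exp(a) + exp(b)), factoring out the term with the larger real part."""
    hi, lo = (a, b) if a.real >= b.real else (b, a)
    return hi + cmath.log(1 + cmath.exp(lo - hi))

x = [1e3 + 0j, 1e-18 + 1e4j, 1e2 + 1e-8j]  # input from the report

# Left-to-right order: (x[0] with x[1]) first, then the result with x[2].
sequential = log_add_exp(log_add_exp(x[0], x[1]), x[2])

# Order described for the XPU scan: x[1] with x[2] first,
# then x[0] with that partial result.
partial = log_add_exp(x[1], x[2])
regrouped = log_add_exp(x[0], partial)

print(sequential, partial, regrouped)
```

In this sketch both orders end at approximately 1000+0j and the intermediate `partial` stays finite, which suggests that the NaN depends on how the complex64 kernels evaluate the (input[1], input[2]) pair rather than on the regrouping alone.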
Versions
Related PR: #931