diff-Taichi error in complicated loop situations #7771

Open
liamxu1 opened this issue Apr 10, 2023 · 4 comments
Labels: question (Question on using Taichi)

@liamxu1
liamxu1 commented Apr 10, 2023

I am trying to differentiate a complicated calculation with diff-Taichi. The calculation includes nested range-for loops in which the inner loop range depends on the outer loop index, and it can also contain many conditional statements.

I reproduced the error with the following simplified code:

import taichi as ti
ti.init(arch=ti.cpu, debug=True)

NUM = 5
x = ti.field(dtype=float)
y = ti.field(dtype=float)
ti.root.dense(ti.i, NUM).place(x, y)
loss = ti.field(dtype=float, shape=())
ti.root.lazy_grad()

@ti.kernel
def initialize():
    for i in x:
        x[i] = i
        y[i] = 0
    loss[None] = 0

@ti.kernel
def compute_loss():
    for i in range(NUM):
        l = 0.0
        for j in range(i):
            for k in range(j):
                l += x[j] * x[k]
        y[i] = l
            
    for i in range(NUM):
        loss[None] += y[i]

initialize()
with ti.ad.Tape(loss=loss, validation=True):
    compute_loss()

On both the CUDA and x64 backends, the code fails with the following output:

[Taichi] version 1.5.0, llvm 15.0.1, commit 7b885c28, win, python 3.9.16
[Taichi] Starting on arch=x64
Instruction does not dominate all uses!
  %35 = load i32, ptr %9, align 4
  %125 = sub i32 %35, 1
[E 04/10/23 13:52:33.367 29352] [codegen_llvm.cpp:taichi::lang::FunctionCreationGuard::~FunctionCreationGuard@76] Assertion failure: !llvm::verifyFunction(*body, &llvm::errs())


***********************************
* Taichi Compiler Stack Traceback *
***********************************
0x7ffd7f4cc7f9: taichi::Time::get_cycles in taichi_python.cp39-win_amd64.pyd
0x7ffd7fe22969: taichi::Logger::error in taichi_python.cp39-win_amd64.pyd
0x7ffd7f5fd00a: taichi::lang::directx12::make_aot_module in taichi_python.cp39-win_amd64.pyd
0x7ffd7f5a143f: taichi::Time::get_cycles in taichi_python.cp39-win_amd64.pyd
0x7ffd7f5a24a3: taichi::Time::get_cycles in taichi_python.cp39-win_amd64.pyd
0x7ffd7f61a2be: taichi::lang::directx12::make_aot_module in taichi_python.cp39-win_amd64.pyd
0x7ffd7f61a3c5: taichi::lang::directx12::make_aot_module in taichi_python.cp39-win_amd64.pyd
0x7ffd7f5a35d0: taichi::Time::get_cycles in taichi_python.cp39-win_amd64.pyd
0x7ffd7f2a130b: taichi::lang::aot::Module::get_snode_tree in taichi_python.cp39-win_amd64.pyd
0x7ffd7f3a3a49: taichi::lang::Ndarray::write_float in taichi_python.cp39-win_amd64.pyd
0x7ffd7f3a53e1: taichi::lang::Ndarray::write_float in taichi_python.cp39-win_amd64.pyd
0x7ffe1b869363: recalloc in ucrtbase.dll
0x7ffe1c0e26bd: BaseThreadInitThunk in KERNEL32.DLL
0x7ffe1dc2a9f8: RtlUserThreadStart in ntdll.dll

Internal error occurred. Check out this page for possible solutions:
https://docs.taichi-lang.org/docs/install
[E 04/10/23 13:52:33.435 29352] Received signal 22 (SIGABRT)


***********************************
* Taichi Compiler Stack Traceback *
***********************************
0x7ffd7f4cc7f9: ?get_cycles@Time@taichi@@SA_KXZ in taichi_python.cp39-win_amd64.pyd
0x7ffd7fe22969: ?error@Logger@taichi@@QEAAXAEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@_N@Z in taichi_python.cp39-win_amd64.pyd
0x7ffd7f4acea1: ?readback_data@Device@lang@taichi@@UEAA?AW4RhiResult@23@PEAUDevicePtr@23@PEAPEAXPEA_KHAEBV?$vector@V?$shared_ptr@VStreamSemaphoreObject@lang@taichi@@@std@@V?$allocator@V?$shared_ptr@VStreamSemaphoreObject@lang@taichi@@@std@@@2@@std@@@Z in taichi_python.cp39-win_amd64.pyd
0x7ffe1b8be5f5: raise in ucrtbase.dll
0x7ffe1b8bf601: abort in ucrtbase.dll
0x7ffe1b8bed39: terminate in ucrtbase.dll
0x7ffe0e511abf: __NLG_Return2 in VCRUNTIME140_1.dll
0x7ffe0e51232b: __NLG_Return2 in VCRUNTIME140_1.dll
0x7ffe0e5140e9: __CxxFrameHandler4 in VCRUNTIME140_1.dll
0x7ffd7ff0de10: ?from_fs_dir@VirtualDir@io@taichi@@SA?AV?$unique_ptr@UVirtualDir@io@taichi@@U?$default_delete@UVirtualDir@io@taichi@@@std@@@std@@AEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@5@@Z in taichi_python.cp39-win_amd64.pyd
0x7ffe1dc73cff: __chkstk in ntdll.dll
0x7ffe1dbee456: RtlFindCharInUnicodeString in ntdll.dll
0x7ffe1dc24455: RtlRaiseException in ntdll.dll
0x7ffe1b52fe7c: RaiseException in KERNELBASE.dll
0x7ffdf40a6220: _CxxThrowException in VCRUNTIME140.dll
0x7ffd7fe229ab: ?error@Logger@taichi@@QEAAXAEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@_N@Z in taichi_python.cp39-win_amd64.pyd
0x7ffd7f5fd00a: ?make_aot_module@directx12@lang@taichi@@YA?AV?$unique_ptr@VModule@aot@lang@taichi@@U?$default_delete@VModule@aot@lang@taichi@@@std@@@std@@Vany@5@W4Arch@3@@Z in taichi_python.cp39-win_amd64.pyd
0x7ffd7f5a143f: ?get_cycles@Time@taichi@@SA_KXZ in taichi_python.cp39-win_amd64.pyd
0x7ffd7f5a24a3: ?get_cycles@Time@taichi@@SA_KXZ in taichi_python.cp39-win_amd64.pyd
0x7ffd7f61a2be: ?make_aot_module@directx12@lang@taichi@@YA?AV?$unique_ptr@VModule@aot@lang@taichi@@U?$default_delete@VModule@aot@lang@taichi@@@std@@@std@@Vany@5@W4Arch@3@@Z in taichi_python.cp39-win_amd64.pyd
0x7ffd7f61a3c5: ?make_aot_module@directx12@lang@taichi@@YA?AV?$unique_ptr@VModule@aot@lang@taichi@@U?$default_delete@VModule@aot@lang@taichi@@@std@@@std@@Vany@5@W4Arch@3@@Z in taichi_python.cp39-win_amd64.pyd
0x7ffd7f5a35d0: ?get_cycles@Time@taichi@@SA_KXZ in taichi_python.cp39-win_amd64.pyd
0x7ffd7f2a130b: ?get_snode_tree@Module@aot@lang@taichi@@QEAAPEAVField@234@AEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@Z in taichi_python.cp39-win_amd64.pyd
0x7ffd7f3a3a49: ?write_float@Ndarray@lang@taichi@@QEAAXAEBV?$vector@HV?$allocator@H@std@@@std@@N@Z in taichi_python.cp39-win_amd64.pyd
0x7ffd7f3a53e1: ?write_float@Ndarray@lang@taichi@@QEAAXAEBV?$vector@HV?$allocator@H@std@@@std@@N@Z in taichi_python.cp39-win_amd64.pyd
0x7ffe1b869363: _recalloc in ucrtbase.dll
0x7ffe1c0e26bd: BaseThreadInitThunk in KERNEL32.DLL
0x7ffe1dc2a9f8: RtlUserThreadStart in ntdll.dll

Internal error occurred. Check out this page for possible solutions:
https://docs.taichi-lang.org/docs/install

However, if I change the code to

@ti.kernel
def compute_loss():
    for i in range(NUM):
        for j in range(i):
            for k in range(j):
                y[i] += x[j] * x[k]
            
    for i in range(NUM):
        loss[None] += y[i]

the code just works fine.
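
For anyone who wants to check the gradients that the working variant produces, here is a minimal pure-Python sketch of the same loss together with a central-difference gradient. It does not use Taichi at all; the helper names loss_fn and reference_grad are made up for illustration. Because the loss is quadratic in x, central differences are exact up to rounding error.

```python
# Plain-Python reference for the nested-loop loss in compute_loss().
# loss_fn and reference_grad are hypothetical helper names, not Taichi APIs.

def loss_fn(x):
    """Sum of x[j] * x[k] over all k < j < i, for i in range(len(x))."""
    total = 0.0
    for i in range(len(x)):
        for j in range(i):
            for k in range(j):
                total += x[j] * x[k]
    return total


def reference_grad(x, eps=1e-3):
    """Central-difference gradient of loss_fn at x."""
    grad = []
    for m in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[m] += eps
        lo[m] -= eps
        grad.append((loss_fn(hi) - loss_fn(lo)) / (2 * eps))
    return grad


# Same initial values as initialize(): x[i] = i, with NUM = 5.
x0 = [float(i) for i in range(5)]
ref_loss = loss_fn(x0)
ref_grad = [round(g, 6) for g in reference_grad(x0)]
print(ref_loss)
print(ref_grad)
```

With x = [0, 1, 2, 3, 4] this gives a loss of 13 and a gradient of [10, 7, 5, 3, 0], which can be compared against x.grad after the Tape block finishes.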

What is the suggested way to write such loops? Am I violating the rule "Kernel body must only consist of either multiple for-loops or non-for statements", or is it something else? In addition, I noticed that the code may also fail if I use "for ... continue ...". Are there any other rules I should know about when using for statements and if statements with Taichi autodiff?

For example, the code below fails:

import taichi as ti
ti.init(arch=ti.cpu, debug=True)

NUM = 5
x = ti.field(dtype=float)
ti.root.dense(ti.i, NUM).place(x)
loss = ti.field(dtype=float, shape=())
ti.root.lazy_grad()

@ti.kernel
def initialize():
    for i in x:
        x[i] = i
    loss[None] = 0

@ti.kernel
def compute_loss():
    for i in range(NUM):
        if i > NUM / 2:
            continue
        loss[None] += x[i]

initialize()
with ti.ad.Tape(loss=loss, validation=True):
    compute_loss()

Looking forward to your reply. Many thanks.

@erizmr
Contributor

erizmr commented Apr 10, 2023

Hi @liamxu1, thanks for reporting this. The main issue you mentioned here is confirmed as a bug. A fix has been submitted in PR #7778.

The "for ... continue" and "while" statements are currently not supported in Taichi autodiff.

Although control flow can be handled, purely nested loops, as in your modified version, are friendlier to the autodiff system.

@liamxu1
Author

liamxu1 commented Apr 10, 2023

Thanks for the fast reply. I will give it a try.

@liamxu1
Author

liamxu1 commented Apr 11, 2023

Excuse me @erizmr, is there a way I can try the fixed version? I cloned the code from your branch and followed https://docs.taichi-lang.org/docs/dev_install to compile it, but the build failed with a DLL loading error.

@erizmr
Contributor

erizmr commented Apr 13, 2023

Hi @liamxu1, could you please share more information about the error, e.g., screenshots?

@lin-hitonami moved this from Untriaged to Todo in Taichi Lang on Apr 14, 2023