v1.6.0

@github-actions github-actions released this 12 May 03:19
· 381 commits to master since this release

Deprecation Notice

  • We removed some APIs that were deprecated a long time ago. See the table below:
| Removed API | Replace with |
| --- | --- |
| Using atomic operations like a.atomic_add(b) | ti.atomic_add(a, b) or a += b |
| Using is and is not inside Taichi kernel and Taichi function | Not supported |
| Ndrange for loop with the number of the loop variables not equal to the dimension of the ndrange | Not supported |
| ti.ui.make_camera() | ti.ui.Camera() |
| ti.ui.Window.write_image() | ti.ui.Window.save_image() |
| ti.SOA | ti.Layout.SOA |
| ti.AOS | ti.Layout.AOS |
| ti.print_profile_info | ti.profiler.print_scoped_profiler_info |
| ti.clear_profile_info | ti.profiler.clear_scoped_profiler_info |
| ti.print_memory_profile_info | ti.profiler.print_memory_profiler_info |
| ti.CuptiMetric | ti.profiler.CuptiMetric |
| ti.get_predefined_cupti_metrics | ti.profiler.get_predefined_cupti_metrics |
| ti.print_kernel_profile_info | ti.profiler.print_kernel_profiler_info |
| ti.query_kernel_profile_info | ti.profiler.query_kernel_profiler_info |
| ti.clear_kernel_profile_info | ti.profiler.clear_kernel_profiler_info |
| ti.kernel_profiler_total_time | ti.profiler.get_kernel_profiler_total_time |
| ti.set_kernel_profiler_toolkit | ti.profiler.set_kernel_profiler_toolkit |
| ti.set_kernel_profile_metrics | ti.profiler.set_kernel_profiler_metrics |
| ti.collect_kernel_profile_metrics | ti.profiler.collect_kernel_profiler_metrics |
| ti.VideoManager | ti.tools.VideoManager |
| ti.PLYWriter | ti.tools.PLYWriter |
| ti.imread | ti.tools.imread |
| ti.imresize | ti.tools.imresize |
| ti.imshow | ti.tools.imshow |
| ti.imwrite | ti.tools.imwrite |
| ti.ext_arr | ti.types.ndarray |
| ti.any_arr | ti.types.ndarray |
| ti.Tape | ti.ad.Tape |
| ti.clear_all_gradients | ti.ad.clear_all_gradients |
| ti.linalg.sparse_matrix_builder | ti.types.sparse_matrix_builder |
  • The builtin min/max functions are no longer deprecated in Taichi kernels.
  • Some arguments in the argument declarations of the compute graph are deprecated and will be removed in v1.7.0, including:
    • element_shape argument for scalar and ndarray
    • shape, channel_format and num_channels arguments for texture
  • The cc backend will be removed in the next release (v1.7.0).
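
One way to locate call sites affected by the removals above is to scan source text for the old names. The sketch below is a minimal illustration, not an official Taichi tool: the `RENAMES` mapping copies only a few rows from the table, and the helper name `find_removed_apis` and the `sample` text are hypothetical.

```python
import re

# A few rows copied from the removal table (old name -> replacement).
RENAMES = {
    "ti.ui.make_camera": "ti.ui.Camera",
    "ti.SOA": "ti.Layout.SOA",
    "ti.AOS": "ti.Layout.AOS",
    "ti.imread": "ti.tools.imread",
    "ti.imwrite": "ti.tools.imwrite",
    "ti.ext_arr": "ti.types.ndarray",
    "ti.any_arr": "ti.types.ndarray",
    "ti.Tape": "ti.ad.Tape",
    "ti.clear_all_gradients": "ti.ad.clear_all_gradients",
}

# Match an old name only when it is not part of a longer identifier,
# so "multi.Tape" or "ti.Tape2" is not reported.
_PATTERN = re.compile(
    r"(?<![\w.])("
    + "|".join(re.escape(name) for name in sorted(RENAMES, key=len, reverse=True))
    + r")(?!\w)"
)


def find_removed_apis(source):
    """Return (line_number, old_name, replacement) for each hit in source."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, RENAMES[old]))
    return hits


# Hypothetical input: two removed names and one already-migrated name.
sample = """\
import taichi as ti
img = ti.imread("in.png")
with ti.Tape(loss):
    step()
with ti.ad.Tape(loss):
    step()
"""

for lineno, old, new in find_removed_apis(sample):
    print(f"line {lineno}: {old} -> {new}")
```

The scan is purely textual, so it also reports names inside comments or strings and does not cover the behavioural removals (a.atomic_add(b), is / is not, mismatched ndrange loops); treat the hits as candidates to review by hand.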

New features

Struct arguments

You can now use struct arguments in all backends. Structs can be nested and can contain matrices and vectors. Here's an example:

import taichi as ti

ti.init(arch=ti.cpu)  # CPU is an arbitrary choice here; struct arguments work in all backends

transform_type = ti.types.struct(R=ti.math.mat3, T=ti.math.vec3)
pos_type = ti.types.struct(x=ti.math.vec3, trans=transform_type)

@ti.kernel
def kernel_with_nested_struct_arg(p: pos_type) -> ti.math.vec3:
    return p.trans.R @ p.x + p.trans.T

trans = transform_type(ti.math.mat3(1), [1, 1, 1])  # mat3(1) fills every entry with 1
p = pos_type(x=[1, 1, 1], trans=trans)
print(kernel_with_nested_struct_arg(p))  # [4., 4., 4.]

Ndarray

  • Support reading and writing 0-dim ndarrays in Python scope
  • Fixed a bug when writing into ndarray from Python scope

Improvements

  • Support rsqrt operator in autodiff
  • Added assembly printer for CPU backend (by Zhanlue Yang)
  • Support CUDA shared array allocation over 48 KiB

Performance

  • Improved vectorization support on CPU backend, with significant performance gains for specific applications

New Examples

  • 2D Euler fluid simulation example (by Lee-abcde)

Misc

  • Python 3.11 support
  • ti.frexp is supported on the CUDA, Vulkan, Metal, and OpenGL backends.
  • Added the ti.math.popcnt intrinsic (by Garry Ling)
  • Fixed a memory leak during SNodeTree destruction (by Zhanlue Yang)
  • Added validation and improved error reporting for ti.Field finalization (by Zhanlue Yang)
  • Fixed a memory leak with the CUDA backend in C-API (by Zhanlue Yang)
  • Added support for formatted printing with str.format() and f-strings (by Tianyi Liu)
  • Changed Python code formatter from yapf to black

Developer Experience

  • Added the build.py script for preparing the build and testing environment

Full changelog

Highlights:

  • Bug fixes
    • Fix wrong datatype size when writing to ndarray from Python scope (by Ailing Zhang)
  • CUDA backend
    • Warn driver version if it doesn't support memory pool. (#7912) (by Haidong Lan)
    • Better handling shared array shape check (#7818) (by Haidong Lan)
    • Support large shared memory for CUDA backend (#7452) (by Haidong Lan)
  • Documentation
    • Add doc about struct arguments (#7959) (by Lin Jiang)
    • Fix docstring of mix function (#7922) (by Zhao Liang)
    • Update faq and ggui, and add them to CI (#7861) (by Zhao Liang)
    • Update doc for dynamic snode (#7804) (by Zhao Liang)
    • Update field.md (#7819) (by zhoooou)
    • Update readme (#7808) (by yanqingzhang)
    • Update write_test.md (#7745) (by Qian Bao)
    • Update performance.md (#7720) (by Zhao Liang)
    • Update readme (#7673) (by Zhao Liang)
    • Update tutorial.md (#7512) (by Chenzhan Shang)
    • Update gui_system.md (#7628) (by Qian Bao)
    • Remove deprecated api docstrings (#7596) (by pengyu)
    • Fix the cexp docstring (#7588) (by Zhao Liang)
    • Add doc about returning struct (#7556) (by Lin Jiang)
  • Error messages
    • Update deprecation warning of the graph arguments (#7965) (by Lin Jiang)
  • Language and syntax
    • Remove deprecated funcs in init.py (#7941) (by Lin Jiang)
    • Remove deprecated sparse_matrix_builder function (#7942) (by Lin Jiang)
    • Remove deprecated funcs in ti.ui (#7940) (by Lin Jiang)
    • Remove the support for 'is' (#7930) (by Lin Jiang)
    • Raise error when the dimension of the ndrange does not equal to the number of the loop variable (#7933) (by Lin Jiang)
    • Remove a.atomic(b) (#7925) (by Lin Jiang)
    • Cancel deprecating native min/max (#7928) (by Lin Jiang)
    • Let nested data classes have methods (#7909) (by Lin Jiang)
    • Let kernel argument support matrix nested in a struct (by lin-hitonami)
    • Support the functions of dataclass as kernel argument and return value (#7865) (by Lin Jiang)
    • Fix a bug on PosixPath (#7860) (by Zhao Liang)
    • Separate out the scalarization for MatrixOfMatrixPtrStmt and MatrixOfGlobalPtrStmt (#7803) (by Zhanlue Yang)
    • Fix pylance warning (#7805) (by Zhao Liang)
    • Support taking structs as kernel arguments (by lin-hitonami)
    • Fix math module circular import bugs (#7762) (by Zhao Liang)
    • Support formatted printing in str.format() and f-strings (#7686) (by 魔法少女赵志辉)
    • Replace internal representation of Python-scope ti.Matrix with numpy arrays (#7559) (by Yi Xu)
    • Stop letting ti.Struct inherit from TaichiOperations (#7474) (by Yi Xu)
    • Support writing sparse matrix as matrix market file (#7529) (by pengyu)
  • Vulkan backend
    • Fix repeated generation of array ranges in spirv codegen. (#7625) (by Haidong Lan)

Full changelog:

  • [CUDA] Warn driver version if it doesn't support memory pool. (#7912) (by Haidong Lan)
  • [Doc] Add doc about struct arguments (#7959) (by Lin Jiang)
  • [Error] Update deprecation warning of the graph arguments (#7965) (by Lin Jiang)
  • [windows] Workaround C++ mangling special chars (#7964) (by Ailing)
  • [Lang] Remove deprecated funcs in init.py (#7941) (by Lin Jiang)
  • [build] Remove redundant C-API shared object in wheel (#7950) (by Proton)
  • [test] Do not test cc backend (by Proton)
  • [Lang] Remove deprecated sparse_matrix_builder function (#7942) (by Lin Jiang)
  • [Lang] Remove deprecated funcs in ti.ui (#7940) (by Lin Jiang)
  • [Lang] Remove the support for 'is' (#7930) (by Lin Jiang)
  • [Lang] Raise error when the dimension of the ndrange does not equal to the number of the loop variable (#7933) (by Lin Jiang)
  • [Lang] Remove a.atomic(b) (#7925) (by Lin Jiang)
  • [Lang] Cancel deprecating native min/max (#7928) (by Lin Jiang)
  • [Doc] Fix docstring of mix function (#7922) (by Zhao Liang)
  • [example] Fix ti example bugs (#7903) (by Zhao Liang)
  • [ci] Build.py: Source generated env in new spawned shell (by Proton)
  • [misc] Fix changelog commit extract code (by Proton)
  • [ci] More robust build.py bootstrapping (#7920) (by Proton)
  • [Lang] [bug] Let nested data classes have methods (#7909) (by Lin Jiang)
  • [cuda] Only set CU_LIMIT_STACK_SIZE when necessary (#7906) (by Ailing)
  • [Lang] Let kernel argument support matrix nested in a struct (by lin-hitonami)
  • [Bug] Fix wrong datatype size when writing to ndarray from Python scope (by Ailing Zhang)
  • [lang] Support 0 dim ndarray read & write in python scope (by Ailing Zhang)
  • [Lang] Support the functions of dataclass as kernel argument and return value (#7865) (by Lin Jiang)
  • [spirv] Support struct as kernel argument (by Lin Jiang)
  • [spirv] Fix the ret type of frexp (by lin-hitonami)
  • [ci] Build.py: Do not try to bootstrap pip (too many issues) (#7897) (by Proton)
  • [ci] Build.py quirks fix (#7894) (by Proton)
  • [Doc] Update faq and ggui, and add them to CI (#7861) (by Zhao Liang)
  • [build] Remove unused apt pkg 'libmirclient-dev' to make 'build.py' run properly on ubuntu 22.04 (#7871) (by Yu Zhang)
  • [Lang] Fix a bug on PosixPath (#7860) (by Zhao Liang)
  • [ci] Polishing build.py, wave 4 (#7857) (by Proton)
  • [build] Use LLVM without zstd dependency on M1 Macs (#7856) (by Proton)
  • [doc] Update dev_install.md to reflect build.py usage (#7848) (by Proton)
  • [ci] Polishing build.py, wave 3 (#7845) (by Proton)
  • [lang] Add popcnt to llvm intrinsic support (#7772) (by Garry Ling)
  • [Doc] Update doc for dynamic snode (#7804) (by Zhao Liang)
  • [ci] Fix release build failure (#7834) (by Proton)
  • [ci] More robust build.py bootstrapping (#7833) (by Proton)
  • [Doc] Update field.md (#7819) (by zhoooou)
  • [autodiff] Remove redundant autodiff mode in kernel name (#7829) (by Ailing)
  • [lang] Migrate Caching Allocation logics from CudaDevice/AmdgpuDevice to DeviceMemoryPool (#7793) (by Zhanlue Yang)
  • [misc] Resolve code formatter frictions (#7828) (by Proton)
  • [Lang] Separate out the scalarization for MatrixOfMatrixPtrStmt and MatrixOfGlobalPtrStmt (#7803) (by Zhanlue Yang)
  • [bug] Fix imgui_context in destroying multiple GGUI windows (#7812) (by Ailing)
  • [misc] Update git-blame-ignore-revs (#7825) (by Proton)
  • [ci] Complete doc test list, remove redundant default prelude (#7823) (by Proton)
  • [misc] Relax Black formatter line length limit to 120 (#7824) (by Proton)
  • [Doc] Update readme (#7808) (by yanqingzhang)
  • [misc] Switch code formatter from yapf to black (#7785) (by Proton)
  • [CUDA] Better handling shared array shape check (#7818) (by Haidong Lan)
  • [misc] Improve ::liong::json::deserialize() (by PGZXB)
  • [bug] Fix gen_offline_cache_key (#7810) (by PGZXB)
  • [ci] Fix build.py ensurepip (#7811) (by Proton)
  • [Lang] Fix pylance warning (#7805) (by Zhao Liang)
  • [lang] Support frexp on spirv-based backends (#7770) (by Ailing)
  • [lang] Split MemoryPool into DeviceMemoryPool and HostMemoryPool (#7786) (by Zhanlue Yang)
  • [misc] Optimize import overhead: pytorch and get_clangpp (#7797) (by Haidong Lan)
  • [ci] [doc] Tighten up document testing (#7801) (by Proton)
  • [ci] Polishing build.py, wave 2 (#7800) (by Proton)
  • [aot] Remove unused AotDataConverter (#7799) (by Lin Jiang)
  • [perf] Fix Taichi CPU backend compile parameter to pair performance with Numba. (#7731) (by zhengxianli)
  • [ci] Polishing build.py (#7794) (by Proton)
  • [bug] Returning nan for ti.sym_eig on identity matrix (#7443) (by Yimin Tang)
  • [Lang] Support taking structs as kernel arguments (by lin-hitonami)
  • [ir] Add 'create_load' to ArgLoadStmt (by lin-hitonami)
  • [ir] Let the src of GetElementStmt be a pointer (by lin-hitonami)
  • [lang] Clean up runtime allocation functions (#7773) (by Zhanlue Yang)
  • [lang] Migrate CUDA preallocation logic to CudaMemoryPool (#7746) (by Zhanlue Yang)
  • [gfx] Fix runtime buffer/image copy barrier semantics (#7781) (by Bob Cao)
  • [misc] Remove unnecessary TaskCodeGenLLVM::task_counter (#7777) (by PGZXB)
  • [ci] Temporarily force Windows release builds to run on sm70 nodes (#7767) (by Proton)
  • [refactor] Remove Kernel::lowered_ (#7765) (by PGZXB)
  • [gui] Fluid visualization utilities (#7682) (by Qian Bao)
  • [Lang] Fix math module circular import bugs (#7762) (by Zhao Liang)
  • [misc] Make pre-commit happy (#7768) (by Proton)
  • [ci] Build iOS AOT static library (by Proton)
  • [misc] Wrap path with std::filesystem::path (#7754) (by Bob Cao)
  • [lang] Support vector and matrix dtypes in ti.field (#7761) (by Ailing)
  • [ir] Remove unnecessary field_dims_ in ArgLoadStmt (#7755) (by Ailing)
  • [refactor] Remove Kernel::task_counter_ (#7751) (by PGZXB)
  • [ci] Build.py: Introduce TAICHI_CMAKE_ARGS manager for better log readability (by Proton)
  • [ci] Reorganize build.py code (by Proton)
  • [refactor] Let KernelCompilationManager manage kernel compilation in gfx::AotModuleBuilderImpl (#7715) (by PGZXB)
  • [misc] Remove unused FullSimplifyPass::Args::program (#7750) (by PGZXB)
  • [refactor] Re-impl LlvmAotModule using LLVM::KernelLauncher (#7744) (by PGZXB)
  • [lang] Implement experimental CG(Conjugate Gradient) solver in Taichi-lang (#7690) (by Qian Bao)
  • [lang] Transform bit_shr to bit_sar for uint (#7757) (by Ailing)
  • [ir] Postpone scalarize and lower_matrix_ptr to after bit loop vectorization (#7726) (by 魔法少女赵志辉)
  • [ci] Isolate post sm70 tests (#7740) (by Proton)
  • [cuda] Support using SparseMatrix on more CUDA versions (#7724) (by Yu Zhang)
  • [cuda] Update the data layout of CUDA (#7748) (by Lin Jiang)
  • [ci] Ignore dup benchmark data points (#7749) (by Proton)
  • [bug] Fix reduction of atomic max (#7747) (by Lin Jiang)
  • [Doc] Update write_test.md (#7745) (by Qian Bao)
  • [refactor] Remove 'args' from 'RuntimeContext' (by lin-hitonami)
  • [gfx] Let gfx backends use LaunchContextBuilder to build arguments in struct type (by lin-hitonami)
  • [gfx] [refactor] Convert f16 in LaunchContextBuilder (by lin-hitonami)
  • [gfx] Record the struct type of arguments and results in KernelContextAttributes (by lin-hitonami)
  • [gfx] Compile struct type of result and arguments in gfx backends (by lin-hitonami)
  • [refactor] Implement CompiledKernelData::check() (#7743) (by PGZXB)
  • [doc] [test] Update docs for printing with f-strings and formatted strings (#7733) (by 魔法少女赵志辉)
  • [lang] Improve error message for mismatched index for ndarrays in python scope (#7737) (by Ailing)
  • [bug] Avoid redundant cache loading (#7741) (by PGZXB)
  • [refactor] Let KernelCompilationManager manage kernel compilation in LlvmAotModuleBuilder (#7714) (by PGZXB)
  • [ci] Skip large shared memory test for Turing GPUs. (#7739) (by Haidong Lan)
  • [cuda] Remove deprecated cusparse functions (#7725) (by Yu Zhang)
  • [misc] Update pull_request_template.md (#7738) (by Ailing)
  • [misc] Remove TI_WARN for cuda in memory_pool.cpp (#7734) (by Ailing)
  • [CUDA] Support large shared memory for CUDA backend (#7452) (by Haidong Lan)
  • [vulkan] Update SPIR-V codegen to emit FP16 consts (#7676) (by Bob Cao)
  • [lang] Support frexp on cuda backend (#7721) (by Ailing)
  • [refactor] Unify implementation of ProgramImpl::compile() (by PGZXB)
  • [refactor] Introduce LLVM::KernelLauncher (by PGZXB)
  • [refactor] Introduce gfx::KernelLauncher (by PGZXB)
  • [test] Enable test offline cache on amdgpu and dx11 (#7703) (by PGZXB)
  • [lang] Refactor ownership and inheritance of allocators (#7685) (by Zhanlue Yang)
  • [ci] Fix git cache quirks (#7722) (by Proton)
  • [lang] Improve error msg in create ndarray (#7709) (by Garry Ling)
  • [Doc] Update performance.md (#7720) (by Zhao Liang)
  • [bug] Switch the gallery image used by README. (#7716) (by Chengchen(Rex) Wang)
  • [lang] Merge AMDGPUCachingAllocator to the generic CachingAllocator (#7717) (by Zhanlue Yang)
  • [bug] Invalid Field cache, RWAccessors cache, and Kernel cache upon SNodeTree destruction (#7704) (by Zhanlue Yang)
  • [ci] [test] Enable cc test on CI (by lin-hitonami)
  • [test] [cc] Skip tests that cc backend doesn't support (by lin-hitonami)
  • [test] Exclude the cc backend from tests that involve dynamic indexing (#7705) (by 魔法少女赵志辉)
  • [bug] Fix camera controls (#7681) (by liblaf)
  • [bug] [cc] Fix comparison op in cc backend (by Lin Jiang)
  • [bug] [cc] Set external ptr for cc backend (by lin-hitonami)
  • [lang] Merged VirtualMemoryAllocator into MemoryPool for LLVM-CPU backend (#7671) (by Zhanlue Yang)
  • [misc] Remove useless JITEvaluatorId (#7700) (by PGZXB)
  • [bug] Fixed building with clang on Windows failed (#7699) (by PGZXB)
  • [Lang] Support formatted printing in str.format() and f-strings (#7686) (by 魔法少女赵志辉)
  • [ci] Git caching proxy in CI (#7692) (by Proton)
  • [build] Let msvc generate pdb for cpp & c_api tests (by lin-hitonami)
  • [refactor] Stop storing pointers to array devallocs in kernel args (by lin-hitonami)
  • [aot] Implement bin2c in AOT cppgen (#7687) (by PENGUINLIONG)
  • [cpu] Remove atomics demotion for single-thread CPU targets. (#7631) (by Haidong Lan)
  • [aot] Export templated kernels (#7683) (by PENGUINLIONG)
  • [ci] Revive /benchmark (#7680) (by Proton)
  • [Doc] Update readme (#7673) (by Zhao Liang)
  • [misc] Device API public headers and CMake rework part 1 (#7624) (by Bob Cao)
  • [misc] Move optimize cpu module to KernelCodeGen (#7667) (by PGZXB)
  • [lang] [ir] Extract and save the format specifiers in str.format() (#7660) (by 魔法少女赵志辉)
  • [example] Add 2D euler fluid simulation example (#7568) (by Lee-abcde)
  • [wasm] Remove WASM backend (by lin-hitonami)
  • [build] Fix ssize_t type undefined errors when building with TI_WITH_LLVM=OFF on windows (#7665) (by Yu Zhang)
  • [misc] Remove unused Kernel::is_evaluator (#7669) (by PGZXB)
  • [misc] Remove unused Program::jit_evaluator_cache and Program::jit_evaluator_cache_mut (#7668) (by PGZXB)
  • [misc] Simplify test_offline_cache.py (#7663) (by PGZXB)
  • [lang] Improve error reporting for FieldsBuilder finalization (#7640) (by Zhanlue Yang)
  • [misc] Rename taichi::lang::llvm to taichi::lang::LLVM (#7659) (by PGZXB)
  • [refactor] Remove MemoryPool daemon in LLVM runtime (#7648) (by Zhanlue Yang)
  • [opt] Cleanup unnecessary options in constant fold pass (#7661) (by Ailing)
  • [ci] Use build.py to prepare testing environment on Windows (#7658) (by Proton)
  • [opt] Move binary jit evaluator to host (by Ailing Zhang)
  • [test] Update C++ constant fold tests to test operator one by one (by Ailing Zhang)
  • [aot] Avoid shared library file being packaged into wheel data (#7652) (by Chenzhan Shang)
  • [ci] Fix scipy install (#7649) (by Proton)
  • [misc] Remove an unnecessary parameter of KernelCompilationManager::make_filename (by PGZXB)
  • [refactor] Remove some unnecessary functions of KernelCodeGen (by PGZXB)
  • [refactor] Re-impl JIT and Offline Cache on LLVM backends (by PGZXB)
  • [refactor] Implement llvm::KernelCompiler (by PGZXB)
  • [refactor] Gen code for KernelCodeGen::ir instead of KernelCodeGen::kernel->ir (by PGZXB)
  • [Doc] Update tutorial.md (#7512) (by Chenzhan Shang)
  • [ci] Test manylinux2014 build on PR (#7647) (by Proton)
  • [bug] Fix logical comparison returns -1 (#7641) (by Ailing)
  • [doc] Fix gui_system.md tests (#7646) (by Proton)
  • [Doc] Update gui_system.md (#7628) (by Qian Bao)
  • [aot] Hand-written CMake target script (#7644) (by PENGUINLIONG)
  • [ci] Do not use Android toolchain for perf testing (#7642) (by Proton)
  • [ci] Support Python 3.11 (#7627) (by Proton)
  • [build] Setup Android SDK environment for performance bot (#7635) (by Zhanlue Yang)
  • [ci] Update perf mon image (#7639) (by Proton)
  • [ci] Fix perf mon break (#7638) (by Proton)
  • [doc] Add documentation on using ghstack (#7632) (by Proton)
  • [build] Static linking libstdc++ on Linux (by Proton)
  • [ci] Rewrite Dockerfiles (by Proton)
  • [ci] Resolve "Needed single revision" workaround failure when the repo directory is empty (#7633) (by Proton)
  • [Vulkan] Fix repeated generation of array ranges in spirv codegen. (#7625) (by Haidong Lan)
  • [build] Switch to use docker with Android-SDK for performance bot (#7630) (by Zhanlue Yang)
  • [opengl] glfw finalize crash fix (by Proton)
  • [ci] build.py: Android support, entering shell, export env (by Proton)
  • [ci] Do not run tests with mixed backends (by Proton)
  • [refactor] Use f16 function from external lib (by lin-hitonami)
  • [refactor] Migrate members from RuntimeContext to LaunchContextBuilder (by lin-hitonami)
  • [bug] Fix setting arguments exceeding the max arg num (by lin-hitonami)
  • [cpu] Explicitly make cpu multithreading loop for range-fors. (#7593) (by Haidong Lan)
  • [aot] Fixed generator for compute graph (#7626) (by PENGUINLIONG)
  • [ir] Postpone scalarize and lower_matrix_ptr to after typecheck (#7589) (by 魔法少女赵志辉)
  • [aot] Header generator completed (#7609) (by PENGUINLIONG)
  • [amdgpu] Initialize AMDGPUContext with defaults (by Proton)
  • [build] Remove libSPIRV-Tools-shared.(so|dll) in wheel (by Proton)
  • [lang] Removed cpu_device(), cuda_device(), and amdgpu_device() from LlvmRuntimeExecutor (#7544) (by Zhanlue Yang)
  • [refactor] Remove the get/set functions in RuntimeContext (by lin-hitonami)
  • [aot] Pass LaunchContextBuilder to CompiledGraph::init_runtime_context (by lin-hitonami)
  • [gfx] Let GfxRuntime use LaunchContextBuilder (by lin-hitonami)
  • Let LaunchContextBuilder be the argument of the kernel launch function (by lin-hitonami)
  • [llvm] [refactor] Set the llvm runtime when executing (by lin-hitonami)
  • [refactor] Migrate {set, get}_{arg, ret} functions from RuntimeContext (by lin-hitonami)
  • [bug] Fix compilation error (#7606) (by PGZXB)
  • [aot] Hide map memory failure (#7604) (by PENGUINLIONG)
  • [refactor] Fix KernelCodeGen::kernel from Kernel * to const Kernel * (by PGZXB)
  • [refactor] Remove legacy implementation of llvm offline cache (by PGZXB)
  • [refactor] Impl llvm::CompiledKernelData (by PGZXB)
  • [bug] Type check for logical not op with real type inputs (#7600) (by Ailing)
  • [bug] Improve ndarray creation to fix segmentation fault (#7577) (by pengyu)
  • [lang] Add assembly printer for CPU backend (#7590) (by Zhanlue Yang)
  • [misc] Update docker filer (#7598) (by Zeyu Li)
  • [aot] Fix absolute path in generated TaichiTargets.cmake (#7597) (by Chenzhan Shang)
  • [Doc] Remove deprecated api docstrings (#7596) (by pengyu)
  • [llvm] Compile the kernel arguments to a StructType (by Lin Jiang)
  • [lang] Fix issue with llvm opaque pointer (#7557) (by Zhanlue Yang)
  • [opt] Constant folding for unary ops on host (#7573) (by Ailing)
  • [bug] Type check for bit_not op with real type inputs (#7592) (by Ailing)
  • [Doc] Fix the cexp docstring (#7588) (by Zhao Liang)
  • [Lang] Replace internal representation of Python-scope ti.Matrix with numpy arrays (#7559) (by Yi Xu)
  • [bug] Avoid cuda compilation via clang and ship pre-compiled .bc file instead (#7570) (by Zhanlue Yang)
  • [aot] Taichi kernel AOT command (#7565) (by PENGUINLIONG)
  • [bug] Fix struct members registered to StructField class (#7574) (by Ailing)
  • [aot] Mobile platform AOT build scripts (#7567) (by PENGUINLIONG)
  • [misc] Revert "Security upgrade ipython from 7.34.0 to 8.10.0 (#7341)" (#7571) (by Proton)
  • [test] Add cpp tests for constant folding pass (#7566) (by Ailing)
  • [misc] Security upgrade ipython from 7.34.0 to 8.10.0 (#7341) (by Chengchen(Rex) Wang)
  • [lang] Refactor CudaCachingAllocator into a more generic caching allocator (#7531) (by Zhanlue Yang)
  • [aot] Load GfxRuntime140 module from TCM (#7539) (by PENGUINLIONG)
  • [lang] Fixed useless serial shader to blit ExternalTensorShapeAlongAxisStmt on Metal (#7562) (by PENGUINLIONG)
  • [aot] Enable Vulkan 8bit storage (#7564) (by PENGUINLIONG)
  • [bug] Fix crashing on printing FrontendFuncCallStmt with no return value (by lin-hitonami)
  • [refactor] Remove LaunchContextBuilder::set_arg_raw (by lin-hitonami)
  • [llvm] Generalize TaskCodeGenLLVM::create_return to set_struct_to_buffer (by lin-hitonami)
  • [bug] Fix Cuda memory leak during TiRuntime destruction (#7345) (by Zhanlue Yang)
  • [ir] Let void struct type represent void type (by lin-hitonami)
  • [aot] Let C-API use LaunchContextBuilder to manage RuntimeContext (by lin-hitonami)
  • [ir] Let the reference type declare a pointer argument (by lin-hitonami)
  • [Doc] Add doc about returning struct (#7556) (by Lin Jiang)
  • [bug] Fix returning struct containing vec3 (#7552) (by Lin Jiang)
  • [lang] [ir] Extract and save the format specifiers in the f-string (#7514) (by 魔法少女赵志辉)
  • [Lang] Stop letting ti.Struct inherit from TaichiOperations (#7474) (by Yi Xu)
  • [aot] Recover AOT CI branch names (#7543) (by PENGUINLIONG)
  • [aot] Put TiRT in Python wheel and CMake script to find it in wheel (#7537) (by PENGUINLIONG)
  • [refactor] Remove the difficult-to-implement CompiledKernelData::size() (#7540) (by PGZXB)
  • [bug] Implement the missing clone function for FrontendFuncCallStmt (#7538) (by PGZXB)
  • [misc] Bump version to v1.6.0 (#7536) (by Haidong Lan)
  • [doc] Handle 2 digit minor versions correctly (#7535) (by Ritoban Roy-Chowdhury)
  • [aot] GfxRuntime140 convention docs (#7527) (by PENGUINLIONG)
  • [rhi] Refactor allocate_memory API to use RhiResult (#7463) (by Bob Cao)
  • [metal] Choose the proper msl version according to the device capability (#7506) (by Yu Zhang)
  • [Lang] Support writing sparse matrix as matrix market file (#7529) (by pengyu)