
v1.5.0

@github-actions github-actions released this 27 Mar 16:35
· 595 commits to master since this release
7b885c2

Deprecation Notice

  • ndarray no longer accepts the field_dim argument; use the ndim argument instead.
  • [RFC] The ti.cc backend is deprecated in favor of TiRT and its C API. If you have any concerns, please let us know at #7629.

New features

AOT

  • Taichi Runtime (TiRT) now supports Apple's Metal API and OpenGL ES for compatibility with older mobile platforms. Taichi programs can now be deployed to any mainstream consumer device.
    NOTE: Taichi program deployment on mobile platforms is experimental. Please contact us at [email protected] for long-term services.
  • Taichi AOT now fully supports float16 dtype.

Ndarray

  • Out-of-bounds checks are now supported on ndarrays.
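
As a rough illustration of what such a check does, the sketch below validates every index against the array shape before an access. It is plain Python and not Taichi's implementation; `check_bounds` and its error messages are hypothetical names chosen for this example.

```python
def check_bounds(indices, shape):
    """Raise IndexError if any index falls outside [0, extent) on its axis.

    Hypothetical helper for illustration only; Taichi's real check lives in
    its runtime and may report errors differently.
    """
    if len(indices) != len(shape):
        raise IndexError(f"expected {len(shape)} indices, got {len(indices)}")
    for axis, (i, extent) in enumerate(zip(indices, shape)):
        if not 0 <= i < extent:
            raise IndexError(
                f"index {i} is out of bounds for axis {axis} with size {extent}"
            )


shape = (4, 3)
check_bounds((3, 2), shape)      # last valid element: passes silently
try:
    check_bounds((4, 0), shape)  # one past the end on axis 0
except IndexError as err:
    message = str(err)
```

Without a check like this, an out-of-range access can silently read or overwrite neighboring memory, which is why the check is useful while debugging.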

Improvements

Python Frontend

We now support returning a struct from a kernel on LLVM-based backends (CPU and CUDA). The struct can contain vectors and matrices, and it can also nest other structs. Here's an example:

import taichi as ti

ti.init(arch=ti.cpu)  # struct returns need an LLVM-based backend (CPU or CUDA)

s0 = ti.types.struct(a=ti.math.vec3, b=ti.i16)
s1 = ti.types.struct(a=ti.f32, b=s0)

@ti.kernel
def foo() -> s1:
    return s1(a=1, b=s0(a=ti.math.vec3(100, 0.2, 3), b=1))

print(foo())  # {'a': 1.0, 'b': {'a': [100.0, 0.2, 3.0], 'b': 1}}

Performance

  • Atomic operations on half2 are now supported on the CUDA backend (with compute capability > 60). You can enable this with ti.init(half2_vectorization=True). This feature can effectively accelerate the NeRF training process. Please refer to this repo for details.
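
For context, half2 refers to two float16 values packed into a single 32-bit word, so one operation can update both lanes at once. The sketch below illustrates only the packing and the lane-wise addition, in plain Python with the standard `struct` module. `pack_half2`, `unpack_half2`, and `add_half2` are hypothetical helpers, and nothing here is atomic or specific to CUDA or Taichi.

```python
import struct


def pack_half2(lo, hi):
    """Pack two floats as IEEE 754 half-precision values into one 32-bit integer."""
    return struct.unpack("<I", struct.pack("<2e", lo, hi))[0]


def unpack_half2(word):
    """Split a 32-bit integer back into its two half-precision lanes."""
    return struct.unpack("<2e", struct.pack("<I", word))


def add_half2(word, delta):
    """Add two packed pairs lane by lane, rounding each result back to float16."""
    a_lo, a_hi = unpack_half2(word)
    d_lo, d_hi = unpack_half2(delta)
    return pack_half2(a_lo + d_lo, a_hi + d_hi)


acc = pack_half2(1.0, 2.0)
acc = add_half2(acc, pack_half2(0.5, 0.25))
result = unpack_half2(acc)  # both lanes were updated by a single call
```

Because both values share one word, a hardware atomic add on that word can accumulate two float16 values per operation instead of one.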

GGUI

  • GGUI now has no computing backend restrictions! You can now use Metal, OpenGL, AMDGPU, or DirectX 11, in addition to the CPU, CUDA, and Vulkan backends previously supported by GGUI.
  • GGUI has now been validated on lavapipe, Mesa's software rasterizer. You can use this solution for headless server visualization or on servers with no graphics capabilities (such as A100).
  • Add the fps_limit option, which adjusts the maximum frame rate in GGUI.
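
To make the effect of a frame-rate cap concrete, here is a minimal plain-Python sketch of a limiter that sleeps away whatever remains of each frame's time budget. It is an assumption about how such a cap can work, not GGUI's implementation; `FrameLimiter` and its `clock` and `sleep` parameters are hypothetical.

```python
import time


class FrameLimiter:
    """Keeps a loop from running faster than fps_limit iterations per second."""

    def __init__(self, fps_limit, clock=time.perf_counter, sleep=time.sleep):
        self.min_frame_time = 1.0 / fps_limit  # time budget per frame, in seconds
        self._clock = clock
        self._sleep = sleep
        self._last = clock()

    def wait(self):
        """Sleep for the unused part of the frame budget; return the time slept."""
        elapsed = self._clock() - self._last
        remaining = self.min_frame_time - elapsed
        if remaining > 0:
            self._sleep(remaining)
        self._last = self._clock()
        return max(remaining, 0.0)


limiter = FrameLimiter(fps_limit=60)
for _ in range(3):
    # ... render a frame here ...
    limiter.wait()
```

A frame that already took longer than its budget is not delayed further, so the cap only bounds the frame rate from above.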

Full changelog:

Highlights:
   - **AMDGPU backend**
      - Enable shared array on amdgpu backend (#7403) (by **Zeyu Li**)
      - Add print kernel amdgcn (#7357) (by **Zeyu Li**)
      - Add amdgpu backend profiler (#7330) (by **Zeyu Li**)
   - **Aot module**
      - Let AOT kernel inherit CallableBase and use LaunchContextBuilder (by **lin-hitonami**)
      - Deprecate element shape and field dim for AOT symbolic args (#7100) (by **Haidong Lan**)
   - **Bug fixes**
      - Fix copy_from() of StructField (#7294) (by **Yi Xu**)
      - Fix caching same loop invariant global vars inside nested fors (#7285) (by **Lin Jiang**)
      - Fix num_splits in parallel_struct_for (#7121) (by **Yi Xu**)
      - Fix ret_type and cast_type of UnaryOpStmt in Scalarize (#7082) (by **Yi Xu**)
   - **Documentation**
      - Update GGUI docs with correct API (#7525) (by **pengyu**)
      - Fix typos and improve example code in data_oriented_class.md (#7520) (by **pengyu**)
      - Update gui_system.md, remove unnecessary example (#7487) (by **NextoneX**)
      - Fix typo in API doc (#7511) (by **pengyu**)
      - Update math_module (#7405) (by **Zhao Liang**)
      - Update hello_world.md (#7400) (by **Zhao Liang**)
      - Update debugging.md (#7401) (by **Zhao Liang**)
      - Update hello_world.md (#7380) (by **Zhao Liang**)
      - Update type.md (#7376) (by **Zhao Liang**)
      - Update kernel_function.md (#7375) (by **Zhao Liang**)
      - Update hello_world.md (#7369) (by **Zhao Liang**)
      - Update hello_world.md (#7368) (by **Zhao Liang**)
      - Update data_oriented_class.md (#6790) (by **Zhao Liang**)
      - Update hello_world.md (#7367) (by **Zhao Liang**)
      - Update kernel_function.md (#7364) (by **Zhao Liang**)
      - Update hello_world.md (#7354) (by **Zhao Liang**)
      - Update llvm_sparse_runtime.md (#7323) (by **Gabriel Vainer**)
      - Update profiler.md (#7358) (by **Zhao Liang**)
      - Update kernel_function.md (#7356) (by **Zhao Liang**)
      - Update tut.md (#7352) (by **Gabriel Vainer**)
      - Update type.md (#7350) (by **Zhao Liang**)
      - Update hello_world.md (#7337) (by **Zhao Liang**)
      - Update append docstring (#7265) (by **Zhao Liang**)
      - Update ndarray.md (#7236) (by **Gabriel Vainer**)
      - Update llvm_sparse_runtime.md (#7215) (by **Zhao Liang**)
      - Remove doc tutorial (#7198) (by **Olinaaaloompa**)
      - Rename tutorial doc (#7186) (by **Zhao Liang**)
      - Update tutorial.md (#7176) (by **Zhao Liang**)
      - Update math_module.md (#7175) (by **Zhao Liang**)
      - Update debugging.md (#7173) (by **Zhao Liang**)
      - Fix C++ tutorial does not display on doc site (#7174) (by **Zhao Liang**)
      - Update doc regarding dynamic index (#7148) (by **Yi Xu**)
      - Move glossary to top level (#7118) (by **Zhao Liang**)
      - Update type.md (#7038) (by **Zhao Liang**)
      - Fix docstring (#7065) (by **Zhao Liang**)
   - **Error messages**
      - Allow IfExp on matrices when the condition is scalar (#7241) (by **Lin Jiang**)
      - Remove deprecations in ti.ui in 1.6.0 (#7229) (by **Lin Jiang**)
      - Remove deprecated ti.linalg.sparse_matrix_builder in 1.6.0 (#7228) (by **Lin Jiang**)
      - Remove deprecations in ASTTransformer in 1.6.0 (#7226) (by **Lin Jiang**)
      - Remove deprecated a.atomic_op(b) in Taichi v1.6.0 (#7225) (by **Lin Jiang**)
      - Remove deprecations in taichi/__init__.py in v1.6.0 (#7222) (by **Lin Jiang**)
      - Raise error when using deprecated ifexp on matrices (#7224) (by **Lin Jiang**)
      - Better error message when creating sparse snodes on backends that do not support sparse (#7191) (by **Lin Jiang**)
      - Raise errors when using metal sparse (#7113) (by **Lin Jiang**)
   - **GUI**
      - GGUI use shader "factory" (GGUI rework n/N) (#7271) (by **Bob Cao**)
   - **Intermediate representation**
      - Unified type system for internal operations (#6337) (by **daylily**)
   - **Language and syntax**
      - Keep ti.pyfunc (#7530) (by **Lin Jiang**)
      - Type check assignments between tensors (#7480) (by **Yi Xu**)
      - Fix pylance warnings raised by ti.static (#7437) (by **Zhao Liang**)
      - Deprecate arithmetic operations and fill() on ti.Struct (#7456) (by **Yi Xu**)
      - Fix pylance warnnings by ti.random (#7439) (by **Zhao Liang**)
      - Fix pylance types warning (#7417) (by **Zhao Liang**)
      - Add better error message for dynamic snode (#7238) (by **Zhao Liang**)
      - Simplify the swizzle generator (#7216) (by **Zhao Liang**)
      - Remove the deprecated dynamic_index switch (#7195) (by **Yi Xu**)
      - Remove deprecated packed switch (#7104) (by **Yi Xu**)
      - Raise errors when using the packed switch (#7125) (by **Yi Xu**)
      - Fix cannot use taichi in REPL (#7114) (by **Zhao Liang**)
      - Remove deprecated ti.Matrix.rotation2d() (#7098) (by **Yi Xu**)
      - Remove filename kwarg in aot Module save() (#7085) (by **Ailing**)
      - Remove sourceinspect deprecation warning message (#7081) (by **Zhao Liang**)
      - Make slicing a single row/column of a matrix return a vector (#7068) (by **Yi Xu**)
   - **Miscellaneous**
      - Strictly check ndim with external array (#7126) (by **Haidong Lan**)

Full changelog:
   - [cc] Add deprecation notice for cc backend (#7651) (by **Ailing**)
   - [misc] Cherry pick struct return related commits (#7575) (by **Haidong Lan**)
   - [Lang] Keep ti.pyfunc (#7530) (by **Lin Jiang**)
   - [bug] Fix symbol conflicts with taichi_cpp_tests (#7528) (by **Zhanlue Yang**)
   - [bug] Fix numerical issue with TensorType'd arithmetics (#7526) (by **Zhanlue Yang**)
   - [aot] Enable Metal AOT test (#7461) (by **PENGUINLIONG**)
   - [Doc] Update GGUI docs with correct API (#7525) (by **pengyu**)
   - [misc] Implement KernelCompialtionManager::clean_offline_cache (#7515) (by **PGZXB**)
   - [ir] Except shared array from demote atomics pass. (#7513) (by **Haidong Lan**)
   - [bug] Fix error with windows-clang compilation for cuda_runtime.cu (#7519) (by **Zhanlue Yang**)
   - [misc] Deprecate field dim and update deprecation warnings (#7491) (by **Haidong Lan**)
   - [build] Fix build failure without nvcc (#7521) (by **Ailing**)
   - [Doc] Fix typos and improve example code in data_oriented_class.md (#7520) (by **pengyu**)
   - [aot] Kernel argument count limit (#7518) (by **PENGUINLIONG**)
   - [Doc] Update gui_system.md, remove unnecessary example (#7487) (by **NextoneX**)
   - [AOT] [llvm] Let AOT kernel inherit CallableBase and use LaunchContextBuilder (by **lin-hitonami**)
   - [llvm] Let the offline cache record the type info of arguments and return values (by **lin-hitonami**)
   - [ir] Separate LaunchContextBuilder from Kernel (by **lin-hitonami**)
   - [Doc] Fix typo in API doc (#7511) (by **pengyu**)
   - [aot] Build Runtime C-API by default (#7508) (by **PENGUINLIONG**)
   - [bug] Fix run_tests.py --with-offline-cache (#7507) (by **PGZXB**)
   - [vulkan] Support printing constant strings containing % (#7499) (by **魔法少女赵志辉**)
   - [ci] Fix nightly version number, 2nd try (#7501) (by **Proton**)
   - [aot] Fixed memory leak in metal backend (#7500) (by **PENGUINLIONG**)
   - [ci] Fix nightly version number issue (#7498) (by **Proton**)
   - [example] Remove cv2, cairo dependency (#7496) (by **Zhao Liang**)
   - [type] Let Type * be serializable (by **lin-hitonami**)
   - [ci] Second attempt at permission check for ghstack landing (#7490) (by **Proton**)
   - [docs] Reword words of warning about building from source (#7488) (by **Anselm Schüler**)
   - [lang] Fixed double release of Metal command buffer (#7484) (by **PENGUINLIONG**)
   - [ci] Switch Android bots lock redis to bot-master (#7482) (by **Proton**)
   - [ci] Status check of ghstack CI bot (#7479) (by **Proton**)
   - [Lang] Type check assignments between tensors (#7480) (by **Yi Xu**)
   - [doc] Fix typo in ndarray.md (#7476) (by **Chenzhan Shang**)
   - [opt] Enable half2 optimization for atomic_add operations on CUDA backend (#7465) (by **Zhanlue Yang**)
   - [Lang] Fix pylance warnings raised by ti.static (#7437) (by **Zhao Liang**)
   - Let the LaunchContextBuilder manage the result buffer (by **lin-hitonami**)
   - [ci] Fix nightly build failure, and minor improvements (#7475) (by **Proton**)
   - [ci] Fix duplicated names in aot tests (#7471) (by **Ailing**)
   - [lang] Improve float16 support from Taichi type system (#7402) (by **Zhanlue Yang**)
   - [Lang] Deprecate arithmetic operations and fill() on ti.Struct (#7456) (by **Yi Xu**)
   - [misc] Add out of bound check for ndarray (#7458) (by **Ailing**)
   - [aot] Remove graph kernel interfaces (#7466) (by **PENGUINLIONG**)
   - [llvm] Let the RuntimeContext use the host result buffer (by **lin-hitonami**)
   - [gui] Fix 3d line drawing & add test (#7454) (by **Bob Cao**)
   - [lang] Fixed texture assertions (#7450) (by **PENGUINLIONG**)
   - [aot] Fixed header generator (#7455) (by **PENGUINLIONG**)
   - [aot] AOT module convention GfxRuntime140 (#7440) (by **PENGUINLIONG**)
   - [misc] Add an explicit error in cc backend codegen for dynamic indexing (#7449) (by **Ailing**)
   - [ci] Lower C++ tests concurrency (#7451) (by **Proton**)
   - [aot] Properly handle texture attributes (#7433) (by **PENGUINLIONG**)
   - [Lang] Fix pylance warnnings by ti.random (#7439) (by **Zhao Liang**)
   - [ir] Get the StructType of the kernel parameters (by **lin-hitonami**)
   - [ci] Report failure (not throwing exception) when C++ tests fail (#7435) (by **Proton**)
   - [llvm] Allocate the result buffer from preallocated memory (by **lin-hitonami**)
   - [vulkan] Fix GGUI and vulkan swapchain on AMD drivers (#7382) (by **Bob Cao**)
   - [autodiff] Handle return statement (#7389) (by **Mingrui Zhang**)
   - [misc] Remove unnecessary functions of gfx::AotModuleBuilderImpl (#7425) (by **PGZXB**)
   - [bug] Fix offline_cache::clean_offline_cache_files (ti cache clean) (#7426) (by **PGZXB**)
   - [test] Refactor C++ tests runner (#7421) (by **Proton**)
   - [ci] Adjust perfmon GPU freq (#7429) (by **Proton**)
   - [misc] Remove AotModuleParams::enable_lazy_loading (#7424) (by **PGZXB**)
   - [aot] Use graphs.json instead of TCB (#7392) (by **PENGUINLIONG**)
   - [refactor] Introduce KernelCompilationManager (#7409) (by **PGZXB**)
   - [IR] Unified type system for internal operations (#6337) (by **daylily**)
   - [lang] Add is_lvalue() to Expr to check writeback_binary operand (#7414) (by **魔法少女赵志辉**)
   - [bug] Fix get_error_string ret type typo (#7418) (by **Zeyu Li**)
   - [aot] Reorganize graph argument creation process (#7412) (by **PENGUINLIONG**)
   - [Amdgpu] Enable shared array on amdgpu backend (#7403) (by **Zeyu Li**)
   - [Lang] Fix pylance types warning (#7417) (by **Zhao Liang**)
   - [aot] Simplify device capability assignment (#7407) (by **PENGUINLIONG**)
   - [Doc] Update math_module (#7405) (by **Zhao Liang**)
   - [ci] Lock GPU frequency in perf benchmarking (#7413) (by **Proton**)
   - [ci] Add 'Needed single revision' workaround to all tasks (#7408) (by **Proton**)
   - [Doc] Update hello_world.md (#7400) (by **Zhao Liang**)
   - [refactor] Introduce KernelCompiler and implement spirv::KernelCompiler (#7371) (by **PGZXB**)
   - [Amdgpu] Add print kernel amdgcn (#7357) (by **Zeyu Li**)
   - [Doc] Update debugging.md (#7401) (by **Zhao Liang**)
   - [refactor] Disable ASTSerializer::allow_undefined_visitor (#7391) (by **PGZXB**)
   - [amdgpu] Enable llvm FpOpFusion option on AMDGPU backend (#7398) (by **Zeyu Li**)
   - [aot] Add test for shared array (#7387) (by **Ailing**)
   - [vulkan] Change command list submit error message & misc device API cleanups (#7395) (by **Bob Cao**)
   - [bug] Fix arch_uses_spirv (#7399) (by **PGZXB**)
   - [gui] Fix ggui & vulkan swapchain sizes on HiDPI displays (#7394) (by **Bob Cao**)
   - [Doc] Update hello_world.md (#7380) (by **Zhao Liang**)
   - [aot] Remove support for depth24stencil8 format on Metal (#7377) (by **PENGUINLIONG**)
   - [bug] Add DeviceCapabilityConfig to offline cache key (#7384) (by **PGZXB**)
   - [Doc] Update type.md (#7376) (by **Zhao Liang**)
   - [refactor] Remove dependencies on Callable::program in cpp tests (#7373) (by **PGZXB**)
   - [lang] Experimental support of conjugate gradient solver (#7035) (by **pengyu**)
   - [aot] Metal interop APIs (#7366) (by **PENGUINLIONG**)
   - [Doc] Update kernel_function.md (#7375) (by **Zhao Liang**)
   - [gui] Add `fps_limit` for GGUI (#7374) (by **Bob Cao**)
   - [Doc] Update hello_world.md (#7369) (by **Zhao Liang**)
   - [aot] Fix blockers in static library build with XCode (#7365) (by **PENGUINLIONG**)
   - [vulkan] Remove GLFW from Vulkan rhi dependency (#7351) (by **Bob Cao**)
   - [misc] Remove useless semicolon in llvm_program.h (#7372) (by **PGZXB**)
   - [Doc] Update hello_world.md (#7368) (by **Zhao Liang**)
   - [Amdgpu] Add amdgpu backend profiler (#7330) (by **Zeyu Li**)
   - [lang] Stop broadcasting scalar cond in select statements (#7344) (by **魔法少女赵志辉**)
   - [bug] Fix validation erros due to inactive VK_KHR_16bit_storage (#7360) (by **Zhanlue Yang**)
   - [aot] Support texture in Metal (#7363) (by **PENGUINLIONG**)
   - [Doc] Update data_oriented_class.md (#6790) (by **Zhao Liang**)
   - [Doc] Update hello_world.md (#7367) (by **Zhao Liang**)
   - [refactor] Introduce lang::CompiledKernelData (#7340) (by **PGZXB**)
   - [bug] Fix matrix initialization error with numpy.floating data (#7362) (by **Zhanlue Yang**)
   - [Doc] Update kernel_function.md (#7364) (by **Zhao Liang**)
   - [test] [amdgpu] Fix bug with allocs bb in function body (#7308) (by **Zeyu Li**)
   - [Doc] Update hello_world.md (#7354) (by **Zhao Liang**)
   - [aot] Fixed C-API docs (#7361) (by **PENGUINLIONG**)
   - [refactor] Remove dependencies on Callable::program in lang::CompiledGraph::run (#7288) (by **PGZXB**)
   - [DOC] Update llvm_sparse_runtime.md (#7323) (by **Gabriel Vainer**)
   - [Doc] Update profiler.md (#7358) (by **Zhao Liang**)
   - [Doc] Update kernel_function.md (#7356) (by **Zhao Liang**)
   - [aot] Improve Taichi C++ wrapper implementation (#7347) (by **PENGUINLIONG**)
   - [Doc] Update tut.md (#7352) (by **Gabriel Vainer**)
   - [ci] Add doc snippet CI requirements (#7355) (by **Proton**)
   - [amdgpu] Update device memory free (#7346) (by **Zeyu Li**)
   - [Doc] Update type.md (#7350) (by **Zhao Liang**)
   - [aot] Enable 16-bit dtype support for Taichi AOT (#7315) (by **Zhanlue Yang**)
   - [example] Re-implement the Cornell Box demo with shorter lines of code (#7252) (by **HK-SHAO**)
   - [aot] AOT CI refactorization (#7339) (by **PENGUINLIONG**)
   - [llvm] Let the kernel return struct (by **lin-hitonami**)
   - [Doc] Update hello_world.md (#7337) (by **Zhao Liang**)
   - [ci] Reduce doc test concurrency (#7336) (by **Proton**)
   - [ir] Refactor result fetching (by **lin-hitonami**)
   - [ir] Get the offsets of elements in StructType (by **lin-hitonami**)
   - [misc] Delete test.py (#7332) (by **Bob Cao**)
   - [vulkan] More subgroup operations (#7328) (by **Bob Cao**)
   - [vulkan] Add vulkan profiler (#7295) (by **Haidong Lan**)
   - [refactor] Move TaichiLLVMContext::runtime_jit_module and TaichiLLVMContext::create_jit_module() to LlvmRuntimeExecutor (#7320) (by **PGZXB**)
   - [refactor] Remove dependencies on LlvmProgramImpl::get_llvm_context() in TaskCodeGenLLVM (#7321) (by **PGZXB**)
   - [ci] Checkout with privileged token when landing ghstack PRs (#7331) (by **Proton**)
   - [ir] Add fields to StructType (by **lin-hitonami**)
   - [gui] Remove renderable reuse & make renderable immediate (#7327) (by **Bob Cao**)
   - [Gui] GGUI use shader "factory" (GGUI rework n/N) (#7271) (by **Bob Cao**)
   - [bug] Fix u64 field cannot be assigned value >= 2 ** 63 (#7319) (by **Lin Jiang**)
   - [type] Let the compute type of quant uint be unsigned int (by **lin-hitonami**)
   - [doc] Replace slack with discord (#7318) (by **yanqingzhang**)
   - [refactor] Change print statement to warnings.warn in taichi.lang.util.warning (#7301) (by **Jett Chen**)
   - [ci] ChatOps: ghstack land (#7314) (by **Proton**)
   - [refactor] Remove TaichiLLVMContext::lookup_function_pointer() (#7312) (by **PGZXB**)
   - [misc] Update MSVC flags (#7254) (by **Bob Cao**)
   - [doc] [ci] Cover code snippets in docs (#7309) (by **Proton**)
   - [refactor] Remove dependencies on LlvmProgramImpl::get_llvm_context() in KernelCodeGen (#7289) (by **PGZXB**)
   - [rhi] Device upload readback functions (#7278) (by **Bob Cao**)
   - [aot] Fixed external project inclusion (#7297) (by **PENGUINLIONG**)
   - [Doc] Update append docstring (#7265) (by **Zhao Liang**)
   - [refactor] Remove dependencies on Callable::program in lang::get_hashed_offline_cache_key (#7287) (by **PGZXB**)
   - [ci] [amdgpu] Enable amdgpu backend python unit tests (#7293) (by **Zeyu Li**)
   - [Bug] Fix copy_from() of StructField (#7294) (by **Yi Xu**)
   - [ci] Adapt new Android phone behavior (#7306) (by **Proton**)
   - [Bug] Fix caching same loop invariant global vars inside nested fors (#7285) (by **Lin Jiang**)
   - [amdgpu] Part5 enable the api of amdgpu (#7202) (by **Zeyu Li**)
   - [amdgpu] Enable struct for on amdgpu backend (#7247) (by **Zeyu Li**)
   - [misc] Update external/asset which was accidentally downgraded in #7248 (#7284) (by **Lin Jiang**)
   - [amdgpu] Update runtime module (#7248) (by **Zeyu Li**)
   - [llvm] Remove unused argument 'arch' in LlvmProgramImpl::get_llvm_context (#7282) (by **Lin Jiang**)
   - [misc] Remove deprecated kwarg in rw_texture type annotations (#7267) (by **Ailing**)
   - [ci] Tolerate duplicates when registering version (#7281) (by **Proton**)
   - [gui] Fix GGUI destruction order (#7279) (by **Bob Cao**)
   - [doc] Rename /doc/ndarray_android to /doc/tutorial (#7273) (by **Lin Jiang**)
   - [llvm] Unify the llvm context of host and device (#7249) (by **Lin Jiang**)
   - [misc] Fix manylinux2014 warning not printing (#7270) (by **Proton**)
   - [ci] Building: add complete PATH set for conda (#7268) (by **Proton**)
   - [autodiff] Support rsqrt operator (#7259) (by **Mingrui Zhang**)
   - [ci] Update pre-commit repos version (#7257) (by **Proton**)
   - [refactor] Fix "const CompileConfig *" to "const CompileConfig &" (Part2) (#7253) (by **PGZXB**)
   - [refactor] Fix "const CompileConfig *" to "const CompileConfig &" (#7243) (by **PGZXB**)
   - [aot] Added third-party render thread task injection for Unity (#7151) (by **PENGUINLIONG**)
   - [aot] Support statically linked C-API library on MacOS (#7207) (by **Zhanlue Yang**)
   - [gui] Force GGUI to go through host memory (nuking interops) (#7218) (by **Bob Cao**)
   - [Error] Allow IfExp on matrices when the condition is scalar (#7241) (by **Lin Jiang**)
   - [bug] Fix the parity of the RNG (#7239) (by **Lin Jiang**)
   - [Lang] Add better error message for dynamic snode (#7238) (by **Zhao Liang**)
   - [DOC] Update ndarray.md (#7236) (by **Gabriel Vainer**)
   - [Error] Remove deprecations in ti.ui in 1.6.0 (#7229) (by **Lin Jiang**)
   - [Doc] Update llvm_sparse_runtime.md (#7215) (by **Zhao Liang**)
   - [lang] Add validation checks for subscripts to reject negative indices (#7212) (by **Zhanlue Yang**)
   - [refactor] Remove legacy num_bits and acc_offsets from AxisExtractor (#7227) (by **Yi Xu**)
   - [Error] Remove deprecated ti.linalg.sparse_matrix_builder in 1.6.0 (#7228) (by **Lin Jiang**)
   - [Error] Remove deprecations in ASTTransformer in 1.6.0 (#7226) (by **Lin Jiang**)
   - [misc] Export DeviceAllocation into Python & support devalloc in field_info (#7233) (by **Bob Cao**)
   - [gui] Use templated bulk copy to simplify VBO preperation (#7234) (by **Bob Cao**)
   - [rhi] Add create_image_unique stub & misc RHI bug fixes (#7232) (by **Bob Cao**)
   - [opengl] Fix GLFW global context issue (#7230) (by **Bob Cao**)
   - [examples] Remove dependency on `ti.u8` compute type for ngp (#7220) (by **Bob Cao**)
   - [refactor] Remove Kernel::offload_to_executable (#7210) (by **PGZXB**)
   - [opengl] RW image binding & FP16 support (#7219) (by **Bob Cao**)
   - [Error] Remove deprecated a.atomic_op(b) in Taichi v1.6.0 (#7225) (by **Lin Jiang**)
   - [Error] Remove deprecations in taichi/__init__.py in v1.6.0 (#7222) (by **Lin Jiang**)
   - [Error] Raise error when using deprecated ifexp on matrices (#7224) (by **Lin Jiang**)
   - [refactor] Remove legacy BitExtractStmt (#7221) (by **Yi Xu**)
   - [amdgpu] Part4 link bitcode file (#7180) (by **Zeyu Li**)
   - [example] Reorganize example oit_renderer (#7208) (by **Lin Jiang**)
   - [aot] Fix ndarray aot with information from type hints (#7214) (by **Ailing**)
   - [gui] Fix wide line support on macOS (#7205) (by **Bob Cao**)
   - [Lang] Simplify the swizzle generator (#7216) (by **Zhao Liang**)
   - [refactor] Split constructing and compilation of lang::Function (#7209) (by **PGZXB**)
   - [doc] Fix netlify build command (#7217) (by **Ailing**)
   - [ci] M1 buildbot release tag (#7213) (by **Proton**)
   - [misc] Remove unused task_funcs (#7211) (by **PGZXB**)
   - [refactor] Program::this_thread_config() -> Program::compile_config() (#7199) (by **PGZXB**)
   - [doc] Fix format issues of windows debugging (#7197) (by **Olinaaaloompa**)
   - [aot] More OpenGL interop in C-API (#7204) (by **PENGUINLIONG**)
   - [metal] Disable a kernel test in offline cache to unblock CI (#7154) (by **Ailing**)
   - [ci] Switch Windows build script to build.py (#6993) (by **Proton**)
   - [misc] Update submodule taichi_assets (#7203) (by **Lin Jiang**)
   - [mac] Use ObjectLinkingLayer instead of RTDyldObjectLinkingLayer for aarch64 mac (#7201) (by **Ailing**)
   - [misc] Remove unused Program::jit_evaluator_id (#7200) (by **PGZXB**)
   - [misc] Remove legacy latex generation (#7196) (by **Yi Xu**)
   - [Lang] Remove the deprecated dynamic_index switch (#7195) (by **Yi Xu**)
   - [bug] Fix check_matched() failure with Ndarray holding TensorType'd element (#7178) (by **Zhanlue Yang**)
   - [Doc] Remove doc tutorial (#7198) (by **Olinaaaloompa**)
   - [bug] Fix example circle-packing (#7194) (by **Lin Jiang**)
   - [aot] C-API opengl runtime interop (#7120) (by **damnkk**)
   - [Error] Better error message when creating sparse snodes on backends that do not support sparse (#7191) (by **Lin Jiang**)
   - [example] Fix ti gallery close warning (#7187) (by **Zhao Liang**)
   - [lang] Interface refactors for MatrixType and VectorType (#7143) (by **Zhanlue Yang**)
   - [aot] Find Taichi in python wheel (#7181) (by **PENGUINLIONG**)
   - [gui] Update circles rendering to use quads (#7163) (by **Bob Cao**)
   - [Doc] Rename tutorial doc (#7186) (by **Zhao Liang**)
   - [ir] Fix gcc cannot compile inline template specialization (#7179) (by **Lin Jiang**)
   - [Doc] Update tutorial.md (#7176) (by **Zhao Liang**)
   - [aot] Replace std::exchange with local implementation for C++11 (#7170) (by **PENGUINLIONG**)
   - [ci] Fix near cache urls (missing comma) (#7158) (by **Proton**)
   - [docs] Create windows_debug.md (#7164) (by **Bob Cao**)
   - [Doc] Update math_module.md (#7175) (by **Zhao Liang**)
   - [aot] FindTaichi CMake module to help outside project integration (#7168) (by **PENGUINLIONG**)
   - [aot] Removed unused archs in C-API (#7167) (by **PENGUINLIONG**)
   - [Doc] Update debugging.md (#7173) (by **Zhao Liang**)
   - [refactor] Remove dependencies on Program::this_thread_config() in irpass::constant_fold (#7159) (by **PGZXB**)
   - [Doc] Fix C++ tutorial does not display on doc site (#7174) (by **Zhao Liang**)
   - [aot] C++ wrapper for memory slice and memory allocation with host access (#7171) (by **PENGUINLIONG**)
   - [aot] Fixed ti_get_last_error signature (#7165) (by **PENGUINLIONG**)
   - [misc] Log to stderr instead of stdout (#7166) (by **PENGUINLIONG**)
   - [aot] C-API get version wrapper (#7169) (by **PENGUINLIONG**)
   - [doc] Fix spelling of "paticle_field" (#7024) (by **Xiang (Kevin) Li**)
   - [misc] Remove useless Program::sync (#7160) (by **PGZXB**)
   - [doc] Update accelerate_python.md to use ti.max (#7161) (by **Tao Jin**)
   - [doc] Add doc ndarray (#7157) (by **Olinaaaloompa**)
   - [mac] Add .dylib and .cmake to built wheel (#7156) (by **Ailing**)
   - [refactor] Remove dependencies on Program::this_thread_config() in some tests (#7155) (by **PGZXB**)
   - [refactor] Remove dependencies on Program::this_thread_config() in llvm backends codegen (#7153) (by **PGZXB**)
   - [Lang] Remove deprecated packed switch (#7104) (by **Yi Xu**)
   - [example] Update quaternion arithmetics in fractal_3d_ggui (#7139) (by **Zhao Liang**)
   - [doc] Update field.md (Fields advanced) (#6867) (by **Gabriel Vainer**)
   - [ci] Use make_changelog.py to generate the full changelog (#7152) (by **Lin Jiang**)
   - [refactor] Rename Callable::*arg* to Callable::*param* (#7133) (by **PGZXB**)
   - [aot] Introduce new AOT deployment tutorial (#7144) (by **PENGUINLIONG**)
   - [bug] Unify error message matching with/without validation layers for CapiTest.FailMapDeviceOnlyMemory (#7110) (by **Zhanlue Yang**)
   - [lang] Remove redundant TensorType expansion for function returns (#7124) (by **Zhanlue Yang**)
   - [lang] Sign python library for Apple M1 (#7138) (by **PENGUINLIONG**)
   - [gui] Fix particle size limits (#7149) (by **Bob Cao**)
   - [lang] Migrate TensorType expansion in MatrixType/VectorType from Python code to Frontend IR (#7127) (by **Zhanlue Yang**)
   - [aot] Support texture arguments for AOT kernels (#7142) (by **Zhanlue Yang**)
   - [metal] Retain Metal commandBuffers & build command buffers directly (#7137) (by **Bob Cao**)
   - [rhi] Update `create_pipeline` API and add support of VkPipelineCache (#7091) (by **Bob Cao**)
   - [autodiff] Support grad in ndarray (#6906) (by **PhrygianGates**)
   - [Doc] Update doc regarding dynamic index (#7148) (by **Yi Xu**)
   - [refactor] Remove dependencies on Program::this_thread_config() in spirv::lower (#7134) (by **PGZXB**)
   - [Misc] Strictly check ndim with external array (#7126) (by **Haidong Lan**)
   - [ci] Run test when pushing to rc branches (#7146) (by **Lin Jiang**)
   - [refactor] Remove dependencies on Program::this_thread_config() in KernelCodeGen (#7086) (by **PGZXB**)
   - [ci] Disable backward_cpp on macOS (#7145) (by **Proton**)
   - [gui] Fix scene line renderable (#7131) (by **Bob Cao**)
   - [refactor] Remove useless Kernel::from_cache_ (#7132) (by **PGZXB**)
   - [cpu] Reuse VirtualMemoryAllocator for CPU ndarray memory allocation (#7128) (by **Ailing**)
   - [Lang] Raise errors when using the packed switch (#7125) (by **Yi Xu**)
   - [ci] Temporarily disable ad_external_array on Metal (#7136) (by **Bob Cao**)
   - [Error] Raise errors when using metal sparse (#7113) (by **Lin Jiang**)
   - [aot] AOT compat test in workflow (#7033) (by **damnkk**)
   - [Lang] Fix cannot use taichi in REPL (#7114) (by **Zhao Liang**)
   - [lang] Free ndarray memory when it's GC-ed in Python (#7072) (by **Ailing**)
   - [lang] Migrate TensorType expansion for FuncCallExpression from Python code to Frontend IR (#6980) (by **Zhanlue Yang**)
   - [amdgpu] Part2 add runtime (#6482) (by **Zeyu Li**)
   - [refactor] Remove dependencies on Program::this_thread_config() in codegen_cc.cpp (#7088) (by **PGZXB**)
   - [refactor] Remove dependencies on Program::this_thread_config() in gfx::run_codegen (#7089) (by **PGZXB**)
   - [Bug] Fix num_splits in parallel_struct_for (#7121) (by **Yi Xu**)
   - [Doc] Move glossary to top level (#7118) (by **Zhao Liang**)
   - [metal] Update Metal RHI impl & add support for shared arrays (#7107) (by **Bob Cao**)
   - [ci] Update amdgpu ci (#7117) (by **Zeyu Li**)
   - [refactor] Move Kernel::lower() outside the taichi::lang::Kernel (#7048) (by **PGZXB**)
   - [amdgpu] Part1 add codegen (#6469) (by **Zeyu Li**)
   - [Aot] Deprecate element shape and field dim for AOT symbolic args (#7100) (by **Haidong Lan**)
   - [refactor] Remove Program::current_ast_builder() (#7075) (by **PGZXB**)
   - [aot] Switch Metal to SPIR-V codegen (#7093) (by **PENGUINLIONG**)
   - [Lang] Remove deprecated ti.Matrix.rotation2d() (#7098) (by **Yi Xu**)
   - [doc] Modified some errors in the function examples (#7094) (by **welann**)
   - [ci] More Windows git hacks (#7102) (by **Proton**)
   - [Lang] Remove filename kwarg in aot Module save() (#7085) (by **Ailing**)
   - [aot] Rename device capability atomic_i64 to atomic_int64 for consistency (#7095) (by **PENGUINLIONG**)
   - [Lang] Remove sourceinspect deprecation warning message (#7081) (by **Zhao Liang**)
   - [example] Remove gui warning message (#7090) (by **Zhao Liang**)
   - [refactor] Remove unnecessary Kernel::arch (#7074) (by **PGZXB**)
   - [refactor] Remove unnecessary parameter of irpass::scalarize (#7087) (by **PGZXB**)
   - [Bug] Fix ret_type and cast_type of UnaryOpStmt in Scalarize (#7082) (by **Yi Xu**)
   - [lang] Migrate TensorType expansion for TextureOpExpression from Python code to Frontend IR (#6968) (by **Zhanlue Yang**)
   - [lang] Migrate TensorType expansion for ReturnStmt from Python code to Frontend IR (#6946) (by **Zhanlue Yang**)
   - [doc] Update ndarray deprecation warning to 1.5.0 (#7083) (by **Haidong Lan**)
   - [amdgpu] Update amdgpu module call (#7022) (by **Zeyu Li**)
   - [amdgpu] Add convert addressspace pass related unit test (#7023) (by **Zeyu Li**)
   - [ir] Let real function return nested StructType (by **lin-hitonami**)
   - [ir] Replace FuncCallExpression with FrontendFuncCallStmt (by **lin-hitonami**)
   - [example] Update gallery images (#7053) (by **Zhao Liang**)
   - [Doc] Update type.md (#7038) (by **Zhao Liang**)
   - [misc] Bump version to v1.5.0 (#7077) (by **Lin Jiang**)
   - [rhi] Update Stream `new_command_list` API (#7073) (by **Bob Cao**)
   - [Doc] Fix docstring (#7065) (by **Zhao Liang**)
   - [ci] Workaround windows checkout 'Needed a single revision' issue (#7078) (by **Proton**)
   - [Lang] Make slicing a single row/column of a matrix return a vector (#7068) (by **Yi Xu**)