
Releases: alibaba/heterogeneity-aware-lowering-and-optimization

v0.7.0

05 Aug 18:52

This release contains the following major changes since v0.6.0:

1. Just-In-Time Compilation
HALO now supports generating binary object code and linking it at runtime.

2. More op support
Additional ops are now supported for the ONNX, TensorFlow, and Caffe formats, including RandomUniform, PRelu, strided slice, resize, and reduction ops.

3. Model Analyzer
HALO now includes a model analyzer that can analyze the computation and memory demand for model inference.
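To make the kind of numbers such an analyzer reports concrete, here is a minimal, self-contained sketch (hypothetical code, not HALO's actual analyzer API) that estimates the multiply-accumulate count and memory demand of a single 2D convolution layer:

```cpp
#include <cstdint>

// Hypothetical illustration, not HALO's analyzer: estimate the
// multiply-accumulate count and memory demand of one 2D convolution.
struct Conv2DStats {
  std::uint64_t macs;          // multiply-accumulates for one inference
  std::uint64_t param_bytes;   // weight + bias storage (FP32, 4 bytes each)
  std::uint64_t output_bytes;  // output activation storage (FP32)
};

Conv2DStats AnalyzeConv2D(std::uint64_t in_c, std::uint64_t out_c,
                          std::uint64_t k_h, std::uint64_t k_w,
                          std::uint64_t out_h, std::uint64_t out_w) {
  Conv2DStats s;
  // Each output element needs in_c * k_h * k_w multiply-accumulates.
  s.macs = out_c * out_h * out_w * in_c * k_h * k_w;
  // Weights: out_c * in_c * k_h * k_w values, plus out_c bias values.
  s.param_bytes = (out_c * in_c * k_h * k_w + out_c) * 4;
  s.output_bytes = out_c * out_h * out_w * 4;
  return s;
}
```

For a ResNet-style first convolution (3 input channels, 64 output channels, 7x7 kernel, 112x112 output) this reports roughly 118 M MACs, ~37 KB of weights, and ~3.2 MB of output activations.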

4. Profiling-Guided Quantization
HALO PGQ now supports channel-wise quantization.
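As a sketch of what channel-wise (as opposed to tensor-wise) quantization means, the hypothetical code below (not HALO's PGQ implementation) derives one symmetric INT8 scale per output channel from profiled per-channel absolute maxima:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Illustrative sketch, not HALO's PGQ code: channel-wise quantization keeps
// one scale per output channel instead of one per tensor, which preserves
// accuracy when channel value ranges differ widely.
std::vector<float> ChannelScales(const std::vector<float>& abs_max) {
  std::vector<float> scales;
  scales.reserve(abs_max.size());
  for (float m : abs_max)
    scales.push_back(m / 127.0f);  // map [-max, max] onto [-127, 127]
  return scales;
}

std::int8_t Quantize(float v, float scale) {
  float q = std::round(v / scale);
  q = std::min(std::max(q, -127.0f), 127.0f);  // clamp to the INT8 range
  return static_cast<std::int8_t>(q);
}
```

With per-channel maxima of {0.5, 8.0}, channel 0 gets a scale of about 0.0039 instead of the tensor-wide 0.063, so its small-magnitude values retain far more resolution.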

5. Multiple stability fixes

v0.6.1

05 Feb 06:43
  • This is a patch release for v0.6.0. It fixes:
  1. A random build failure caused by a dependency issue

v0.6.0

22 Jan 08:34
9463d2f

This release contains the following major changes since the project's initial release:

  1. Enable Graphcore IPU
    HALO now supports deploying AI models on the Graphcore IPU.

  2. Enable BF16 mode on supported Intel CPUs
    AI models can gain significant performance boosts on Intel CPUs that support BF16 instructions.

  3. Enhanced testing framework
    Every change is now automatically covered by 2000+ regression tests.
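Item 2's BF16 mode rests on a simple numeric fact: bfloat16 keeps float32's sign bit and all 8 exponent bits, but only the top 7 mantissa bits. A minimal sketch of the conversion (illustrative only, not HALO or DNNL code):

```cpp
#include <cstdint>
#include <cstring>

// Illustrative sketch: convert float32 to bfloat16 by rounding to the upper
// 16 bits (round-to-nearest-even on the discarded mantissa bits), and back
// by zero-filling the lower 16 bits.
std::uint16_t FloatToBF16(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  std::uint32_t rounding = 0x7FFF + ((bits >> 16) & 1);  // round to nearest even
  return static_cast<std::uint16_t>((bits + rounding) >> 16);
}

float BF16ToFloat(std::uint16_t h) {
  std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}
```

Because the exponent range is unchanged, BF16 trades mantissa precision (not dynamic range) for half the memory traffic, which is what makes it attractive for mixed BF16/FP32 inference.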

More details:

  • Automated continuous integration workflow
  • New ODLA runtime library for Graphcore (with major contributions from Graphcore)
  • Optimized ODLA for DNNL (optimized for BF16 and AVX-512, with significant support from Intel)
  • Mixed precision for BF16/FP32 support
  • Profiling-guided quantization support
  • Enhanced unit tests (2000+ test cases)
  • Enhanced model analysis tool to report memory and compute resource usage
  • Model zoo regression test suites
  • More op support
  • More fusion pattern support (conv + batch_norm, scale + conv, ...)
  • Tools to convert standard TF graphs to Graphcore style
  • Bug fixes
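The conv + batch_norm fusion listed above exploits the fact that batch norm applied after a convolution is an affine map per output channel, so it can be folded into the conv's weights and bias at compile time. A hypothetical sketch of the arithmetic (not HALO's pass code):

```cpp
#include <cmath>

// Illustrative sketch of conv + batch_norm folding. Batch norm computes
//   y = gamma * (conv(x) - mean) / sqrt(var + eps) + beta,
// which per output channel folds into the convolution as
//   w' = w * gamma / sqrt(var + eps)
//   b' = (b - mean) * gamma / sqrt(var + eps) + beta
struct FoldedChannel {
  float weight_scale;  // multiply every weight of this output channel by this
  float bias;          // replacement bias for this output channel
};

FoldedChannel FoldBatchNorm(float bias, float gamma, float beta, float mean,
                            float var, float eps) {
  float scale = gamma / std::sqrt(var + eps);
  return {scale, (bias - mean) * scale + beta};
}
```

After folding, the batch_norm node disappears from the graph entirely, saving one elementwise pass over the activations at inference time.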