Releases: alibaba/heterogeneity-aware-lowering-and-optimization
v0.7.0
This release contains the following major changes since v0.6.0:
1. Just-In-Time Compilation
HALO now supports binary object code generation and linking at runtime.
2. More Op Support
More ops are supported for the ONNX, TensorFlow, and Caffe formats, including random uniform, PRelu, strided slice, resize, and reduction ops.
3. Model Analyzer
HALO now includes a model analyzer that can analyze the computation and memory demand for model inference.
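To illustrate the kind of estimate such an analyzer produces, here is a minimal sketch that counts multiply-accumulates (MACs) and memory for a single convolution layer. The function name, shapes, and formulas are illustrative only and do not reflect HALO's actual analyzer implementation or output format.

```python
def conv2d_cost(n, c_in, h, w, c_out, kh, kw, stride=1, dtype_bytes=4):
    """Estimate MACs plus weight/activation memory for one Conv2D (NCHW layout).

    Assumes 'same'-style padding so the spatial output is input // stride.
    This is a back-of-the-envelope sketch, not HALO's analyzer.
    """
    h_out, w_out = h // stride, w // stride
    # One MAC per (output element x input channel x kernel element).
    macs = n * c_out * h_out * w_out * c_in * kh * kw
    weight_bytes = c_out * c_in * kh * kw * dtype_bytes
    act_bytes = n * c_out * h_out * w_out * dtype_bytes
    return macs, weight_bytes, act_bytes

# Example: a ResNet-style stem conv, 3x224x224 input, 64 7x7 filters, stride 2.
macs, wb, ab = conv2d_cost(1, 3, 224, 224, 64, 7, 7, stride=2)
```

Summing such per-op estimates over a whole graph gives the total computation and peak-memory demand that guide deployment decisions.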
4. Profiling-Guided Quantization
HALO's profiling-guided quantization (PGQ) now supports channel-wise quantization.
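Channel-wise (per-channel) quantization keeps one scale per output channel instead of one scale for the whole tensor, which preserves accuracy when channel magnitudes differ widely. The sketch below shows the underlying arithmetic for symmetric int8 quantization; it is a hand-rolled illustration, not HALO's PGQ code or API.

```python
def quantize_per_channel(weights, num_bits=8):
    """Symmetric per-channel quantization: one scale per output channel.

    `weights` is a list of per-channel value lists. Illustrative sketch only.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    scales, quantized = [], []
    for channel in weights:
        # Scale each channel by its own max magnitude (fall back to 1.0 if all-zero).
        s = max(abs(v) for v in channel) / qmax or 1.0
        scales.append(s)
        quantized.append([round(v / s) for v in channel])
    return scales, quantized

# Two channels with very different ranges: per-channel scales keep the
# small-magnitude channel from collapsing to zero.
scales, q = quantize_per_channel([[0.5, -1.0], [0.02, 0.01]])
```

With a single per-tensor scale, the second channel's values (0.02, 0.01) would quantize to 3 and 1 out of 127 levels; per-channel scaling uses the full int8 range for both channels.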
5. Multiple Stability Fixes
v0.6.1
This is a patch release for v0.6.0. It fixes:
- A random build error caused by a dependency issue
v0.6.0
This release contains the following major changes since our initial start:
- Enable Graphcore IPU
  HALO now supports the deployment of AI models on the Graphcore IPU.
- Enable BF16 mode on supported Intel CPUs
  AI models can gain significant performance boosts on Intel CPUs that support BF16 instructions.
- Enhanced testing framework
  Each change is now automatically covered by 2000+ regression tests.
More details:
- Automated continuous integration workflow
- New ODLA runtime library for Graphcore (with many contributions from Graphcore)
- Optimized ODLA runtime for DNNL, tuned for BF16 and AVX-512 (thanks to significant support from Intel)
- Mixed precision for BF16/FP32 support
- Profiling-guided quantization support
- Enhanced unit tests (2000+ test cases)
- Enhanced model analysis tool for analyzing memory and computational resource usage
- Model zoo regression test suites
- More op support
- More fusion pattern support (conv + batch_norm, scale + conv, ...)
- Tools to convert standard TensorFlow graphs to Graphcore style
- Bug fixes
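As background on the conv + batch_norm fusion mentioned above: at inference time a BatchNorm can be folded into the preceding convolution's weights and bias, eliminating the BN op entirely. The sketch below shows the standard folding arithmetic on per-channel lists; it is a generic illustration and does not mirror HALO's pass implementation.

```python
import math

def fold_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm(gamma, beta, mean, var) into conv weights `w` and bias `b`.

    `w` is a list of per-output-channel weight lists. Uses the identity
    BN(conv(x)) = (conv(x) - mean) * gamma / sqrt(var + eps) + beta,
    so scaling weights/bias by s = gamma / sqrt(var + eps) absorbs the BN.
    """
    w_folded, b_folded = [], []
    for i in range(len(w)):
        s = gamma[i] / math.sqrt(var[i] + eps)
        w_folded.append([v * s for v in w[i]])
        b_folded.append((b[i] - mean[i]) * s + beta[i])
    return w_folded, b_folded
```

After folding, the fused conv produces exactly the same outputs as conv followed by BN, with one fewer op per layer at runtime.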