Releases: alibaba/heterogeneity-aware-lowering-and-optimization
v0.7.0
This release contains the following major changes since v0.6.0:
1. Just-In-Time Compilation
HALO now supports binary object code generation and linking at runtime.
2. More Op Support
More ops are supported for the ONNX, TensorFlow, and Caffe formats, including random uniform, PRelu, strided slice, resize, and reduction ops.
3. Model Analyzer
HALO now includes a model analyzer that can analyze the computation and memory demand for model inference.
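To illustrate the kind of estimate such an analyzer produces, here is a minimal sketch that counts multiply-accumulates (MACs) and memory for a single convolution layer. The function name, shapes, and formulas are illustrative only and do not reflect HALO's actual analyzer implementation or output format.

```python
def conv2d_cost(n, c_in, h, w, c_out, kh, kw, stride=1, dtype_bytes=4):
    """Estimate MACs plus weight/activation memory for one Conv2D (NCHW layout).

    Assumes 'same'-style padding so the spatial output is input // stride.
    This is a back-of-the-envelope sketch, not HALO's analyzer.
    """
    h_out, w_out = h // stride, w // stride
    # One MAC per (output element x input channel x kernel element).
    macs = n * c_out * h_out * w_out * c_in * kh * kw
    weight_bytes = c_out * c_in * kh * kw * dtype_bytes
    act_bytes = n * c_out * h_out * w_out * dtype_bytes
    return macs, weight_bytes, act_bytes

# Example: a ResNet-style stem conv, 3x224x224 input, 64 7x7 filters, stride 2.
macs, wb, ab = conv2d_cost(1, 3, 224, 224, 64, 7, 7, stride=2)
```

Summing such per-op estimates over a whole graph gives the total computation and peak-memory demand that guide deployment decisions.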
4. Profiling-Guided Quantization
HALO's profiling-guided quantization (PGQ) now supports channel-wise quantization.
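Channel-wise (per-channel) quantization keeps one scale per output channel instead of one scale for the whole tensor, which preserves accuracy when channel magnitudes differ widely. The sketch below shows the underlying arithmetic for symmetric int8 quantization; it is a hand-rolled illustration, not HALO's PGQ code or API.

```python
def quantize_per_channel(weights, num_bits=8):
    """Symmetric per-channel quantization: one scale per output channel.

    `weights` is a list of per-channel value lists. Illustrative sketch only.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    scales, quantized = [], []
    for channel in weights:
        # Scale each channel by its own max magnitude (fall back to 1.0 if all-zero).
        s = max(abs(v) for v in channel) / qmax or 1.0
        scales.append(s)
        quantized.append([round(v / s) for v in channel])
    return scales, quantized

# Two channels with very different ranges: per-channel scales keep the
# small-magnitude channel from collapsing to zero.
scales, q = quantize_per_channel([[0.5, -1.0], [0.02, 0.01]])
```

With a single per-tensor scale, the second channel's values (0.02, 0.01) would quantize to 3 and 1 out of 127 levels; per-channel scaling uses the full int8 range for both channels.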
5. Multiple Stability Fixes
v0.6.1
This is a patch release for v0.6.0. It fixes:
- A random build error caused by a dependency issue
v0.6.0
This release contains the following major changes since our initial start:
- Enable Graphcore IPU
  HALO now supports the deployment of AI models on the Graphcore IPU.
- Enable BF16 mode on supported Intel CPUs
  AI models can gain significant performance boosts on Intel CPUs that support BF16 instructions.
- Enhanced testing framework
  Each change is now automatically covered by 2000+ regression tests.
More details:
- Automated continuous integration workflow
- New ODLA runtime library for Graphcore (with many contributions from Graphcore)
- Optimized ODLA runtime for DNNL, tuned for BF16 and AVX-512 (thanks to significant support from Intel)
- Mixed precision for BF16/FP32 support
- Profiling-guided quantization support
- Enhanced unit tests (2000+ test cases)
- Enhanced model analysis tool for analyzing memory and computational resource usage
- Model zoo regression test suites
- More op support
- More fusion pattern support (conv + batch_norm, scale + conv, ...)
- Tools to convert standard TensorFlow graphs to Graphcore style
- Bug fixes
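As background on the conv + batch_norm fusion mentioned above: at inference time a BatchNorm can be folded into the preceding convolution's weights and bias, eliminating the BN op entirely. The sketch below shows the standard folding arithmetic on per-channel lists; it is a generic illustration and does not mirror HALO's pass implementation.

```python
import math

def fold_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm(gamma, beta, mean, var) into conv weights `w` and bias `b`.

    `w` is a list of per-output-channel weight lists. Uses the identity
    BN(conv(x)) = (conv(x) - mean) * gamma / sqrt(var + eps) + beta,
    so scaling weights/bias by s = gamma / sqrt(var + eps) absorbs the BN.
    """
    w_folded, b_folded = [], []
    for i in range(len(w)):
        s = gamma[i] / math.sqrt(var[i] + eps)
        w_folded.append([v * s for v in w[i]])
        b_folded.append((b[i] - mean[i]) * s + beta[i])
    return w_folded, b_folded
```

After folding, the fused conv produces exactly the same outputs as conv followed by BN, with one fewer op per layer at runtime.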