This repository has been archived by the owner on Dec 18, 2020. It is now read-only.

Mixed-precision training

@danieldk released this 01 Oct 13:42 · 4 commits to master since this release

The most important new feature of this release is mixed-precision training 🎉. This speeds up training and lowers memory use on GPUs with Tensor Cores. Mixed-precision training can be enabled using the --mixed-precision option of sticker2 finetune and sticker2 distill.
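As a usage sketch, the option is passed directly to either subcommand; the remaining arguments shown here are placeholders, not taken from this release:

```shell
# Enable mixed-precision training when finetuning or distilling.
# [OTHER OPTIONS AND ARGUMENTS] stands in for the usual configuration
# and data arguments, which are not spelled out in these notes.
sticker2 finetune --mixed-precision [OTHER OPTIONS AND ARGUMENTS]
sticker2 distill --mixed-precision [OTHER OPTIONS AND ARGUMENTS]
```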

Other notable changes:

  • Use fast AVX2 kernels on AMD Zen CPUs, without setting any special environment variables.
  • Update the sentencepiece crate dependency to 0.4. This version compiles the sentencepiece library statically if it is not available, removing the dependency on an external sentencepiece build.
  • The TensorBoard summary writer support that was added in 0.4.2 is now feature-gated (tensorboard). This makes it possible to compile sticker2 without TensorBoard support for quicker compiles and smaller binaries.
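For builds from source, the feature gate follows the standard Cargo feature mechanism. A sketch, assuming tensorboard is not enabled by default (if it were a default feature, excluding it would additionally require --no-default-features):

```shell
# Build without TensorBoard support: faster compile, smaller binary.
cargo build --release

# Build with TensorBoard summary writer support enabled.
cargo build --release --features tensorboard
```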