This repository has been archived by the owner on Aug 10, 2023. It is now read-only.

v0.3.3

Pre-release

hfxunlp released this 22 Feb 00:43

13 commits to master since this release

787268e

Fix decoding efficiency by moving the decoding cache from attention inputs to attention hidden states;
support shared-vocabulary pruning of trained models.
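
To illustrate why the location of the decoding cache matters, here is a minimal pure-Python sketch. It is not the project's code. It assumes that "attention hiddens" refers to the projected keys and values of an attention layer, and every name in it (`InputCache`, `HiddenCache`, `matvec`, `attend`) is hypothetical. Caching raw inputs forces every cached position to be re-projected at each decoding step, while caching the projected states needs only one new projection pair per step.

```python
import math

def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def attend(q, keys, values):
    """Scaled dot-product attention of one query over all keys/values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]

class InputCache:
    """Hypothetical: caches raw attention inputs, re-projects them every step."""
    def __init__(self, wq, wk, wv):
        self.wq, self.wk, self.wv = wq, wk, wv
        self.inputs = []
        self.projections = 0  # count of key/value projections performed

    def step(self, x):
        self.inputs.append(x)
        # Every cached input is projected again at every decoding step.
        keys = [matvec(self.wk, h) for h in self.inputs]
        values = [matvec(self.wv, h) for h in self.inputs]
        self.projections += 2 * len(self.inputs)
        return attend(matvec(self.wq, x), keys, values)

class HiddenCache:
    """Hypothetical: caches projected keys/values, projects only the new token."""
    def __init__(self, wq, wk, wv):
        self.wq, self.wk, self.wv = wq, wk, wv
        self.keys, self.values = [], []
        self.projections = 0  # count of key/value projections performed

    def step(self, x):
        # Only the newest position is projected; older ones are reused.
        self.keys.append(matvec(self.wk, x))
        self.values.append(matvec(self.wv, x))
        self.projections += 2
        return attend(matvec(self.wq, x), self.keys, self.values)
```

Both classes return the same attention output at each step. Over a sequence of n decoded tokens, the input cache in this sketch performs n(n+1) key/value projections, while the hidden-state cache performs 2n.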
