Use Nvidia optimum to speed up inference #111

Closed
SKocur opened this issue Dec 6, 2023 · 1 comment · May be fixed by #113

Comments

@SKocur
Contributor

SKocur commented Dec 6, 2023

Hello! For Nvidia-based workstations, would it be possible to use the optimum-nvidia pipelines instead of the default Hugging Face ones to speed up Whisper token generation? I have not tested it myself. Here are the referenced blog post and repository mentioning gains for LLaMA-based models: https://huggingface.co/blog/optimum-nvidia and https://github.com/huggingface/optimum-nvidia
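
For context, the linked blog post presents optimum-nvidia as a drop-in replacement for the transformers pipeline API. Below is a minimal sketch of that pattern using the LLaMA text-generation example documented there; whether an analogous swap works for Whisper's automatic-speech-recognition pipeline is exactly the open question in this issue, not a confirmed feature.

```python
# Documented pattern from https://huggingface.co/blog/optimum-nvidia:
# swap the transformers pipeline import for the optimum-nvidia one.

# from transformers.pipelines import pipeline        # default Hugging Face pipeline
from optimum.nvidia.pipelines import pipeline        # optimum-nvidia drop-in

# use_fp8=True enables FP8 inference on supported Nvidia GPUs (per the blog post).
pipe = pipeline("text-generation", "meta-llama/Llama-2-7b-chat-hf", use_fp8=True)
print(pipe("Describe a real-world application of AI in sustainable energy."))

# Hypothetical analogue for Whisper (NOT documented in the blog post; this is the
# question being asked here):
# asr = pipeline("automatic-speech-recognition", "openai/whisper-large-v3")
# print(asr("audio.mp3")["text"])
```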

@Vaibhavs10
Owner

Let's keep the conversation in the PR (when optimum-nvidia makes a release). I am closing this issue.
