This fork is identical to the original tortoise-tts-fast, except that it adds DeepSpeed for even faster inference.
The original tortoise repo has merged many of these changes, but it does not allow you to import finetuned tortoise models.
This is a working project to drastically boost the performance of TorToiSe without modifying the base models. Expect speedups of 5~10x.
This repo adds the following config options for TorToiSe for faster inference:
- New September 2023 - DeepSpeed
- (--kv_cache) enabling of KV cache for MUCH faster GPT sampling
- (--half) half precision inference where possible
- (--sampler dpm++2m) DPM-Solver samplers for better diffusion
- (disable with --low_vram) option to toggle cpu offloading, for high vram users
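As a rough illustration of how the flags listed above can be combined, here is a minimal sketch that assembles them into one command line. The helper `build_flags` and the `ENTRY_POINT` placeholder are hypothetical names made up for this example; substitute whichever script you actually run. `--low_vram` is left out because the list above does not spell out its exact value syntax.

```python
import shlex


def build_flags(kv_cache=True, half=True, sampler="dpm++2m"):
    """Return the speed-related flags from the list above as CLI tokens.

    Hypothetical helper for illustration only; it is not part of this repo.
    """
    flags = []
    if kv_cache:
        flags.append("--kv_cache")  # KV cache for faster GPT sampling
    if half:
        flags.append("--half")  # half precision where possible
    if sampler:
        flags += ["--sampler", sampler]  # DPM-Solver sampler for diffusion
    return flags


# Placeholder path, NOT a real file in this repo: point it at your own script.
ENTRY_POINT = "path/to/your_tts_script.py"

if __name__ == "__main__":
    # Print the assembled command so it can be inspected before running it.
    print(shlex.join(["python", ENTRY_POINT, *build_flags()]))
```

Turning an option off in the helper simply omits its flag, which makes it easy to compare runs with and without a given optimisation.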
All changes in this fork are licensed under the AGPL. For the avoidance of doubt, the following statement is added as a comment to all changed code files:
AGPL: a notification must be added stating that changes have been made to that file.
The installation process is identical to the original tortoise-tts repo.
conda create --name tortoise python=3.9 numba inflect
conda activate tortoise
conda install pytorch==2.0.0 torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
conda install transformers=4.29.2
pip install -r requirements.txt
pip3 install git+https://github.com/152334H/BigVGAN.git
pip install deepspeed==0.10.2 # linux/WSL only
If you are on Windows, you will also need to install pysoundfile: conda install -c conda-forge pysoundfile
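After installing, it can be useful to confirm that the pinned versions actually ended up in the environment. The following is a minimal sketch, not part of this repo: it assumes the packages register under the distribution names `torch`, `transformers`, and `deepspeed`, and the helper name `check_versions` is made up for this example. On Windows, where deepspeed is not installed, it will simply be reported as missing.

```python
from importlib import metadata

# Versions pinned by the install commands above (distribution name -> version).
# The distribution names are an assumption about how the packages register.
PINNED = {"torch": "2.0.0", "transformers": "4.29.2", "deepspeed": "0.10.2"}


def check_versions(pinned=PINNED):
    """Map each package name to (installed_version_or_None, matches_pin)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Ignore local build tags such as "+cu117" when comparing versions.
        ok = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions().items():
        status = "ok" if ok else "mismatch or missing"
        print(f"{name}: installed={installed} pinned={PINNED[name]} ({status})")
```

Run it inside the activated `tortoise` environment; any line marked as a mismatch points at a package worth reinstalling with the pinned version.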