This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on

Python 526 45 Updated Jun 9, 2024

mct10 / RepCodec

Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization

Python 167 11 Updated Jul 12, 2024

FasterDecoding / Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,404 163 Updated Jun 25, 2024

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 39,500 4,841 Updated Feb 6, 2025

bytedance / SALMONN

SALMONN: Speech Audio Language Music Open Neural Network

Python 1,134 89 Updated Dec 12, 2024

EtienneAb3d / WhisperHallu

Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts

Python 298 21 Updated Nov 12, 2024

sh-lee-prml / HierSpeechpp

The official implementation of HierSpeech++

Python 1,200 137 Updated Feb 20, 2024

yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 5,399 481 Updated Aug 10, 2024

yanghaha0908 / FastHuBERT

Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning

Python 84 5 Updated Nov 20, 2024

dubverse-ai / MahaTTS

Python 258 18 Updated Jun 8, 2024

myshell-ai / OpenVoice

Instant voice cloning by MIT and MyShell. Audio foundation model.

Python 30,732 3,075 Updated Jan 7, 2025

LSimon95 / megatts2

Unofficial implementation of Megatts2

Python 274 35 Updated Mar 23, 2024

PolyAI-LDN / pheme

Python 255 24 Updated Mar 15, 2024

cpdu / vallt

36 1 Updated Jan 28, 2024

metavoiceio / metavoice-src

Foundational model for human-like, expressive TTS

Python 4,015 675 Updated Jul 30, 2024

cnlinxi

LLM

LianjiaTech / BELLE

lm-sys / FastChat

voidful / vall-e-encodec

AI21Labs / Parallel-Context-Windows

yangdongchao / SoundStorm

THUDM / VisualGLM-6B

yangdongchao / AcademiCodec

hollobit / GenAI_LLM_timeline

SpeechifyInc / Meta-voicebox

salesforce / DialogStudio

pengzhile / pandora

Plachtaa / VALL-E-X

haoheliu / AudioLDM2

facebookresearch / seamless_communication

descriptinc / descript-audio-codec

ZhangXInFD / SpeechTokenizer