🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time
Updated Nov 1, 2024 - Python
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
SoftVC VITS Singing Voice Conversion
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
kaldi-asr/kaldi is the official location of the Kaldi project.
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
ModelScope: bring the notion of Model-as-a-Service to life.
Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.
💬 Speech recognition for your site
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Foundational model for human-like, expressive TTS
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Speech To Speech: an effort for an open-sourced and modular GPT4-o
Code examples for new APIs of iOS 10.
Lingvo