text-to-audio

Here are 49 public repositories matching this topic...

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

text-to-speech audit speech-synthesis audio-synthesis music-generation voice-conversion vocoder emilia text-to-audio fastspeech2 vits audio-generation singing-voice-conversion vall-e audioldm naturalspeech2 maskgct

Updated Jan 2, 2025
Python

declare-lab / tango

A family of diffusion models for text-to-audio generation.

language-models diffusion diffusion-models text-to-audio audio-generation large-language-models

Updated Dec 31, 2024
Python

gitmylo / audio-webui

A web UI for various audio-related neural networks

music text-to-speech ai generative-audio aio artificial-intelligence tts bark rvc all-in-one generative-music voice-cloning text-to-audio audioldm audiocraft bark-gui rvc-gui

Updated Aug 16, 2024
Python

ictnlp / StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Updated Aug 24, 2024
Python

hkchengrex / MMAudio

[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

audio computer-vision deep-learning audio-synthesis video-to-audio text-to-audio

Updated Jan 9, 2025
Python

Text-to-Audio / Make-An-Audio

PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model

latent-space video-to-audio diffusion-models text-to-audio latent-diffusion

Updated May 22, 2024
Python

lucidrains / nuwa-pytorch

Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in PyTorch

deep-learning transformers artificial-intelligence attention-mechanism text-to-audio text-to-video

Updated Jan 17, 2023
Python

ivcylc / OpenMusic

OpenMusic: SOTA Text-to-music (TTM) Generation

ai music-generation mdt dit ai-music diffusion-models text-to-audio music-ai ai-music-generator music-ai-architectures hifi-gan text-to-music vall-e text-to-audio-ai audioldm diffusion-transformer ai-music-generation text-to-music-transformer

Updated Jan 1, 2025
Python

declare-lab / TangoFlux

TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching

tta text-to-audio generative-ai text-to-audio-ai flow-matching

Updated Jan 12, 2025
Jupyter Notebook

YingqingHe / Awesome-LLMs-meet-Multimodal-Generation

🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).

text-to-speech multimodality text-to-image text-to-audio text-to-video text-to-music multimodal-models aigc large-language-models llm text-to-3d multimodal-generation mllm text-to-sound large-vision-language-models multimodal-large-language-models lvlm

Updated Dec 24, 2024
HTML

AMAAI-Lab / mustango

Mustango: Toward Controllable Text-to-Music Generation

diffusion-models text-to-audio text-to-music large-language-models

Updated Jul 24, 2024
Python

haidog-yaqub / EzAudio

High-quality Text-to-Audio Generation with Efficient Diffusion Transformer

diffusion-models text-to-audio generative-ai

Updated Nov 12, 2024
Python

happylittlecat2333 / Auffusion

Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"

diffusion diffusion-models text-to-audio audio-generation large-language-models

Updated Mar 25, 2024
Jupyter Notebook

ilaria-manco / word2wave

Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.

music-generation ai-music text-to-audio audio-generation

Updated Dec 13, 2021
Python

bnsantoso / sub-to-audio

Subtitle to audio: generates audio from any subtitle file using Coqui-ai TTS and synchronizes the audio timing with the subtitle timestamps.

python text-to-speech tts audio-processing subtitle-conversion text-to-audio subtitle-to-speech subtitle-to-voice subtitle-to-audio

Updated Dec 14, 2023
Python

sony / soundctm

PyTorch implementation of SoundCTM

pytorch diffusion-models text-to-audio audio-generation

Updated Dec 4, 2024
Python

keonlee9420 / WaveGrad2

PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

audio text-to-speech duration end-to-end pytorch tts speech-synthesis robust synthesis neural-tts non-autoregressive text-to-audio score-matching phoneme-to-waveform

Updated Aug 3, 2021
Python

serp-ai / ai-text-to-audio-latent-diffusion

text-to-audio-latent-diffusion

text-to-audio latent-diffusion audio-diffusion text-to-audio-ai latent-audio-diffusion audio-ai ai-audio-generation

Updated Aug 25, 2023
Python

RhythrosaLabs / soundstorm

Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusiasts. From sample pack creation and algorithmic composition to AI text-to-audio and onscreen ChatGPT, Soundstorm is a sonic powerhouse.

midi chatbot sound sound-processing gpt algorithmic-music algorithmic-composition sounds audio-processing random-music audio-tools sound-design text-to-audio audio-toolbox ai-audio gpt-4 chatgpt chat-gpt ai-audio-generation

Updated May 4, 2024
Python

PapayaResearch / ctag

Creative Text-to-Audio Generation via Synthesizer Programming @ ICML'24

machine-learning synthesizer jax text-to-audio generative-ai

Updated Sep 26, 2024
Python
