Audio Encodec

- Survey
- Projects
- Audio Encodec
- Misc

Survey

Projects

S3Tokenizer - xingchensong

Audio Encodec

🌟 Scaling Transformers for Low-Bitrate High-Quality Speech Coding, arXiv:2411.19842, citation: -1

Julian D Parker, Anton Smirnov, Jordi Pons, ..., Zach Evans, Xubo Liu · (stable-codec - Stability-AI)
Towards Robust Speech Representation Learning for Thousands of Languages, arXiv:2407.00837, citation: 4

William Chen, Wangyou Zhang, Yifan Peng, ..., Karen Livescu, Shinji Watanabe · (wavlab)
DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models, arXiv:2410.24177, citation: -1

Heng-Jui Chang, Hongyu Gong, Changhan Wang, ..., James Glass, Yu-An Chung
🌟 Continuous Speech Synthesis using per-token Latent Diffusion, arXiv:2410.16048, citation: -1

Arnon Turetzky, Nimrod Shabtay, Slava Shechtman, ..., Ron Hoory, Avihu Dekel · (s3.us-south.objectstorage.softlayer)
hertz-codec: a convolutional audio autoencoder that takes mono 16 kHz speech and transforms it into an 8 Hz latent representation at a bitrate of about 1 kbps.
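
The figures in the hertz-codec entry imply a fairly aggressive compression budget, and a little arithmetic makes that concrete. The sketch below is illustrative only: it treats "about 1 kbps" as exactly 1000 bit/s and assumes a 16-bit PCM reference for the uncompressed signal, neither of which is stated in the entry. The function name `codec_budget` is hypothetical and not part of any hertz-codec API.

```python
# Figures quoted in the hertz-codec entry.
SAMPLE_RATE_HZ = 16_000   # mono input sample rate
LATENT_RATE_HZ = 8        # latent frames per second
BITRATE_BPS = 1_000       # "about 1 kbps", treated here as exactly 1000 bit/s

# Assumption (not from the entry): the uncompressed reference is 16-bit PCM.
PCM_BITS_PER_SAMPLE = 16


def codec_budget(sample_rate_hz, latent_rate_hz, bitrate_bps, pcm_bits):
    """Derive per-frame and compression figures from the headline numbers."""
    # Audio samples summarized by each latent frame.
    samples_per_frame = sample_rate_hz // latent_rate_hz
    # Bits available to describe each latent frame.
    bits_per_frame = bitrate_bps / latent_rate_hz
    # Bitrate of the uncompressed mono PCM reference.
    pcm_bitrate_bps = sample_rate_hz * pcm_bits
    # How many times smaller the coded stream is than the PCM reference.
    compression_ratio = pcm_bitrate_bps / bitrate_bps
    return {
        "samples_per_frame": samples_per_frame,
        "bits_per_frame": bits_per_frame,
        "pcm_bitrate_bps": pcm_bitrate_bps,
        "compression_ratio": compression_ratio,
    }


budget = codec_budget(
    SAMPLE_RATE_HZ, LATENT_RATE_HZ, BITRATE_BPS, PCM_BITS_PER_SAMPLE
)
for name, value in budget.items():
    print(f"{name}: {value}")
```

Under these assumptions each latent frame covers 2000 samples (125 ms of audio) with a budget of 125 bits, roughly 256 times smaller than the 16-bit PCM reference.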
SNAC: Multi-Scale Neural Audio Codec, arXiv:2410.14411, citation: 2

Hubert Siuzdak, Florian Grötschla, Luca A. Lanzendörfer · (snac - hubertsiuzdak)
DM-Codec: Distilling Multimodal Representations for Speech Tokenization, arXiv:2410.15017, citation: -1

Md Mubtasim Ahasan, Md Fahim, Tasnim Mohiuddin, ..., Md Mofijul Islam, Amin Ahsan Ali

Misc