Audio Encodec

Survey

Projects

Audio Encodec

  • 🌟 Scaling Transformers for Low-Bitrate High-Quality Speech Coding, arXiv, 2411.19842, arxiv, pdf, citation: -1

    Julian D Parker, Anton Smirnov, Jordi Pons, ..., Zach Evans, Xubo Liu · (stable-codec - Stability-AI)

  • Towards Robust Speech Representation Learning for Thousands of Languages, arXiv, 2407.00837, arxiv, pdf, citation: 4

    William Chen, Wangyou Zhang, Yifan Peng, ..., Karen Livescu, Shinji Watanabe · (wavlab)

  • DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models, arXiv, 2410.24177, arxiv, pdf, citation: -1

    Heng-Jui Chang, Hongyu Gong, Changhan Wang, ..., James Glass, Yu-An Chung

  • 🌟 Continuous Speech Synthesis using per-token Latent Diffusion, arXiv, 2410.16048, arxiv, pdf, citation: -1

    Arnon Turetzky, Nimrod Shabtay, Slava Shechtman, ..., Ron Hoory, Avihu Dekel · (s3.us-south.objectstorage.softlayer)

  • hertz-codec: a convolutional audio autoencoder that takes mono 16 kHz speech and transforms it into an 8 Hz latent representation at about 1 kbps.

  • SNAC: Multi-Scale Neural Audio Codec, arXiv, 2410.14411, arxiv, pdf, citation: 2

    Hubert Siuzdak, Florian Grötschla, Luca A. Lanzendörfer · (snac - hubertsiuzdak)

  • DM-Codec: Distilling Multimodal Representations for Speech Tokenization, arXiv, 2410.15017, arxiv, pdf, citation: -1

    Md Mubtasim Ahasan, Md Fahim, Tasnim Mohiuddin, ..., Md Mofijul Islam, Amin Ahsan Ali
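
To put the hertz-codec figures listed above (16 kHz mono input, 8 Hz latent, about 1 kbps) in perspective, here is a rough back-of-the-envelope sketch. The function name `codec_budget` is hypothetical, and the 16-bit PCM baseline is an assumption used only for comparison; neither comes from the hertz-codec project.

```python
def codec_budget(sample_rate_hz, frame_rate_hz, bitrate_bps, pcm_bits_per_sample=16):
    """Derive rough per-frame figures for a neural audio codec.

    pcm_bits_per_sample=16 is an assumed baseline for uncompressed audio,
    not a figure stated by the codec itself.
    """
    # How many input samples each latent frame has to summarize.
    samples_per_frame = sample_rate_hz / frame_rate_hz
    # How many bits are available to describe one latent frame.
    bits_per_frame = bitrate_bps / frame_rate_hz
    # Bitrate of the uncompressed mono PCM stream, for comparison.
    pcm_bitrate_bps = sample_rate_hz * pcm_bits_per_sample
    # Ratio of uncompressed bitrate to codec bitrate.
    compression_ratio = pcm_bitrate_bps / bitrate_bps
    return {
        "samples_per_frame": samples_per_frame,
        "bits_per_frame": bits_per_frame,
        "pcm_bitrate_bps": pcm_bitrate_bps,
        "compression_ratio": compression_ratio,
    }


# Figures from the list entry: 16 kHz input, 8 Hz latent, roughly 1 kbps.
budget = codec_budget(sample_rate_hz=16_000, frame_rate_hz=8, bitrate_bps=1_000)
for name, value in budget.items():
    print(f"{name}: {value}")
```

Under these assumptions, each latent frame covers 2000 input samples (125 ms of audio) and gets about 125 bits. Since the listed bitrate is approximate, the derived numbers are approximate too.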

Misc