Partially garbled audio on Huggingface online demo, for a short English input #709

rotemdan · 2024-12-05T05:53:11Z

Self Checks

This template is only for bug reports. For questions, please visit Discussions.
I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem. (Documentation: English, Chinese, Japanese, Portuguese (Brazil))
I have searched for existing issues, including closed ones.
I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
Please do not modify this template and fill in all required fields.

Cloud or Self Hosted

Cloud

Environment Details

Huggingface online demo (V1.5 medium)

Steps to Reproduce

Tried to synthesize the text "We are not responsible for any misuse of the model, please consider your local laws and regulations before using it."

All parameters were left at their defaults.

✔️ Expected Behavior

Normal speech.

❌ Actual Behavior

The audio starts normally with "We are not", and is then followed by garbled audio that sounds sped up. The total audio duration is only 3 seconds.

2024-12-05.07-38-05.mp4

audio.zip

I hit this after trying the model with only 2 test inputs, which suggests it's not that rare. When I synthesize the same text several more times, I get other voices, and they don't seem to have this issue (as far as I've tested).
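
As a rough way to flag outputs like this automatically, here is a minimal sketch that compares the word count of the input text with the duration of the generated audio. It is not part of the project; the function names and the 4 words-per-second threshold are assumptions chosen for illustration (ordinary English speech is usually well below that rate), and the duration is passed in by hand rather than read from the audio file.

```python
def words_per_second(text: str, duration_s: float) -> float:
    """Return the speaking rate implied by the text and the audio duration."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return len(text.split()) / duration_s


def looks_too_fast(text: str, duration_s: float, max_wps: float = 4.0) -> bool:
    """Flag audio whose implied rate exceeds max_wps (an assumed threshold)."""
    return words_per_second(text, duration_s) > max_wps


text = ("We are not responsible for any misuse of the model, "
        "please consider your local laws and regulations before using it.")

# The input has 20 words; at 3 seconds that is about 6.7 words per second.
print(looks_too_fast(text, 3.0))  # True
```

A check like this only catches the sped-up symptom. Audio that is garbled at a normal pace would still pass.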

Related issues

Seems closely related to issue #632, but I decided to open a new issue because:

My input was in English
Issue #632 is described as "Swallowing words, reading normally at first, then speeding up, and then not reading the last word of the sentence completely", which is not exactly what is seen here. Here the audio starts normally but continues as completely garbled audio
A comment on that issue says the issue is resolved
It was produced by the online demo, using the latest model (V1.5 medium)

rotemdan added the bug Something isn't working label Dec 5, 2024

rotemdan changed the title ~~Garbled speech on Huggingface demo, for a short English prompt~~ Partially garbled audio on Huggingface demo, for a short English prompt Dec 5, 2024

rotemdan changed the title ~~Partially garbled audio on Huggingface demo, for a short English prompt~~ Partially garbled audio on Huggingface online demo, for a short English input Dec 5, 2024
