Please update this so it works for latest generation diffsinger models that have linguistic.onnx models #2

yakotoka · 2023-08-25T10:50:46Z

It looks like newer-generation DiffSinger models now include a linguistic model (linguistic.onnx) that takes tokens, word divisions, and word durations as input. Its outputs, encoder_out and x_masks, are then fed to the duration.onnx model.

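Example below (please tell me whether the zeroes are needed):
import numpy as np

# linguistic_model is assumed to be an onnxruntime.InferenceSession for linguistic.onnx;
# int64 is the assumed input dtype, since run() expects NumPy arrays rather than lists.
results = linguistic_model.run(None, {
    "tokens": np.array([[26, 1, 22, 35, 11]], dtype=np.int64),
    "word_div": np.array([[3, 2, 0, 0, 0]], dtype=np.int64),
    "word_dur": np.array([[48, 24, 0, 0, 0]], dtype=np.int64),
})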
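On the zero-padding question, here is a minimal sketch of one way to assemble the three inputs. It rests on an assumption the thread does not confirm: that word_div and word_dur carry one entry per word (not per token), so the word_div entries sum to the token count and no trailing zeroes are added. The helper name build_linguistic_inputs and its (token_ids, frames) input format are hypothetical and for illustration only.

```python
def build_linguistic_inputs(words):
    """Build nested-list inputs from per-word data.

    words: list of (token_ids, duration_in_frames) pairs, one pair per word.
    Assumption: word_div / word_dur have one entry per word, without padding.
    """
    tokens, word_div, word_dur = [], [], []
    for token_ids, frames in words:
        if not token_ids:
            raise ValueError("each word needs at least one token")
        tokens.extend(token_ids)         # flat token sequence
        word_div.append(len(token_ids))  # number of tokens in this word
        word_dur.append(frames)          # duration of this word in frames
    # Wrap each list once to add a leading batch dimension of size 1.
    return {"tokens": [tokens], "word_div": [word_div], "word_dur": [word_dur]}


inputs = build_linguistic_inputs([([26, 1, 22], 48), ([35, 11], 24)])
print(inputs)
# {'tokens': [[26, 1, 22, 35, 11]], 'word_div': [[3, 2]], 'word_dur': [[48, 24]]}
```

Each nested list would still need converting to a NumPy array (int64 is assumed) before being passed to the model. Checking the actual input shapes of a given linguistic.onnx, for example with the session's get_inputs(), is the reliable way to confirm whether padding is expected.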

Happy to get your thoughts, thank you!

yqzhishen · 2023-08-25T14:04:01Z

This project is now deprecated. You can use OpenUTAU for DiffSinger to synthesize with ONNX models. In any case, this is only a simple demo project, and you can easily extend it or even rewrite it.
