Cannot reproduce the Moshi evaluation from the paper #141

UltraEval · 2024-12-18T12:13:58Z

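Thanks for open-sourcing this 👍🏻

This is about the Moshi evaluation in the paper "Scaling speech-text pre-training with synthetic interleaved data".

I can reproduce the evaluation results for the other models, but not the one for Moshi. Could you share the inference setup you used for Moshi? 🙏🙏🙏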

tianzhangwu · 2024-12-25T06:21:50Z

Which TTS tool did you use to synthesize the input audio for your tests? Scores differ between TTS tools.

UltraEval · 2024-12-25T06:26:04Z

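> Which TTS tool did you use to synthesize the input audio for your tests? Scores differ between TTS tools.

Volcano Engine (火山引擎) Seed TTS.

In my current inference runs, Moshi mostly replies with nothing but an opening line along the lines of "How are you", and only occasionally gives an actual answer after the opener.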

tianzhangwu · 2024-12-25T06:30:05Z

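I tested the open-source GLM-4-voice with meloTTS and chattts separately. The scores are somewhat higher than the results in their paper and lower than those in the technical report, at around 4.5.

I haven't tested Moshi yet, but I have seen a discussion on Zhihu saying that this model often fails to answer questions properly. The behavior you describe looks like a prompt mismatch.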

tianzhangwu · 2024-12-25T06:30:31Z

Have you tested the open-source GLM-4-voice?

UltraEval · 2024-12-25T06:37:39Z

> Have you tested the open-source GLM-4-voice?

Yes, I reproduced it, and the results are largely consistent with the paper.

tianzhangwu · 2024-12-25T06:38:43Z

Do you mean the 3.69 figure?

tianzhangwu · 2024-12-25T06:40:15Z

Their technical report gives 5.40. Which of the two is your reproduction closer to?

UltraEval · 2024-12-25T06:40:16Z

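> I tested the open-source GLM-4-voice with meloTTS and chattts separately. The scores are somewhat higher than the results in their paper and lower than those in the technical report, at around 4.5.

> I haven't tested Moshi yet, but I have seen a discussion on Zhihu saying that this model often fails to answer questions properly. The behavior you describe looks like a prompt mismatch.

Moshi's inference seems to have some problems. I filed an issue with them earlier and have not had an official reply yet. Someone there said the open-source release differs from what is in the paper: kyutai-labs/moshi#159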

UltraEval · 2024-12-25T06:41:15Z

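> Their technical report gives 5.40. Which of the two is your reproduction closer to?

It is the 5.4 one. For the Speech2Text results before that, the paper uses the Base model, but I get similar results with the open-source chat model.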

tianzhangwu · 2024-12-25T07:25:29Z

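Thanks. My guess is that this detail was not aligned on my side:
For a fair comparison with the English-only baseline models, we restrict the output of GLM-4-Voice to English tokens when evaluating the tasks reported in Table 6.

In my tests I did see quite a few cases where the question was asked in English and the answer came back in Chinese.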
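For readers trying to match the restriction quoted above, here is a minimal sketch of one way output could be limited to English tokens: masking the logits of non-English tokens before picking the next token. The paper does not say how the restriction was implemented, so everything here is an assumption for illustration. The toy vocabulary, the helper names (`is_english_token`, `mask_non_english`, `greedy_pick`) and the "printable ASCII" definition of an English token are all hypothetical, and a real tokenizer (for example a byte-level BPE) would need its tokens decoded to text first.

```python
import math

def is_english_token(token: str) -> bool:
    """Hypothetical rule: a token counts as English if it is printable ASCII only."""
    return all(32 <= ord(ch) < 127 for ch in token)

def mask_non_english(logits, vocab):
    """Return a copy of the logits with every non-English token set to -inf."""
    return [
        score if is_english_token(tok) else -math.inf
        for score, tok in zip(logits, vocab)
    ]

def greedy_pick(logits, vocab):
    """Pick the highest-scoring token (greedy decoding, one step)."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return vocab[best]

# Toy vocabulary and scores, made up for this example.
vocab = ["Hello", "你好", " world", "世界"]
logits = [1.0, 3.0, 0.5, 2.0]

print(greedy_pick(logits, vocab))                           # unconstrained choice
print(greedy_pick(mask_non_english(logits, vocab), vocab))  # English-only choice
```

With these toy scores the unconstrained step picks the Chinese token, while the masked step falls back to the best English one, which is the kind of difference that could explain an English question being answered in Chinese in one setup and not in the other.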

UltraEval · 2024-12-25T08:24:38Z

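> Thanks. My guess is that this detail was not aligned on my side: For a fair comparison with the English-only baseline models, we restrict the output of GLM-4-Voice to English tokens when evaluating the tasks reported in Table 6.

> In my tests I did see quite a few cases where the question was asked in English and the answer came back in Chinese.

I noticed this too when reviewing cases. The restriction in the paper seems a bit unreasonable to me, though it is not a big deal.
The Knowledge result here cannot be reproduced either, because it uses a random sample of 100 items.
The UTMOS result for Llama-Omni does not match the paper. That may be down to the audio data used, but when I listened to Llama-Omni's audio the stiff pauses were quite obvious, so I don't know why it scores so high.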
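On the point about the Knowledge subset being 100 randomly sampled items: a result on a random subset can only be matched by someone who has the same subset. A minimal sketch of seeded sampling, assuming a hypothetical pool of 1000 questions and a hypothetical helper `sample_subset` (neither the pool size nor the seed comes from the paper):

```python
import random

def sample_subset(items, k, seed):
    """Draw k items without replacement using a private, seeded generator."""
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    return rng.sample(items, k)

# Hypothetical question pool; real evaluation data would be loaded from disk.
questions = [f"q{i}" for i in range(1000)]

run_a = sample_subset(questions, 100, seed=0)
run_b = sample_subset(questions, 100, seed=0)

print(run_a == run_b)  # same seed and same pool order give the same subset
```

A different seed, or a pool loaded in a different order, will generally yield a different subset, so publishing the sampled item IDs is the most direct way to make such a number reproducible.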

Sengxian assigned Sengxian and Btlmd and unassigned Sengxian Dec 25, 2024
