
whisper concurrent inference issue #650

Open
xqun3 opened this issue Sep 23, 2024 · 3 comments

Comments

@xqun3

xqun3 commented Sep 23, 2024

Hi @yuekaizhang, thanks for sharing the code, great work!

However, I ran into a problem when actually deploying it. After the model is deployed, issuing concurrent requests shows no batching effect at all; instead the inference time grows roughly in proportion to the concurrency. Is it because the implementation itself does not support Triton request batching? My batch-related configuration is as follows:

dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
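
For reference, a minimal client-side sketch of issuing overlapping requests so Triton's dynamic batcher has a chance to group them; the model name "whisper", the input name "WAV", and the shapes are assumptions and must match the deployed config.pbtxt:

    import numpy as np
    import tritonclient.http as httpclient

    MODEL_NAME = "whisper"  # assumed model name -- use the name in your model repository

    # "concurrency" controls how many HTTP connections back the async requests,
    # so several requests can be in flight within max_queue_delay_microseconds.
    client = httpclient.InferenceServerClient(url="localhost:8000", concurrency=8)

    def build_inputs(wav: np.ndarray):
        # Assumed single FP32 waveform input; shape includes the batch dimension.
        inp = httpclient.InferInput("WAV", list(wav.shape), "FP32")
        inp.set_data_from_numpy(wav)
        return [inp]

    # Fire several requests at once; dynamic batching only helps when requests overlap.
    wavs = [np.zeros((1, 16000 * 30), dtype=np.float32) for _ in range(8)]
    futures = [client.async_infer(MODEL_NAME, build_inputs(w)) for w in wavs]
    results = [f.get_result() for f in futures]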
@yuekaizhang
Collaborator

@xqun3 https://github.com/yuekaizhang/Triton-ASR-Client/blob/main/log/stats_summary.txt You can run the client from this project to debug, and use the file it generates (like the one above) to check how the batch sizes actually used during inference relate to your configuration.

Also, support for inflight batching will be added soon, which should improve throughput by more than 20% over the current code; stay tuned.
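
For reference, a small sketch of querying Triton's per-model statistics directly to see which batch sizes were actually executed; the model name "whisper" and the exact response field names are assumptions based on Triton's statistics extension:

    import tritonclient.http as httpclient

    client = httpclient.InferenceServerClient(url="localhost:8000")
    stats = client.get_inference_statistics(model_name="whisper")  # assumed model name

    # Each batch_stats entry reports how many executions ran at a given batch size;
    # if everything shows batch_size 1, requests are not being grouped.
    for model in stats.get("model_stats", []):
        for bs in model.get("batch_stats", []):
            print(bs.get("batch_size"), bs.get("compute_infer", {}).get("count"))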

@xqun3
Author

xqun3 commented Sep 30, 2024

@yuekaizhang Thanks for the reply. I also noticed recently that TensorRT-LLM has added support for this, but the tensorrtllm_backend repo has not been updated yet, and trying to deploy with the Python backend raises errors.

@yuekaizhang
Collaborator


@xqun3 Also check whether the audio clips sent by the client all have the same length. If not, they need to be uniformly padded to 30 seconds; otherwise they will not be grouped into the same batch.
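
For reference, a minimal client-side padding sketch, assuming 16 kHz mono float32 audio:

    import numpy as np

    SAMPLE_RATE = 16000      # Whisper expects 16 kHz audio
    TARGET_SECONDS = 30      # pad/truncate every clip to exactly 30 s

    def pad_to_30s(samples: np.ndarray) -> np.ndarray:
        """Zero-pad (or truncate) a mono waveform to exactly 30 seconds so that
        all requests share the same shape and can be grouped into one batch."""
        target_len = SAMPLE_RATE * TARGET_SECONDS
        if len(samples) >= target_len:
            return samples[:target_len]
        return np.pad(samples, (0, target_len - len(samples)))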

upskyy added a commit to upskyy/sherpa that referenced this issue Dec 13, 2024