
[feature request] GPU runtime support #59

Open
yuekaizhang opened this issue Mar 31, 2023 · 3 comments

@yuekaizhang

This project is seriously hardcore. Would you be interested in supporting a GPU runtime? Or adding support for Paraformer or k2 models inside FasterTransformer, similar to the WeNet examples below:
https://github.com/NVIDIA/FasterTransformer/tree/main/examples/cpp/wenet
https://github.com/NVIDIA/FasterTransformer/tree/main/src/fastertransformer/models/wenet

@chenkui164
Owner

Hi, would hand-written inference like that actually be faster than ONNX + TensorRT? Using ONNX with different execution providers feels like the more reasonable approach to me.
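
For reference, a minimal sketch of the "ONNX + different execution providers" idea being discussed, assuming onnxruntime-gpu is installed; the model file name and input shape are hypothetical placeholders, not artifacts of this repo:

```python
import numpy as np
import onnxruntime as ort

# The same exported ONNX model can target CPU, CUDA, or TensorRT
# just by changing the provider list; unavailable providers are skipped.
sess = ort.InferenceSession(
    "paraformer_encoder.onnx",          # hypothetical exported model
    providers=[
        "TensorrtExecutionProvider",    # tried first if the build supports it
        "CUDAExecutionProvider",        # plain CUDA fallback
        "CPUExecutionProvider",         # always available
    ],
)
print("active providers:", sess.get_providers())

# Dummy input; a real Paraformer feed (fbank features, lengths, ...) will differ.
feats = np.random.randn(1, 100, 80).astype(np.float32)
outputs = sess.run(None, {sess.get_inputs()[0].name: feats})
print(outputs[0].shape)
```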

@yuekaizhang
Author

Hi, would hand-written inference like that actually be faster than ONNX + TensorRT? Using ONNX with different execution providers feels like the more reasonable approach to me.

On CPU, I'm not sure whether this hand-written inference beats ONNX. On GPU, hand-written inference is the fastest, which is exactly what FasterTransformer is; using ONNX to support GPU inference is nowhere near as fast as hand-written kernels. As for ONNX + TensorRT, as long as the hand-written implementation has no major issues, the hand-written version is usually still faster. That's why projects like FasterTransformer exist.
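
To make the backend comparison concrete, a rough timing sketch (assumption: onnxruntime-gpu with CUDA/TensorRT support is installed, and the model/input shapes are the same hypothetical placeholders as above). Note this only compares ONNX Runtime providers against each other; a hand-written kernel baseline like FasterTransformer is not shown here:

```python
import time
import numpy as np
import onnxruntime as ort

def bench(providers, model="paraformer_encoder.onnx", iters=50):
    """Average per-iteration latency for one provider configuration."""
    sess = ort.InferenceSession(model, providers=providers)
    x = np.random.randn(1, 100, 80).astype(np.float32)
    feed = {sess.get_inputs()[0].name: x}
    sess.run(None, feed)            # warm-up (also triggers TensorRT engine build)
    t0 = time.perf_counter()
    for _ in range(iters):
        sess.run(None, feed)
    return (time.perf_counter() - t0) / iters

for p in (["CPUExecutionProvider"],
          ["CUDAExecutionProvider"],
          ["TensorrtExecutionProvider", "CUDAExecutionProvider"]):
    print(p[0], f"{bench(p) * 1000:.2f} ms/iter")
```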

@chenkui164
Owner

Got it, thanks. The Paraformer model indeed can't beat ONNX on CPU; pushing the optimization further gets into low-level CPU details, and I've hit a bit of a wall there. I'll take a look at WeNet's GPU implementation first and learn from it.
