This project is seriously hardcore. Would you be interested in supporting a GPU runtime? Or adding support for Paraformer or k2 models in FasterTransformer, similar to the WeNet support below? https://github.com/NVIDIA/FasterTransformer/tree/main/examples/cpp/wenet https://github.com/NVIDIA/FasterTransformer/tree/main/src/fastertransformer/models/wenet
Hello, would hand-written inference like this actually be faster than ONNX + TensorRT inference? ONNX with different execution providers seems like the more reasonable approach to me.
On CPU, I'm not sure whether this hand-written inference beats ONNX. On GPU, hand-written inference is the fastest option, which is exactly what FasterTransformer is; using ONNX for GPU inference is nowhere near as fast as hand-written kernels. As for ONNX + TensorRT: as long as the hand-written implementation has no major flaws, it is generally still faster. That is also why projects like FasterTransformer exist.
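A minimal illustration of one reason hand-written GPU kernels tend to beat a generic ONNX graph: operator fusion. This is a hypothetical NumPy sketch, not FasterTransformer code; it only shows the math being fused (matmul + bias + activation) and that the fused and unfused forms agree. On a real GPU the win comes from fewer kernel launches and no intermediate buffers written back to memory.

```python
import numpy as np

# Hypothetical sketch (assumption: not from FasterTransformer's codebase).
# A generic graph executor runs matmul, bias-add, and ReLU as three
# separate ops, materializing an intermediate tensor between each.
# A hand-written kernel fuses them into one pass over the data.

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8)).astype(np.float32)   # input batch
w = rng.standard_normal((8, 8)).astype(np.float32)   # weight matrix
b = rng.standard_normal(8).astype(np.float32)        # bias vector

# Unfused: three "kernels", two intermediate buffers.
t = x @ w
t = t + b
y_unfused = np.maximum(t, 0.0)

# Fused: logically one pass; a hand-written CUDA kernel would compute
# this per-element without round-tripping intermediates through memory.
y_fused = np.maximum(x @ w + b, 0.0)

assert np.allclose(y_unfused, y_fused)
```

The outputs are identical; the difference is purely in memory traffic and launch overhead, which is where hand-written runtimes recover their speed.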
Got it, thanks. The Paraformer model indeed can't beat ONNX on CPU; optimizing further gets into low-level CPU details, and I'm running out of room to optimize. I'll start by studying WeNet's GPU implementation and learn from it.