How does TransformerEngine interact with TensorRT/Onnx? #238
Unanswered · tylerweitzman asked this question in Q&A
Replies: 1 comment
Hi @tylerweitzman, we are evaluating support for exporting TE models to ONNX and importing them into TensorRT. Support for FP8 GPT export will be rolled out first.
Hi,
We often convert PyTorch models to other engines in order to optimize inference speed. Is it possible to convert a PyTorch model that uses TransformerEngine into TensorRT for faster inference? Or is it currently a choice between TransformerEngine and plain nn.Linear with TensorRT?