Support CO-DETR #594
Dear @marcoslucianops, thank you so much! I will try it soon.
Dear @marcoslucianops, I am trying to convert this model: https://download.openmmlab.com/mmdetection/v3.0/codetr/co_dino_5scale_swin_large_16e_o365tococo-614254c9.pth in this repo. I have successfully converted it from PyTorch to ONNX, but converting from ONNX to TRT failed.

Modification made:

    # args.opset=16 must be used
    print('Exporting the model to ONNX')
    with torch.no_grad():
        torch.onnx.export(
            model, onnx_input_im, onnx_output_file, verbose=True, opset_version=args.opset, do_constant_folding=True,
            input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None,
        )

There is no quantization option, just FP32. Have you ever faced the following error?

[12/05/2024-08:18:20] [TRT] [V] =============== Computing costs for {ForeignNode[/0/transformer/Flatten.../0/transformer/decoder/Slice]}
[12/05/2024-08:18:20] [TRT] [V] *************** Autotuning format combination: Bool(163840,512,1), Bool(40960,256,1), Bool(10240,128,1), Bool(2560,64,1), Bool(640,32,1), Float(41943040,1310720,1,1), Float(10485760,327680,1,1), Float(2621440,81920,1,1), Float(655360,20480,1,1), Float(163840,5120,1,1), Float(10485760,32768,64,1), Float(10485760,32768,64,1), Float(10485760,32768,64,1), Float(10485760,32768,64,1), Float(2621440,16384,64,1), Float(2621440,16384,64,1), Float(2621440,16384,64,1), Float(2621440,16384,64,1), Float(655360,8192,64,1), Float(655360,8192,64,1), Float(655360,8192,64,1), Float(655360,8192,64,1), Float(163840,4096,64,1), Float(163840,4096,64,1), Float(163840,4096,64,1), Float(163840,4096,64,1), Float(40960,2048,64,1), Float(40960,2048,64,1), Float(40960,2048,64,1), Float(40960,2048,64,1) -> Bool(218240,1,1), Float(20,20,4,1), Float(55869440,256,1), Float(3600,4,1), Float(18000,20,4,1), Float(57600,64,1), Float(57600,64,1), Float(57600,64,1), Float(57600,64,1), Float(57600,64,1), Float(57600,64,1), Float(57600,64,1), Float(57600,64,1) ***************
[12/05/2024-08:18:20] [TRT] [V] --------------- Timing Runner: {ForeignNode[/0/transformer/Flatten.../0/transformer/decoder/Slice]} (Myelin[0x80000023])
[12/05/2024-08:18:50] [TRT] [V] Skipping tactic 0 due to insufficient memory on requested size of 11938766208 detected for tactic 0x0000000000000000.
[12/05/2024-08:18:50] [TRT] [V] {ForeignNode[/0/transformer/Flatten.../0/transformer/decoder/Slice]} (Myelin[0x80000023]) profiling completed in 29.4449 seconds. Fastest Tactic: 0xd15ea5edd15ea5ed Time: inf
[12/05/2024-08:18:50] [TRT] [W] No valid obedient candidate choices for node {ForeignNode[/0/transformer/Flatten.../0/transformer/decoder/Slice]} that meet the preferred precision. The remaining candidate choices will be profiled.
[12/05/2024-08:18:50] [TRT] [V] Deleting timing cache: 142 entries, served 89 hits since creation.
[12/05/2024-08:18:50] [TRT] [E] 10: Could not find any implementation for node {ForeignNode[/0/transformer/Flatten.../0/transformer/decoder/Slice]}.
[12/05/2024-08:18:50] [TRT] [E] 10: [optimizer.cpp::computeCosts::3869] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[/0/transformer/Flatten.../0/transformer/decoder/Slice]}.)
Traceback (most recent call last):
  File "/models/onnx2trt.py", line 88, in <module>
    sys.exit(build_engine(args) or 0)
  File "/models/onnx2trt.py", line 66, in build_engine
    f.write(engine.serialize())
AttributeError: 'NoneType' object has no attribute 'serialize'
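
For anyone reading the log above: the final `AttributeError` is only a symptom. The builder returned `None` after the `[TRT] [E]` lines, and the earlier `[V]` line shows that the only tactic tried was skipped because it requested more memory than was available. Below is a minimal sketch that pulls that requested size out of a log and guards the serialize call. The helper names `requested_sizes_gib` and `serialize_or_raise` are hypothetical and are not part of `onnx2trt.py` or TensorRT.

```python
import re

# Matches the verbose TensorRT message quoted in the log above.
_SIZE_RE = re.compile(r"insufficient memory on requested size of (\d+)")


def requested_sizes_gib(log_text):
    """Return every requested tactic size found in the log text, in GiB."""
    return [int(size) / 2**30 for size in _SIZE_RE.findall(log_text)]


def serialize_or_raise(engine):
    """Hypothetical guard: fail with a readable message when the build returned None."""
    if engine is None:
        raise RuntimeError(
            "Engine build failed (builder returned None); "
            "check the [TRT] [E] lines in the build log."
        )
    return engine.serialize()


log_line = (
    "[12/05/2024-08:18:50] [TRT] [V] Skipping tactic 0 due to insufficient memory "
    "on requested size of 11938766208 detected for tactic 0x0000000000000000."
)
sizes = requested_sizes_gib(log_line)
for size in sizes:
    print(f"tactic requested about {size:.2f} GiB")
```

Run against the line quoted above, this reports a request of roughly 11 GiB for a single tactic, so checking how much memory the builder is allowed to use is a reasonable first step. Whether that alone resolves the failure is an assumption, not something the log confirms.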
Please follow the steps on the doc. You should use the |
Thank you. I have successfully converted and tested the TRT engine by following your guidelines. However, the speed has not improved compared to that of the PyTorch model. Do you have any idea what could cause the slow FPS of the generated TRT engine?
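
When comparing FPS between the PyTorch model and the TRT engine, it helps to time both the same way: a few warm-up runs first, then many timed iterations over the same input. A minimal sketch follows, assuming each backend is wrapped in a zero-argument callable that returns only after the result is ready (for GPU work that means synchronizing inside the callable). The name `measure_fps` and the dummy workload are hypothetical.

```python
import time


def measure_fps(infer, warmup=10, iters=100):
    """Call `infer` repeatedly and return the measured frames per second."""
    # Warm-up runs are excluded so one-time setup cost does not skew the result.
    for _ in range(warmup):
        infer()
    start = time.perf_counter()
    for _ in range(iters):
        infer()
    elapsed = time.perf_counter() - start
    return iters / elapsed


def dummy_infer():
    """Stand-in workload; replace with a call into the real model or engine."""
    return sum(range(10_000))


fps = measure_fps(dummy_infer, warmup=5, iters=50)
print(f"{fps:.1f} FPS")
```

Timing only the inference call, separately from pre-processing and post-processing, makes it easier to see whether the engine itself is slow or whether the time goes elsewhere in the pipeline.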
Dear @marcoslucianops,
Thanks so much for sharing your great work. I have been using your repo to obtain TRT models that can be used in DeepStream.
Could you consider supporting this model: https://github.com/open-mmlab/mmdetection/tree/main/projects/CO-DETR (https://github.com/Sense-X/Co-DETR)?
Here is a repo in which Co-DETR is converted to TRT: https://github.com/DataXujing/Co-DETR-TensorRT
I really appreciate your time.