[Bug] Problem converting a PyTorch model to ONNX using tools/deploy.py #2846

Open
1 of 3 tasks
weizhitgp opened this issue Nov 22, 2024 · 1 comment

Comments

@weizhitgp

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. I have read the FAQ documentation but cannot get the expected help.
  • 3. The bug has not been fixed in the latest version.

Describe the bug

mmsegmentation-main/mmdeploy$ python tools/deploy.py \
configs/mmseg/segmentation_onnxruntime_static-512x512.py \
../configs/mobilenet_v3/mobilenet-v3-d8_lraspp_4xb4-320k_skinseg-512x512.py \
../work_dirs/mobilenet-v3-d8_lraspp_4xb4-320k_skinseg-512x512-1122/iter_80000.pth \
demo/15kdata_20240501_44.jpg \
--work-dir mmdeploy_models/mmseg/ort-1122

11/22 15:16:18 - mmengine - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
11/22 15:16:19 - mmengine - WARNING - Failed to search registry with scope "mmseg" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmseg" is a correct scope, or whether the registry is initialized.
11/22 15:16:19 - mmengine - WARNING - Failed to search registry with scope "mmseg" in the "mmseg_tasks" registry tree. As a workaround, the current "mmseg_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmseg" is a correct scope, or whether the registry is initialized.
/home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/mmcv/cnn/bricks/hsigmoid.py:35: UserWarning: In MMCV v1.4.4, we modified the default value of args to align with PyTorch official. Previous Implementation: Hsigmoid(x) = min(max((x + 1) / 2, 0), 1). Current Implementation: Hsigmoid(x) = min(max((x + 3) / 6, 0), 1).
warnings.warn(
check in_channels (16, 24, 960)
/mmsegmentation-main/mmsegmentation/mmseg/models/decode_heads/decode_head.py:121: UserWarning: For binary segmentation, we suggest usingout_channels = 1 to define the outputchannels of segmentor, and use thresholdto convert seg_logits into a predictionapplying a threshold
warnings.warn('For binary segmentation, we suggest using'
/mmsegmentation-main/mmsegmentation/mmseg/models/losses/cross_entropy_loss.py:250: UserWarning: Default avg_non_ignore is False, if you would like to ignore the certain label and average loss over non-ignore labels, which is the same with PyTorch official cross_entropy, set avg_non_ignore=True.
warnings.warn(
Loads checkpoint by local backend from path: ../work_dirs/mobilenet-v3-d8_lraspp_4xb4-320k_skinseg-512x512-1122/iter_80000.pth
11/22 15:16:20 - mmengine - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future.
11/22 15:16:20 - mmengine - INFO - Export PyTorch model to ONNX: mmdeploy_models/mmseg/ort-1122/end2end.onnx.
11/22 15:16:20 - mmengine - WARNING - Can not find torch.nn.functional.scaled_dot_product_attention, function rewrite will not be applied
/home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/mmdeploy/core/optimizers/function_marker.py:160: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
ys_shape = tuple(int(s) for s in ys.shape)
/home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/mmdeploy/codebase/mmseg/models/segmentors/base.py:47: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
img_shape = [int(val) for val in img_shape]
/home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/mmcv/cnn/bricks/conv2d_adaptive_padding.py:50: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
output_h = math.ceil(img_h / stride_h)
/home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/mmcv/cnn/bricks/conv2d_adaptive_padding.py:51: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
output_w = math.ceil(img_w / stride_w)
/home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/mmcv/cnn/bricks/conv2d_adaptive_padding.py:53: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
max((output_h - 1) * self.stride[0] +
/home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/mmcv/cnn/bricks/conv2d_adaptive_padding.py:56: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
max((output_w - 1) * self.stride[1] +
/home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/mmcv/cnn/bricks/conv2d_adaptive_padding.py:58: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if pad_h > 0 or pad_w > 0:
/home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/mmdeploy/codebase/mmseg/models/segmentors/base.py:61: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if seg_logit.shape[1] == 1:
/home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/torch/onnx/utils.py:687: UserWarning: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (Triggered internally at ../torch/csrc/jit/passes/onnx/shape_type_inference.cpp:1884.)
_C._jit_pass_onnx_graph_shape_type_inference(
/home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/torch/onnx/utils.py:687: UserWarning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied. (Triggered internally at ../torch/csrc/jit/passes/onnx/constant_fold.cpp:179.)
_C._jit_pass_onnx_graph_shape_type_inference(
/home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/torch/onnx/utils.py:1178: UserWarning: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (Triggered internally at ../torch/csrc/jit/passes/onnx/shape_type_inference.cpp:1884.)
_C._jit_pass_onnx_graph_shape_type_inference(
/home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/torch/onnx/utils.py:1178: UserWarning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied. (Triggered internally at ../torch/csrc/jit/passes/onnx/constant_fold.cpp:179.)
_C._jit_pass_onnx_graph_shape_type_inference(
11/22 15:16:25 - mmengine - INFO - Execute onnx optimize passes.
11/22 15:16:25 - mmengine - WARNING - Can not optimize model, please build torchscipt extension.
More details: mmdeploy/docs/en/experimental/onnx_optimizer.md at main · open-mmlab/mmdeploy
11/22 15:16:25 - mmengine - INFO - Finish pipeline mmdeploy.apis.pytorch2onnx.torch2onnx
11/22 15:16:26 - mmengine - INFO - Start pipeline mmdeploy.apis.utils.utils.to_backend in main process
11/22 15:16:26 - mmengine - INFO - Finish pipeline mmdeploy.apis.utils.utils.to_backend
11/22 15:16:26 - mmengine - INFO - visualize onnxruntime model start.
11/22 15:16:30 - mmengine - WARNING - Failed to search registry with scope "mmseg" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmseg" is a correct scope, or whether the registry is initialized.
11/22 15:16:30 - mmengine - WARNING - Failed to search registry with scope "mmseg" in the "mmseg_tasks" registry tree. As a workaround, the current "mmseg_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmseg" is a correct scope, or whether the registry is initialized.
11/22 15:16:30 - mmengine - WARNING - Failed to search registry with scope "mmseg" in the "backend_segmentors" registry tree. As a workaround, the current "backend_segmentors" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmseg" is a correct scope, or whether the registry is initialized.
11/22 15:16:30 - mmengine - INFO - Successfully loaded onnxruntime custom ops from /home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/mmdeploy/lib/libmmdeploy_onnxruntime_ops.so
11/22 15:17:05 - mmengine - INFO - visualize onnxruntime model success.
11/22 15:17:05 - mmengine - INFO - visualize pytorch model start.
11/22 15:17:08 - mmengine - WARNING - Failed to search registry with scope "mmseg" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmseg" is a correct scope, or whether the registry is initialized.
11/22 15:17:08 - mmengine - WARNING - Failed to search registry with scope "mmseg" in the "mmseg_tasks" registry tree. As a workaround, the current "mmseg_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmseg" is a correct scope, or whether the registry is initialized.
/home/itdtirtx/anaconda3/envs/tgp/lib/python3.8/site-packages/mmcv/cnn/bricks/hsigmoid.py:35: UserWarning: In MMCV v1.4.4, we modified the default value of args to align with PyTorch official. Previous Implementation: Hsigmoid(x) = min(max((x + 1) / 2, 0), 1). Current Implementation: Hsigmoid(x) = min(max((x + 3) / 6, 0), 1).
warnings.warn(
check in_channels (16, 24, 960)
/mmsegmentation-main/mmsegmentation/mmseg/models/decode_heads/decode_head.py:121: UserWarning: For binary segmentation, we suggest usingout_channels = 1 to define the outputchannels of segmentor, and use thresholdto convert seg_logits into a predictionapplying a threshold
warnings.warn('For binary segmentation, we suggest using'
/mmsegmentation-main/mmsegmentation/mmseg/models/losses/cross_entropy_loss.py:250: UserWarning: Default avg_non_ignore is False, if you would like to ignore the certain label and average loss over non-ignore labels, which is the same with PyTorch official cross_entropy, set avg_non_ignore=True.
warnings.warn(
Loads checkpoint by local backend from path: ../work_dirs/mobilenet-v3-d8_lraspp_4xb4-320k_skinseg-512x512-1122/iter_80000.pth
11/22 15:17:44 - mmengine - INFO - visualize pytorch model success.
11/22 15:17:44 - mmengine - INFO - All process success.
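
The log above is long and mixes mmengine messages with Python warnings, so tallying the message kinds makes it easier to see whether anything beyond warnings was reported. A minimal sketch, assuming the console output has been saved as plain text; `summarize_log` and the two regular expressions are hypothetical helpers written for this thread, not part of mmdeploy or mmengine:

```python
import re
from collections import Counter

# Matches mmengine-style lines such as
# "11/22 15:16:25 - mmengine - INFO - Execute onnx optimize passes."
MMENGINE_LINE = re.compile(
    r"^\d{2}/\d{2} \d{2}:\d{2}:\d{2} - mmengine - (?P<level>[A-Z]+) -\s?(?P<msg>.*)$"
)
# Matches Python warning lines such as
# "/path/to/function_marker.py:160: TracerWarning: Converting a tensor ..."
PY_WARNING_LINE = re.compile(r"^\S+\.py:\d+: (?P<kind>\w+Warning): ")


def summarize_log(text):
    """Count mmengine log levels and Python warning categories in a log."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        match = MMENGINE_LINE.match(line)
        if match:
            counts["mmengine:" + match.group("level")] += 1
            continue
        match = PY_WARNING_LINE.match(line)
        if match:
            counts["python:" + match.group("kind")] += 1
    return counts


if __name__ == "__main__":
    # Two lines copied from the log above; a real run would read the whole file.
    sample = "\n".join([
        "11/22 15:16:25 - mmengine - INFO - Execute onnx optimize passes.",
        "11/22 15:16:20 - mmengine - WARNING - DeprecationWarning: "
        "get_onnx_config will be deprecated in the future.",
    ])
    for key, count in sorted(summarize_log(sample).items()):
        print(key, count)
```

If the tally shows only INFO and WARNING entries and no ERROR entries, the conversion itself probably finished, and the report would need to state which specific behavior is unexpected.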

Reproduction

python tools/deploy.py \
configs/mmseg/segmentation_onnxruntime_static-512x512.py \
../configs/mobilenet_v3/mobilenet-v3-d8_lraspp_4xb4-320k_skinseg-512x512.py \
../work_dirs/mobilenet-v3-d8_lraspp_4xb4-320k_skinseg-512x512-1122/iter_80000.pth \
demo/15kdata_20240501_44.jpg \
--work-dir mmdeploy_models/mmseg/ort-1122
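
Because the command spans several lines, it is easy to lose an argument when copying it. One way to avoid shell line-continuation problems is to assemble the arguments as a list in Python. This is only a sketch: `build_deploy_argv` is a hypothetical helper, the argument order simply mirrors the command in this report, and the script is assumed to be launched from the mmdeploy directory.

```python
import shlex


def build_deploy_argv(deploy_cfg, model_cfg, checkpoint, image, work_dir):
    """Return the argument list for tools/deploy.py in the order used above."""
    return [
        "python", "tools/deploy.py",
        deploy_cfg,   # mmdeploy backend config
        model_cfg,    # mmsegmentation model config
        checkpoint,   # trained weights (.pth)
        image,        # sample image used during conversion
        "--work-dir", work_dir,
    ]


argv = build_deploy_argv(
    "configs/mmseg/segmentation_onnxruntime_static-512x512.py",
    "../configs/mobilenet_v3/mobilenet-v3-d8_lraspp_4xb4-320k_skinseg-512x512.py",
    "../work_dirs/mobilenet-v3-d8_lraspp_4xb4-320k_skinseg-512x512-1122/iter_80000.pth",
    "demo/15kdata_20240501_44.jpg",
    "mmdeploy_models/mmseg/ort-1122",
)

# Print a single-line, shell-safe version of the command for copy and paste.
print(shlex.join(argv))

# To run it instead of printing it, something like the following could be used:
#   import subprocess
#   subprocess.run(argv, check=True)
```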

Environment

11/22 16:29:57 - mmengine - INFO - **********Environmental information**********
11/22 16:29:58 - mmengine - INFO - sys.platform: linux
11/22 16:29:58 - mmengine - INFO - Python: 3.8.19 (default, Mar 20 2024, 19:58:24) [GCC 11.2.0]
11/22 16:29:58 - mmengine - INFO - CUDA available: True
11/22 16:29:58 - mmengine - INFO - MUSA available: False
11/22 16:29:58 - mmengine - INFO - numpy_random_seed: 2147483648
11/22 16:29:58 - mmengine - INFO - GPU 0,1,2,3,4,5,6,7: NVIDIA GeForce RTX 3090
11/22 16:29:58 - mmengine - INFO - CUDA_HOME: /usr/local/cuda-11.2
11/22 16:29:58 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.2, V11.2.67
11/22 16:29:58 - mmengine - INFO - GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
11/22 16:29:58 - mmengine - INFO - PyTorch: 1.13.1+cu117
11/22 16:29:58 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.7
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.4  (built against CUDA 11.6)
    - Built with CuDNN 8.5
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

11/22 16:29:58 - mmengine - INFO - TorchVision: 0.14.1+cu117
11/22 16:29:58 - mmengine - INFO - OpenCV: 4.10.0
11/22 16:29:58 - mmengine - INFO - MMEngine: 0.10.4
11/22 16:29:58 - mmengine - INFO - MMCV: 2.0.0rc4
11/22 16:29:58 - mmengine - INFO - MMCV Compiler: GCC 9.3
11/22 16:29:58 - mmengine - INFO - MMCV CUDA Compiler: 11.7
11/22 16:29:58 - mmengine - INFO - MMDeploy: 1.3.1+bc75c9d
11/22 16:29:58 - mmengine - INFO -

11/22 16:29:58 - mmengine - INFO - **********Backend information**********
11/22 16:29:58 - mmengine - INFO - tensorrt:    None
11/22 16:29:58 - mmengine - INFO - ONNXRuntime: 1.15.1
11/22 16:29:58 - mmengine - INFO - ONNXRuntime-gpu:     None
11/22 16:29:58 - mmengine - INFO - ONNXRuntime custom ops:      Available
11/22 16:29:58 - mmengine - INFO - pplnn:       None
11/22 16:29:58 - mmengine - INFO - ncnn:        None
11/22 16:29:58 - mmengine - INFO - snpe:        None
11/22 16:29:58 - mmengine - INFO - openvino:    None
11/22 16:29:58 - mmengine - INFO - torchscript: 1.13.1+cu117
11/22 16:29:58 - mmengine - INFO - torchscript custom ops:      NotAvailable
11/22 16:29:58 - mmengine - INFO - rknn-toolkit:        None
11/22 16:29:58 - mmengine - INFO - rknn-toolkit2:       None
11/22 16:29:58 - mmengine - INFO - ascend:      None
11/22 16:29:58 - mmengine - INFO - coreml:      None
11/22 16:29:58 - mmengine - INFO - tvm: None
11/22 16:29:58 - mmengine - INFO - vacc:        None
11/22 16:29:58 - mmengine - INFO -

11/22 16:29:58 - mmengine - INFO - **********Codebase information**********
11/22 16:29:58 - mmengine - INFO - mmdet:       None
11/22 16:29:58 - mmengine - INFO - mmseg:       1.2.2
11/22 16:29:58 - mmengine - INFO - mmpretrain:  None
11/22 16:29:58 - mmengine - INFO - mmocr:       None
11/22 16:29:58 - mmengine - INFO - mmagic:      None
11/22 16:29:58 - mmengine - INFO - mmdet3d:     None
11/22 16:29:58 - mmengine - INFO - mmpose:      None
11/22 16:29:58 - mmengine - INFO - mmrotate:    None
11/22 16:29:58 - mmengine - INFO - mmaction:    None
11/22 16:29:58 - mmengine - INFO - mmrazor:     None
11/22 16:29:58 - mmengine - INFO - mmyolo:      None

Error traceback

No response

@xzh929

xzh929 commented Nov 22, 2024

Maybe you should add "from mmdeploy.apis.tensorrt import onnx2tensorrt" to tools/deploy.py.
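
Before adding that import, it may be worth checking which modules can be resolved in the environment, since the backend information above lists tensorrt as None. A minimal sketch using only the standard library; `module_available` is a hypothetical helper, and whether the suggested import changes anything for this ONNX Runtime conversion is not confirmed in this thread.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located, without importing the module itself.

    Note: for a dotted name, find_spec imports the parent packages first.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or the name is malformed.
        return False


if __name__ == "__main__":
    for name in ("tensorrt", "mmdeploy.apis.tensorrt", "onnxruntime"):
        print(name, "available" if module_available(name) else "missing")
```

If `tensorrt` is reported as missing, a TensorRT-specific import is unlikely to matter for a conversion that targets ONNX Runtime, though that remains an assumption until it is verified on the reporter's machine.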
