How to generate a context model dump from ONNX Runtime? (C++) #23153

Open
MickaMickaMicka opened this issue Dec 19, 2024 · 0 comments
Labels
platform:jetson issues related to the NVIDIA Jetson platform

Comments

MickaMickaMicka commented Dec 19, 2024

Describe the issue

Situation:
I am loading an ONNX model (YOLOv5) with the TensorRT execution provider, which takes 4 minutes on a Jetson Orin. I successfully sped this up by caching the TensorRT engine:

OrtTensorRTProviderOptions trt_options{};
trt_options.device_id = 0;
trt_options.trt_max_workspace_size = 2147483648;
//trt_options.trt_max_partition_iterations = 10;
trt_options.trt_min_subgraph_size = 1;
trt_options.trt_fp16_enable = 0;
trt_options.trt_int8_enable = 0;
//trt_options.trt_int8_use_native_calibration_table = 1;
trt_options.trt_engine_cache_enable = 1;
//trt_options.trt_dump_ep_context_model = 1; // desired, but not available in trt_options, only in the V2 options
trt_options.trt_engine_cache_path = "./cache";
//trt_options.trt_dump_subgraphs = 1;
session_options.AppendExecutionProvider_TensorRT(trt_options); // add TRT options to the session options

But I am not certain about model security when saving the engine (we currently load the model from RAM, so no files are exposed to a user who has access to the system). Is the TRT engine secure, or could anyone run inference from the engine file alone? In particular: are the model weights inside the engine, or is the engine just a kind of "metadata" that only works in combination with the model file itself (in ONNX, in native TensorRT, and in any hypothetical custom inference engine)?

That's why I would like to embed the engine into an ONNX file (EP context model) and load that model from RAM as before, as sketched below.
If I understand correctly, that should be possible?
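
For context, this is roughly how we load the model from RAM today; a minimal sketch only (the helper function and names are mine, not our production code):

#include <onnxruntime_cxx_api.h>
#include <vector>

// Ort::Session has an overload that takes a raw byte buffer and its length,
// so the model never needs to exist as a file on disk.
Ort::Session CreateSessionFromBuffer(Ort::Env& env,
                                     const std::vector<char>& model_bytes,
                                     Ort::SessionOptions& session_options) {
	return Ort::Session(env, model_bytes.data(), model_bytes.size(), session_options);
}

My assumption is that the same overload would accept a dumped EP context model, which is why embedding the engine into the ONNX file is attractive for us.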

For that, I append another provider in addition to the OrtTensorRTProviderOptions trt_options:

std::vector<const char*> option_keys2 = {
	"trt_engine_cache_enable"
	,"trt_dump_ep_context_model"
	,"trt_ep_context_file_path"
	,"ep_context_enable"
	,"ep_context_file_path"
	,"trt_ep_context_embed_mode"
	,"trt_engine_cache_path"
	//,"trt_timing_cache_enable"
	//,"trt_timing_cache_path"
};
std::vector<const char*> option_values2 = {
	"1"
	,"1"
	,"/path1" // sub-path, according to https://app.semanticdiff.com/gh/microsoft/onnxruntime/pull/19154/overview
	,"1"
	,"/path2" // base path, according to https://app.semanticdiff.com/gh/microsoft/onnxruntime/pull/19154/overview
	,"1"
	,"/path3"
	//,"1"
	//,"/path4"
};

OrtTensorRTProviderOptionsV2* tensorrt2_options = nullptr;
Ort::ThrowOnError(api.CreateTensorRTProviderOptions(&tensorrt2_options));
Ort::ThrowOnError(api.UpdateTensorRTProviderOptions(tensorrt2_options, option_keys2.data(), option_values2.data(), option_keys2.size()));
session_options.AppendExecutionProvider_TensorRT_V2(*tensorrt2_options); // add the V2 options to the session options

However, with this I get:

[ONNXRuntimeError] : 1 : FAIL : provider_options_utils.h:146 Parse Unknown provider option: "trt_ep_context_embed_mode".

How do I do this correctly?
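
For reference, here is what I would try next; a sketch only, assuming the generic session-level EP context keys from onnxruntime_session_options_config_keys.h ("ep.context_enable", "ep.context_file_path", "ep.context_embed_mode") are the intended interface introduced around PR #19154, and that the unknown-option error above just means my installed build predates the trt_-prefixed keys:

// Assumption: session-level EP context config entries (key names taken from
// onnxruntime_session_options_config_keys.h), not verified on my Jetson build.
session_options.AddConfigEntry("ep.context_enable", "1");         // kOrtSessionOptionEpContextEnable
session_options.AddConfigEntry("ep.context_file_path", "/path2"); // kOrtSessionOptionEpContextFilePath
session_options.AddConfigEntry("ep.context_embed_mode", "1");     // kOrtSessionOptionEpContextEmbedMode; 1 = embed the engine into the ONNX file

Is this the correct replacement for the trt_* keys above?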

To reproduce

Urgency

No response

Platform

Other / Unknown

OS Version

Jetson Orin Linux

ONNX Runtime Installation

Other / Unknown

ONNX Runtime Version or Commit ID

11.4

ONNX Runtime API

C++

Architecture

X64

Execution Provider

TensorRT

Execution Provider Library Version

No response

@github-actions github-actions bot added the platform:jetson issues related to the NVIDIA Jetson platform label Dec 19, 2024