[Feature Request] Expose an API to detect compiled model #26687

@vlejeune-dxo

Describe the feature request

The documentation for the EP context cache (https://onnxruntime.ai/docs/execution-providers/EP-Context-Design.html) states:

> The EP or its backend SDK should be capable of detecting common failure scenarios (including but not limited to the following). In such cases, the EP should return a status with the INVALID_GRAPH status code:
>
>   • Detect mismatches between the driver version and the version required by the EP context binary; return an error if they are incompatible.
>   • Detect mismatches between the runtime SDK version and the version used to generate the EP context binary; return an error if they are incompatible.

However, this means a session has to be created just to detect such a mismatch. Creating a session involves allocating resources and, for some EPs, probably some JIT compilation, which takes time.

We would like a separate API that checks for a mismatch without actually creating a session. This could be part of the compilation API.
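To make the request concrete, here is a rough sketch of the shape such a check could take. Everything in it is hypothetical: `Compatibility`, `BuildInfo`, and `check_compiled_model` are illustrative names, not existing ONNX Runtime APIs, and the exact-equality comparison is an assumption (a real EP might accept a range of versions).

```python
from dataclasses import dataclass
from enum import Enum


class Compatibility(Enum):
    """Possible outcomes of the hypothetical check."""
    COMPATIBLE = "compatible"
    NEEDS_RECOMPILE = "needs_recompile"


@dataclass(frozen=True)
class BuildInfo:
    """Versions tied to an EP context binary (hypothetical structure)."""
    driver_version: str
    sdk_version: str


def check_compiled_model(recorded: BuildInfo, current: BuildInfo) -> Compatibility:
    """Compare the versions recorded at compile time with the running system.

    No session is created and no device resources are allocated.
    Assumption: versions must match exactly to be considered compatible.
    """
    if recorded.driver_version != current.driver_version:
        return Compatibility.NEEDS_RECOMPILE
    if recorded.sdk_version != current.sdk_version:
        return Compatibility.NEEDS_RECOMPILE
    return Compatibility.COMPATIBLE
```

The point of the sketch is only that the answer depends on version metadata, so in principle it should not require the cost of building a full session.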

Describe scenario use case

We currently use the compilation API. We sometimes load models from disk and sometimes from a buffer (when we need to hide model weights).

Our use case is the following:

  • At application startup, we detect whether each model needs to be compiled and, if it does, we compile it. The goal is to avoid compilation while the application is in use.
  • We currently check whether a model needs to be compiled by creating a session, listening for an ORT error, and deleting the session afterward (otherwise memory fills up too quickly).
  • The issue is that creating sessions for all our models takes about a dozen seconds, even when there is nothing to compile (which is the nominal case).
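For reference, the startup probe described above looks roughly like the following. This is a simplified sketch, not our actual code: `find_models_to_compile`, `create_session`, and `is_mismatch` are illustrative names, and the session factory is injected so the example does not depend on a particular EP.

```python
import time
from typing import Any, Callable, Dict, List, Tuple, Union

ModelSource = Union[str, bytes]  # path on disk, or in-memory model bytes


def find_models_to_compile(
    models: Dict[str, ModelSource],
    create_session: Callable[[ModelSource], Any],
    is_mismatch: Callable[[Exception], bool],
) -> Tuple[List[str], float]:
    """Return the names of models whose session creation failed with a
    mismatch error, plus the total seconds spent probing."""
    stale: List[str] = []
    start = time.perf_counter()
    for name, source in models.items():
        try:
            session = create_session(source)
        except Exception as exc:
            if not is_mismatch(exc):
                raise  # unrelated failure: do not hide it
            stale.append(name)
        else:
            # Drop the session right away so sessions do not pile up in memory.
            del session
    return stale, time.perf_counter() - start
```

With the ONNX Runtime Python bindings, `create_session` could be something like `lambda m: onnxruntime.InferenceSession(m, providers=[...])`, which accepts either a path or model bytes. How the INVALID_GRAPH status surfaces as a Python exception is an assumption to verify against the installed version, which is why the predicate is passed in. The elapsed time returned here is the cost we would like to avoid in the nominal case.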
