
Conversation

@gedoensmax
Contributor

This PR adds a test for CiG inference to demonstrate what usage of it should look like.
It is important not to call cudaSetDevice in that flow, since it will create a new context. @nieubank I am not sure why there was a cudaSetDevice on each import call 🤔 Is this done to enable importing semaphores of e.g. GPU:1 into a session running on GPU:0? Context management is unreliable with the current mixing of the CUDA runtime and CUDA driver APIs.
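The context-safe pattern under discussion can be sketched as follows (illustrative only, not the actual provider code; `device_id` is a placeholder): query whether a driver context is already current on the thread, and only fall back to cudaSetDevice when there is none, so an app-created CiG context is not displaced.

```cpp
#include <cuda.h>          // CUDA driver API
#include <cuda_runtime.h>  // CUDA runtime API

// Sketch: reuse the caller's context when one exists; let the
// runtime create/select a context only as a fallback.
void EnsureContext(int device_id) {
  CUcontext current = nullptr;
  cuCtxGetCurrent(&current);  // cheap query, does not create a context
  if (current == nullptr) {
    // No context bound to this thread: let the runtime set one up.
    // In a CiG flow this branch must NOT be taken, because
    // cudaSetDevice implicitly initializes the primary context.
    cudaSetDevice(device_id);
  }
  // else: keep the caller's (possibly CiG) context as-is.
}
```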

@nieubank
Contributor

nieubank commented Jan 8, 2026

> This PR adds a test for CiG inference to demonstrate what usage of it should look like. It is important not to call cudaSetDevice in that flow, since it will create a new context. @nieubank I am not sure why there was a cudaSetDevice on each import call 🤔 Is this done to enable importing semaphores of e.g. GPU:1 into a session running on GPU:0? Context management is unreliable with the current mixing of the CUDA runtime and CUDA driver APIs.

Awesome, thanks for this! You're seeing my inexperience with the CUDA API here :), I have another branch I was working on to fix some of the context stuff, but I figure this implementation will be a longer-term collaboration/hand-off at some point. Just wanted to validate the API with some real code.

@gedoensmax
Contributor Author

Yes sure, we (or in other words @praneshgo) will probably take it over. I made these changes for the exact same reason, to experiment with it. And I already identified a TRT RTX optimization opportunity that we will fix internally.

By the way, to better test correct async behaviour, it might be better to submit multiple inferences and wait only on the last result, to ensure that we are not effectively synchronous due to CPU overhead.

Contributor

Copilot AI left a comment


Pull request overview

This PR adds a comprehensive test for CUDA in Graphics (CiG) inference to demonstrate proper usage patterns when working with external D3D12 resources. The key change is modifying context management to avoid calling cudaSetDevice when a CUDA context already exists, which prevents creating unwanted new contexts during CiG workflows.

Changes:

  • Added FullInferenceWithExternalMemoryCIG test demonstrating CIG context usage with external memory import
  • Modified context management in nv_provider_factory.cc to check for existing CUDA contexts before calling cudaSetDevice
  • Migrated test API calls from ort_api_ to ort_interop_api_ for external resource operations

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

Changed files:

  • onnxruntime/test/providers/nv_tensorrt_rtx/nv_external_resource_importer_test.cc: Added CudaDriverLoader helper class, renamed the test fixture to NvExecutionProviderExternalResourceImporterTest, migrated to the interop API, and added a comprehensive CIG inference test
  • onnxruntime/core/providers/nv_tensorrt_rtx/nv_provider_factory.cc: Modified ImportMemory, ImportSemaphore, and CreateSyncStreamForDevice to check for an existing CUDA context before calling cudaSetDevice


@praneshgo
Contributor

Functionally, the change looks good to me.

@praneshgo
Contributor

@gaugarg-nv @ankan-ban can you please review this as well? Thanks.

@ankan-ban
Contributor

ankan-ban commented Jan 15, 2026

Thanks, Max, for writing the test. It looks good. It's nice to see most of the interop functionality nicely abstracted out behind the new ORT interop APIs.

There are just a couple of things that are still Nvidia specific in the test (maybe consider abstracting out these too in the future with more additions to ORT APIs):

  1. Use of CUDA APIs by the app to create the context before invoking ORT (this requires a command queue handle that shares the same TSG with the CUDA context; the same TSG is a requirement to enable CiG on our hardware).

  2. Use of nv-specific session options (kUserComputeStream, kHasUserComputeStream, kMaxSharedMemSize). Maybe having a mechanism for passing the generic ORT-stream object to session.run() makes even more sense. The shared memory size limit is again required for running in CiG mode - and hopefully if we have a generic way of doing "1" above - it can allow the EP to automatically set the correct value depending on the GPU.

I think resolving the above 2 would allow app developers to write truly IHV-agnostic code that runs everywhere (e.g., using DX12 APIs for allocating resources and doing any pre/post-processing, and generic ORT APIs to run the model).
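For reference, point 1 currently looks roughly like the following on NVIDIA hardware. This is a sketch assuming a recent CUDA toolkit that exposes the CiG context-creation path; `d3d12_queue` and `device` are placeholders, error handling is elided, and the exact struct fields should be verified against the driver API headers.

```cpp
// Sketch: create a CUDA-in-Graphics context that shares scheduling
// (the same TSG) with an existing D3D12 command queue.
CUctxCigParam cig_param = {};
cig_param.sharedDataType = CU_CIG_DATA_TYPE_D3D12_COMMAND_QUEUE;
cig_param.sharedData = d3d12_queue;  // ID3D12CommandQueue* from the app

CUctxCreateParams ctx_params = {};
ctx_params.cigParams = &cig_param;

CUcontext cig_ctx = nullptr;
cuCtxCreate_v4(&cig_ctx, &ctx_params, /*flags=*/0, device);
// The app then makes cig_ctx current before creating the ORT session,
// so the session reuses it instead of creating its own context.
```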

@gedoensmax
Contributor Author

Thanks @ankan-ban for the review.

  1. Do you consider this a big blocker? I thought of this small driver API usage as OK for an ISV to integrate, let me know if you think differently.

  2. Fully agree! I would love to set kMaxSharedMemSize implicitly, but I did not find a way to check the maximum supported shared memory size based on the currently pushed context. Having a stream as input on Ort::Session::Run would be great, and @skottmckay has mentioned that on another thread; can you let us know what the status of this is?
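On deriving a default for kMaxSharedMemSize: the device behind the currently pushed context can be queried through the driver API, e.g. as sketched below. Note this reports the raw device capability, which may still exceed what CiG mode actually permits, which is presumably why the explicit option exists; treat it as an upper bound, not the CiG limit.

```cpp
// Sketch: derive a shared-memory limit from the current context.
CUdevice dev;
cuCtxGetDevice(&dev);  // device of the currently pushed context
int smem_per_block = 0;
cuDeviceGetAttribute(&smem_per_block,
                     CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK_OPTIN,
                     dev);
// smem_per_block could seed a default for kMaxSharedMemSize,
// though CiG mode may impose a lower limit than the raw device cap.
```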

@nieubank can you help tag this for 1.24, since we would like to make sure this goes in with the newly added interop API? I can take care of rebasing to main and accepting some of the Copilot comments if that's all that is needed.

@nieubank nieubank added this to the 1.24.0 milestone Jan 15, 2026
@gedoensmax gedoensmax force-pushed the maximilianm/nv_ext_importer branch from ca44c43 to 79d020e Compare January 16, 2026 23:33
@gedoensmax gedoensmax changed the base branch from nieubank/nv_ext_importer to main January 16, 2026 23:33
@gedoensmax gedoensmax changed the title Add test for CIG inference [TRT RTX EP] Add support for D3D12 external resourrce import Jan 16, 2026
@gedoensmax gedoensmax requested a review from Copilot January 16, 2026 23:34
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.



@gedoensmax gedoensmax changed the title [TRT RTX EP] Add support for D3D12 external resourrce import [TRT RTX EP] Add support for D3D12 external resource import Jan 16, 2026
@skottmckay
Contributor

#26988 added the stream in RunOptions and is now merged.

@ankan-ban
Contributor

> Do you consider this a big blocker? I thought of this small driver API usage as OK for an ISV to integrate, let me know if you think differently.

Agree that it's not a big blocker, but after #26988 it seems to be the only NV-specific thing that the app needs to do. Agree that it's likely not too much for ISVs to integrate.

@gedoensmax gedoensmax force-pushed the maximilianm/nv_ext_importer branch from 3d012ef to 468eff1 Compare January 19, 2026 15:02
@gedoensmax
Contributor Author

I rebased on the refined structs and started providing the stream as a run option. Changes to support this for CiG are still missing, but we are tracking that internally.

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 10 comments.



@yuslepukhin
Member

@gedoensmax Please review, address, comment on, and resolve the Copilot comments. They are often useful.

@gedoensmax
Contributor Author

@yuslepukhin all the Copilot comments I resolved were already implemented. I had missed the one on raw tensor size handling and made the corresponding change.

nieubank
nieubank previously approved these changes Jan 26, 2026
@yuslepukhin
Member

It looks good. You will need to resolve conflicts. All the review comments will also need to be marked as resolved.

…porter_cig

# Conflicts:
#	onnxruntime/core/providers/nv_tensorrt_rtx/nv_provider_factory.cc
@chilo-ms
Contributor

/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).


7 participants