Proposal for CUDA upgrade #5300
amadeuszsz
started this conversation in Design
Introduction
Currently Autoware supports strictly defined versions of CUDA-related libraries. These versions have become relatively old and block future development. One example is the ongoing development of 3D LiDAR semantic segmentation for Autoware, which requires TensorRT 10.0+. Another reason is that TensorRT support for Ubuntu 24.04 was introduced in 10.4.0, and we might need to align with the roadmap for the ROS 2 Jazzy Autoware update.
Proposal
We currently support:
CUDA=12.3
CUDNN=8.9.5.29-1+cuda12.2
TensorRT=8.6.1.6-1+cuda12.0
The version upgrade could be approached in one of two ways.
Option 1 - JetPack support
This week (at the time of writing this proposal) JetPack 6.1 was released, which brings upgrades for the CUDA-related libraries. Fortunately, the library versions meet our requirements, and from an OSS community perspective it would be nice if Autoware aligned with the JetPack release.
In addition, looking at the last update, there was a valuable comment regarding the choice of specific library versions, reflecting the experience of edge device users.
Finally, it gives us:
CUDA=12.6
CUDNN=9.3.0.75-1+cuda12.6
TensorRT=10.3.0.26-1+cuda12.5
Option 2 - latest versions
Simply move to the latest releases:
CUDA=12.6
CUDNN=9.6.0.74-1
TensorRT=10.7.0.23-1+cuda12.6
Summary
Starting with CUDA: choosing a version forces users to have at least that version installed. CUDA 12.6 comes with driver 560, which is the latest release and might not be properly tested yet. For that reason we could go with CUDA 12.4 for now, but I'm open to suggestions here.
TensorRT could be the latest version, as each new release comes with new runtime strategies, better optimization, and bug fixes (see the changelog for more information).
CUDNN strongly depends on the binaries available for the current TensorRT. CUDNN 9.0+ contains API-breaking changes, and since there are no TensorRT libraries available for CUDNN 9.0+, we should stick with CUDNN 8.9 for now.
This finally gives us:
CUDA=12.4
CUDNN=8.9.7.29-1+cuda12.2
TensorRT=10.7.0.23-1+cuda12.6
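For illustration, the chosen versions would translate into env pins along these lines (a sketch only: the variable names mirror Autoware's existing amd64.env conventions, but please check the actual file before copying):

```shell
# Hypothetical amd64.env fragment reflecting the versions proposed above.
# Variable names are assumed from Autoware's existing env files.
cuda_version=12.4
cudnn_version=8.9.7.29-1+cuda12.2
tensorrt_version=10.7.0.23-1+cuda12.6
```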
What's next?
There is no ideal solution, but what we can do is follow these steps:
In TensorRT 10.0+ the old API is removed. Related packages:
autoware_tensorrt_common
autoware_image_projection_based_fusion
autoware_lidar_apollo_instance_segmentation
autoware_lidar_centerpoint
autoware_lidar_transfusion
autoware_shape_estimation
autoware_tensorrt_classifier
autoware_tensorrt_yolox
autoware_traffic_light_classifier
autoware_traffic_light_fine_detector
autoware_tensorrt_rtmdet (new package, waiting for merge)
There might be a conflict between new Autoware dependencies and edge device users with JetPack up to 6.0. The new API was introduced in TensorRT 8.5.2, which is already included in JetPack 5.1.2, so I assume it is fair to support only the new API. I suggest refactoring all packages to use the tensorrt_common API for model inference, where we add macros to choose the API call based on the installed TensorRT version. This way, we will know whether we can support edge devices and whether any of the packages have issues with the upgraded libraries (the updated amd64.env and arm64.env are compatible). Secondly, we will be prepared for the future ROS 2 Jazzy upgrade.
Please feel free to share your thoughts. This upgrade may affect contributors' work, therefore we would like to find the best solution for all of us.