feat(autoware_pointcloud_preprocessor): redesign concatenate and time sync node #8300

Open
wants to merge 85 commits into
base: main

Conversation

vividf
Contributor

@vividf vividf commented Aug 1, 2024

Description

This PR resolves issue #6832.
The previous design could not concatenate pointclouds correctly in some cases. This PR therefore redesigns the logic of the concatenate node so that it handles edge cases such as delayed or dropped pointclouds.

Changes

  • new algorithm
  • diagnostic message
  • parameter file, launch file, schema
  • unit test: test the logic of the cloud_collector and combine_cloud_handler
  • component test: testing edge cases (pointcloud dropped, delayed)

A more detailed description of the algorithm is in the README: https://github.com/vividf/autoware.universe/blob/feature/redesign_concatenate_and_time_sync_node/sensing/autoware_pointcloud_preprocessor/docs/concatenate-data.md

Related links

Parent Issue:

  • Link

How was this PR tested?

Unit tests and component tests for the concatenate node

# build autoware_pointcloud_preprocessor
colcon build --symlink-install --cmake-args -DCMAKE_BUILD_TYPE=Release --packages-up-to autoware_pointcloud_preprocessor

# test autoware_pointcloud_preprocessor
colcon test --packages-select autoware_pointcloud_preprocessor --event-handlers console_cohesion+

launch

Tested with xx1
data: TIER4_INTERNAL_LINK

# Terminal 1
ros2 launch autoware_pointcloud_preprocessor concatenate_and_time_sync_node.launch.xml

# Terminal 2
ros2 bag play rosbag2_2024_07_11-17_54_04_0.db3

Note

This bag can be used to test the scenario in which pointclouds are dropped.

Tested with xx2 gen2
data: TIER4_INTERNAL_LINK
Modify the config file concatenate_and_time_sync_node.param.yaml as follows:

/**:
  ros__parameters:
      debug_mode: false
      has_static_tf_only: false
      rosbag_replay: true
      rosbag_length: 60.0
      maximum_queue_size: 5
      timeout_sec: 0.2
      is_motion_compensated: false
      publish_synchronized_pointcloud: false
      keep_input_frame_in_synchronized_pointcloud: true
      publish_previous_but_late_pointcloud: false
      synchronized_pointcloud_postfix: pointcloud
      input_twist_topic_type: twist
      input_topics: [
                      "/sensing/lidar/left_lower/pointcloud",
                      "/sensing/lidar/left_upper/pointcloud",
                      "/sensing/lidar/front_lower/pointcloud",
                      "/sensing/lidar/front_upper/pointcloud",
                      "/sensing/lidar/right_upper/pointcloud",
                      "/sensing/lidar/right_lower/pointcloud",
                      "/sensing/lidar/rear_lower/pointcloud",
                      "/sensing/lidar/rear_upper/pointcloud"
                  ]
      output_frame: base_link
      lidar_timestamp_offsets: [0.0, 0.0, 0.025, 0.028, 0.026, 0.05, 0.075, 0.076]
      lidar_timestamp_noise_window: [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]

# Terminal 1
ros2 launch autoware_pointcloud_preprocessor concatenate_and_time_sync_node.launch.xml

# Terminal 2
ros2 bag play rosbag2_2024_10_23-12_21_48_0.db3

image-20241023-030032-20241023-055053

Result

TIER4_INTERNAL_LINK

Time comparison (xx1 bag)

Measured from the arrival of the last pointcloud to the publication of the concatenated pointcloud (publishing included)

Before (move the toc to the beginning of the cloudcallback function)

|      | Minimum | Maximum | Average |
| ---- | ------- | ------- | ------- |
| Time | 6.82    | 90.986  | 13.70   |

before
Note that the large maximum latency occurs because a pointcloud was dropped.

After

|      | Minimum | Maximum | Average |
| ---- | ------- | ------- | ------- |
| Time | 0.82    | 73.004  | 11.25   |

after

After, with is_motion_compensated set to false

|      | Minimum | Maximum | Average |
| ---- | ------- | ------- | ------- |
| Time | 0.69    | 70.614  | 9.16    |
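
The minimum, maximum and average values above can be gathered with a small accumulator wrapped around the measured section. The sketch below is a hypothetical helper, not the stopwatch (tic/toc) the node itself uses; `LatencyStats` and `time_ms` are illustrative names, and the choice of milliseconds is an assumption, since the description does not state the unit of the tables.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <limits>

// Hypothetical helper (not part of the PR): accumulates min / max / mean of samples.
struct LatencyStats
{
  double min_ms = std::numeric_limits<double>::infinity();
  double max_ms = 0.0;
  double sum_ms = 0.0;
  std::size_t count = 0;

  // Records one latency sample.
  void add(double sample_ms)
  {
    min_ms = std::min(min_ms, sample_ms);
    max_ms = std::max(max_ms, sample_ms);
    sum_ms += sample_ms;
    ++count;
  }

  // Arithmetic mean of all samples, or 0 if nothing was recorded.
  double mean_ms() const { return count == 0 ? 0.0 : sum_ms / static_cast<double>(count); }
};

// Runs `fn` once and returns the elapsed time in milliseconds (monotonic clock).
template <typename Fn>
double time_ms(Fn && fn)
{
  const auto start = std::chrono::steady_clock::now();
  fn();
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(end - start).count();
}
```

A caller would wrap the section from "last cloud arrived" to "concatenated cloud published" in `time_ms` and pass the result to `add`. One dropped cloud is enough to dominate the maximum while barely moving the average, which matches the note on the "Before" measurement.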

image-20240730-070949

Notes for reviewers

The locking logic (mutex) is an important part to double-check :)

Interface changes

None.

Effects on system behavior

None.

@vividf vividf self-assigned this Aug 1, 2024
@github-actions github-actions bot added type:documentation Creating or refining documentation. (auto-assigned) component:sensing Data acquisition from sensors, drivers, preprocessing. (auto-assigned) labels Aug 1, 2024

github-actions bot commented Aug 1, 2024

Thank you for contributing to the Autoware project!

🚧 If your pull request is in progress, switch it to draft mode.

Please ensure:

@vividf vividf changed the title Feature/redesign concatenate and time sync node feature(autoware_pointcloud_preprocessor): redesign concatenate and time sync node Aug 1, 2024
@vividf vividf added the tag:run-build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) label Aug 1, 2024
@vividf vividf changed the title feature(autoware_pointcloud_preprocessor): redesign concatenate and time sync node feat(autoware_pointcloud_preprocessor): redesign concatenate and time sync node Aug 1, 2024

codecov bot commented Aug 1, 2024

Codecov Report

Attention: Patch coverage is 79.53964% with 80 lines in your changes missing coverage. Please review.

Project coverage is 28.10%. Comparing base (9e32e92) to head (798cbd6).
Report is 89 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| ------------------------ | ------- | ----- |
| ...oncatenate_data/concatenate_and_time_sync_node.cpp | 73.50% | 28 Missing and 25 partials ⚠️ |
| ...sor/src/concatenate_data/combine_cloud_handler.cpp | 84.02% | 11 Missing and 12 partials ⚠️ |
| ...processor/src/concatenate_data/cloud_collector.cpp | 91.11% | 1 Missing and 3 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8300      +/-   ##
==========================================
+ Coverage   26.19%   28.10%   +1.90%     
==========================================
  Files        1302     1339      +37     
  Lines       96935   100091    +3156     
  Branches    39172    40494    +1322     
==========================================
+ Hits        25395    28128    +2733     
- Misses      68959    71800    +2841     
+ Partials     2581      163    -2418     
| Flag | Coverage Δ | *Carryforward flag |
| ---- | ---------- | ------------------ |
| differential | 25.93% <79.53%> (?) | |
| total | 27.85% <ø> (+1.65%) ⬆️ | Carriedforward from 460b467 |

*This pull request uses carry forward flags.


@vividf
Contributor Author

vividf commented Aug 2, 2024

@kminoda
This PR includes the unit test and component test for the concatenate node.
Would be nice if you could check the component test (especially for the empty pointcloud case that caused the problem before)

@kminoda
Contributor

kminoda commented Aug 8, 2024

@vividf Thanks. Let me do that later

"maximum_queue_size": {
"type": "integer",
"default": 5,
"description": "Maximum size of the queue."
Contributor

Please describe in more detail:

  • which queue does this refer to - the number of waiting input pointclouds, the number of cloud collector groups, etc.?
  • Effects on maximum latency of the produced concatenated cloud. If this refers to the number of groups possible, wouldn't the maximum latency be 500 ms?
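
To make the 500 ms figure concrete: if the queue size bounded the number of pending groups, and if the sensors publish at 10 Hz (one cloud every 100 ms, which is an assumption not stated in this thread), the oldest group could wait for queue size times period. `worst_case_wait_ms` below is a hypothetical helper written only to show the arithmetic.

```cpp
#include <cstddef>

// Hypothetical helper: upper bound on how long the oldest of `queue_size` pending
// groups can wait when one new group arrives every `period_ms` milliseconds.
constexpr double worst_case_wait_ms(std::size_t queue_size, double period_ms)
{
  return static_cast<double>(queue_size) * period_ms;
}

// Assumed 10 Hz LiDAR, i.e. a 100 ms period (not stated in the thread).
static_assert(worst_case_wait_ms(5, 100.0) == 500.0, "a queue of 5 gives 500 ms");
static_assert(worst_case_wait_ms(2, 100.0) == 200.0, "a queue of 2 gives 200 ms");
```

Under the same assumptions, every queue slot removed takes one sensor period off the bound.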

"properties": {
"maximum_queue_size": {
"type": "integer",
"default": 5,
Contributor

Check what effects this has on latency (see comment one line below). If this parameter refers to the number of groups, the latency would be up to 500 ms, which is not good for A/D. Setting this default to 2 would be safer imho.

Contributor Author

@mojomex
The maximum_queue_size parameter is defined for most of the nodes in the pointcloud preprocessor, and its value is 5 there as well.
If we change the value from 5 to 2 here, we should probably also consider changing it in the other nodes.

By the way, the maximum_queue_size parameter is for the QoS setting, not for the number of groups (collectors?).

Contributor

@mojomex mojomex Oct 3, 2024

As I just discussed with @drwnz in person, setting it to 1 is best. This reduces latency to a minimum. Just starting with this node for now is okay.

Comment on lines 203 to 194
concatenate_cloud_ptr =
std::make_shared<sensor_msgs::msg::PointCloud2>(*transformed_delay_compensated_cloud_ptr);
Contributor

Reserving the total number of bytes needed in the point cloud's data field could spare us a couple of resize/re-allocation operations.
Something like this perhaps:

auto n_total_points = /* sum of the number of points of all pointclouds */;
concatenate_cloud_ptr->data.reserve(concatenate_cloud_ptr->point_step * n_total_points);

Contributor Author

@vividf vividf Sep 26, 2024

@mojomex
Thanks for the suggestion!!!
fixed them in 49b54d4

Contributor

@mojomex mojomex Oct 3, 2024

Sorry, this is not quite what I meant: with your new implementation, the size of the additional single pointcloud is reserved but not the size of all pointclouds combined.

What I wanted to see is

  1. initialize concatenate_cloud_ptr with an empty pointcloud (*1) before the for loop starts
  2. reserve data such that its capacity is (pseudocode) sum([cloud.data.size() for cloud in cloud_map.values()]), so if you have clouds with sizes 1, 2 and 3 bytes, it would reserve 6 bytes at once before the for loop starts
  3. then concatenate the clouds to concatenate_cloud_ptr in the loop

The manual initialization of all fields should not be necessary (please revert to how it was before, just make_shared).

*1: when calling pcl::concatenatePointCloud it will sadly throw away the empty initial pointcloud... Unless we want to do something sketchy (or implement concatenation ourselves) we can't proceed with this comment.

Feel free to

  1. revert and resolve this conversation
  2. and add a comment above the for loop saying that this could be made faster by reserving space first.
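
The reserve-first idea from the three steps above can be illustrated without ROS or PCL. The sketch below takes the "implement concatenation ourselves" route mentioned in the footnote, so it is not what `pcl::concatenatePointCloud` does and not what the PR ended up with. `Cloud` is a stand-in that copies only the `point_step` and `data` fields of `sensor_msgs::msg::PointCloud2`, and `concatenate_reserved` is a hypothetical name. It assumes all clouds share the same point layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Stand-in for the two sensor_msgs::msg::PointCloud2 fields that matter here.
struct Cloud
{
  std::uint32_t point_step = 0;    // bytes per point
  std::vector<std::uint8_t> data;  // raw point bytes
};

// Concatenates all clouds with a single up-front allocation.
// Assumption: every cloud has the same point layout (same point_step).
Cloud concatenate_reserved(const std::map<std::string, Cloud> & cloud_map)
{
  Cloud out;  // step 1: start from an empty cloud

  // Step 2: data.size() is already a byte count, so the sizes are summed directly.
  std::size_t total_bytes = 0;
  for (const auto & entry : cloud_map) {
    total_bytes += entry.second.data.size();
    out.point_step = entry.second.point_step;
  }
  out.data.reserve(total_bytes);

  // Step 3: append each cloud; the vector does not reallocate inside this loop.
  for (const auto & entry : cloud_map) {
    const auto & src = entry.second.data;
    out.data.insert(out.data.end(), src.begin(), src.end());
  }
  return out;
}
```

A real implementation would also have to update `width`, `row_step` and the header of the output message, which this sketch leaves out.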

Contributor Author

@vividf vividf Oct 22, 2024

@mojomex
Thanks, understood!
Fixed in the commit 782228f

Comment on lines 210 to 215
// convert to original sensor frame if necessary
bool need_transform_to_sensor_frame = (cloud->header.frame_id != output_frame_);
if (keep_input_frame_in_synchronized_pointcloud_ && need_transform_to_sensor_frame) {
sensor_msgs::msg::PointCloud2::SharedPtr transformed_cloud_ptr_in_sensor_frame(
new sensor_msgs::msg::PointCloud2());
pcl_ros::transformPointCloud(
(std::string)cloud->header.frame_id, *transformed_delay_compensated_cloud_ptr,
*transformed_cloud_ptr_in_sensor_frame, tf_buffer_);
transformed_cloud_ptr_in_sensor_frame->header.stamp = oldest_stamp;
transformed_cloud_ptr_in_sensor_frame->header.frame_id = cloud->header.frame_id;
topic_to_transformed_cloud_map[topic] = transformed_cloud_ptr_in_sensor_frame;
} else {
transformed_delay_compensated_cloud_ptr->header.stamp = oldest_stamp;
transformed_delay_compensated_cloud_ptr->header.frame_id = output_frame_;
topic_to_transformed_cloud_map[topic] = transformed_delay_compensated_cloud_ptr;
}
Contributor

As far as I understand, these will only be published if the publish_synchronized_pointcloud parameter is true. However, this part of the code here is always running (at the least, the original pointclouds are kept as-is and inserted into the map, thus causing increased memory usage).
Only creating this map if the feature is enabled (e.g. std::optional<std::unordered_map<...>>) and not keeping the original pointclouds in it if disabled, would be preferable.
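
A minimal sketch of the `std::optional` idea, assuming C++17 and using stand-in types instead of the ROS messages. `PointCloudStub`, `ConcatenationResult` and `combine` are hypothetical names, and the transform and motion-compensation steps are omitted; the point is only that the per-topic map is neither created nor filled when the feature is disabled.

```cpp
#include <memory>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in for sensor_msgs::msg::PointCloud2.
struct PointCloudStub
{
  std::vector<unsigned char> data;
};
using CloudPtr = std::shared_ptr<PointCloudStub>;
using CloudMap = std::unordered_map<std::string, CloudPtr>;

// Hypothetical result type: the per-topic map exists only when the feature is on.
struct ConcatenationResult
{
  CloudPtr concatenated;
  std::optional<CloudMap> topic_to_transformed_cloud_map;
};

ConcatenationResult combine(const CloudMap & inputs, bool publish_synchronized_pointcloud)
{
  ConcatenationResult result;
  result.concatenated = std::make_shared<PointCloudStub>();
  if (publish_synchronized_pointcloud) {
    result.topic_to_transformed_cloud_map.emplace();  // create the (empty) map
  }

  for (const auto & entry : inputs) {
    const CloudPtr & cloud = entry.second;
    result.concatenated->data.insert(
      result.concatenated->data.end(), cloud->data.begin(), cloud->data.end());
    if (result.topic_to_transformed_cloud_map) {
      // Keep a per-topic cloud only when it will actually be published.
      (*result.topic_to_transformed_cloud_map)[entry.first] = cloud;
    }
  }
  return result;
}
```

The publisher side can then test `result.topic_to_transformed_cloud_map.has_value()` instead of re-reading the parameter.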

Contributor Author

Thanks!!! I didn't notice that in the old design!
Fixed the logic in 52ed5ed

"default": [],
"description": "List of input point cloud topics."
},
"output_frame": {
Contributor

When it comes to frames, empty ones are usually not allowed in ROS. Can you add a restriction here?

Contributor Author

@vividf vividf Sep 30, 2024

Thanks! Addressed the schema-related suggestions in b6700a9

@knzo25
Contributor

knzo25 commented Sep 30, 2024

@vividf
At the beginning of this project I asked for:

  • List of expected behavior under border / non-expected conditions
  • List of rosbags that we can use to validate the behavior under several conditions, including normal behavior, packet drop, etc.

Can you provide / write the previous items?
For internal data, please write them in an internal page / links

@vividf
Contributor Author

vividf commented Oct 1, 2024

@knzo25 @mojomex rosbags are provided in the description :)

@vividf vividf requested review from knzo25 and mojomex October 1, 2024 11:26
@mojomex
Contributor

mojomex commented Oct 3, 2024

@vividf I built and ran with ThreadSanitizer (TSan).

Setup

cd ~/autoware
# Autoware is already built, just build the preprocessor with tsan
colcon build --symlink-install --cmake-args -DCMAKE_BUILD_TYPE=RelWithDebInfo --packages-select autoware_pointcloud_preprocessor --mixin tsan
source install/setup.bash
ros2 launch autoware_pointcloud_preprocessor concatenate_and_time_sync_node.launch.xml

⚠️ Running it in a container with TSan enabled won't work

In a previous run, I recorded all necessary inputs into a rosbag which I replayed.

Results

There is one memory safety issue (heap-use-after-free) and one other warning (repeating), both related to delete_collector() (TSan log here):

In CloudCollector::concatenate_callback(), the CloudCollector is destroyed (collectors_.erase(it);).
The lock holding the now de-allocated mutex_ is then destroyed when concatenate_callback() returns (its destructor is called) and tries to unlock a mutex that no longer exists.

Suggested solution

Because CloudCollector::concatenate_callback() is synchronously called from PointCloudConcatenateDataSynchronizerComponent::cloud_callback() via CloudCollector::process_cloud(), you can move the deletion from the list into PointCloudConcatenateDataSynchronizerComponent::cloud_callback().
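
The suggested ownership change can be sketched as follows. This is a simplified, single-callback illustration under stated assumptions, not the PR's code: `Collector` and `Synchronizer` stand in for `CloudCollector` and `PointCloudConcatenateDataSynchronizerComponent`, the `expected` count and `collector_count()` are invented for the example, and the timer-triggered path of the real node is not modelled.

```cpp
#include <cstddef>
#include <list>
#include <memory>
#include <mutex>

// Simplified stand-in for CloudCollector; members are illustrative.
class Collector
{
public:
  explicit Collector(int expected) : expected_(expected) {}

  // Returns true once all expected clouds have arrived. The collector never
  // erases itself, so mutex_ is still alive when the lock_guard unlocks it.
  bool process_cloud()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    ++received_;
    return received_ >= expected_;
  }

private:
  std::mutex mutex_;
  int expected_;
  int received_ = 0;
};

// Simplified stand-in for the node that owns the collectors.
class Synchronizer
{
public:
  void add_collector(int expected)
  {
    std::lock_guard<std::mutex> lock(collectors_mutex_);
    collectors_.push_back(std::make_shared<Collector>(expected));
  }

  // Plays the role of cloud_callback(): feeds the first collector and erases it
  // here, after process_cloud() has returned and released the collector's lock.
  void cloud_callback()
  {
    std::lock_guard<std::mutex> lock(collectors_mutex_);
    if (collectors_.empty()) {
      return;
    }
    auto it = collectors_.begin();
    const bool finished = (*it)->process_cloud();
    if (finished) {
      collectors_.erase(it);
    }
  }

  std::size_t collector_count()
  {
    std::lock_guard<std::mutex> lock(collectors_mutex_);
    return collectors_.size();
  }

private:
  std::mutex collectors_mutex_;
  std::list<std::shared_ptr<Collector>> collectors_;
};
```

Because the erase happens in the owner, the `lock_guard` inside `process_cloud()` is always destroyed while the collector and its mutex still exist.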

@vividf
Contributor Author

vividf commented Oct 22, 2024

@mojomex
Thank you for the guidance on using TSan to check thread safety! I resolved the issue in this commit, which is the code we tested together a few weeks ago without encountering any errors.

@vividf
Contributor Author

vividf commented Oct 23, 2024

This PR was tested on the X2 bench and has proven to be better than the current concatenation algorithm in terms of speed and logic. It would be great if we could merge this PR soon. Thanks!
image-20241023-030032-20241023-055053

Labels
component:sensing Data acquisition from sensors, drivers, preprocessing. (auto-assigned) tag:require-cuda-build-and-test tag:run-build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) type:documentation Creating or refining documentation. (auto-assigned)
Projects
Status: In Progress
4 participants