Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(autoware_pointcloud_preprocessor): redesign concatenate and time sync node #8300

Open
wants to merge 131 commits into
base: main
Choose a base branch
from

Conversation

vividf
Copy link
Contributor

@vividf vividf commented Aug 1, 2024

Description

This PR solved the issue #6832.
Previous designs have some issues concatenating the pointcloud correctly, therefore, this PR redesigns the logic of the concatenate node in order to handle the edge cases like pointcloud delay or pointcloud drop.

Changes

  • new algorithm
  • diagnostic message
  • parameter file, launch file, schema
  • unit test: test the logic of the cloud_collector and combine_cloud_handler
  • component test: testing edge cases (pointcloud dropped, delayed)

A more detailed description of the algorithm is on the Readme page.https://github.com/vividf/autoware.universe/blob/feature/redesign_concatenate_and_time_sync_node/sensing/autoware_pointcloud_preprocessor/docs/concatenate-data.md

Related links

Parent Issue:

  • Link

How was this PR tested?

Unit test and Component test for concatenate node

# build autoware_pointcloud_preprocessor
colcon build --symlink-install --cmake-args -DCMAKE_BUILD_TYPE=Release --packages-up-to autoware_pointcloud_preprocessor

# test autoware_pointcloud_preprocessor
colcon test --packages-select autoware_pointcloud_preprocessor --event-handlers console_cohesion+

Tested with sample rosbag

Please change the sample_sensor_kit_launch: autowarefoundation/sample_sensor_kit_launch#108

# Terminal 1
ros2 launch autoware_launch logging_simulator.launch.xml map_path:=$HOME/autoware_map/sample-map-rosbag vehicle_model:=sample_vehicle sensor_model:=sample_sensor_kit

# Terminal 2
ros2 bag play sample-rosbag_0.db3

The result provided by @mojomex below #8300 (comment)

Tested with vehicle 1

data: TIER4_INTERNAL_LINK

modify the config file concatenate_and_time_sync_node.param.yaml

/**:
  ros__parameters:
    debug_mode: false
    has_static_tf_only: false
    rosbag_length: 20.0
    maximum_queue_size: 5    
    timeout_sec: 0.2
    is_motion_compensated: true
    publish_synchronized_pointcloud: true
    keep_input_frame_in_synchronized_pointcloud: true
    publish_previous_but_late_pointcloud: false
    synchronized_pointcloud_postfix: pointcloud
    input_twist_topic_type: twist
    input_topics: [
                    "/sensing/lidar/left/pointcloud_before_sync",
                    "/sensing/lidar/right/pointcloud_before_sync",
                    "/sensing/lidar/top/pointcloud_before_sync"
                ]
    output_frame: base_link
    matching_strategy:
      type: advanced
      lidar_timestamp_offsets: [0.0, 0.035, 0.073]
      lidar_timestamp_noise_window: [0.01, 0.01, 0.01]
# Terminal 1
ros2 launch autoware_pointcloud_preprocessor concatenate_and_time_sync_node.launch.xml

# Terminal 2
ros2 bag play rosbag2_2024_07_11-17_54_04_0.db3

Note

  • This bag can test the scenario when the pointclouds are dropped.
  • Set the rosbag_replay parameter to True, if user is replaying the rosbag without relaunching the node.

Tested with vehicle 2

data: TIER4_INTERNAL_LINK
modify the config file concatenate_and_time_sync_node.param.yaml

/**:
  ros__parameters:
      debug_mode: false
      has_static_tf_only: false
      rosbag_length: 60.0
      maximum_queue_size: 5
      timeout_sec: 0.2
      is_motion_compensated: false
      publish_synchronized_pointcloud: false
      keep_input_frame_in_synchronized_pointcloud: true
      publish_previous_but_late_pointcloud: false
      synchronized_pointcloud_postfix: pointcloud
      input_twist_topic_type: twist
      input_topics: [
                    "/sensing/lidar/left_lower/pointcloud",
                    "/sensing/lidar/left_upper/pointcloud",
                    "/sensing/lidar/front_lower/pointcloud",
                    "/sensing/lidar/front_upper/pointcloud",
                    "/sensing/lidar/right_upper/pointcloud",
                    "/sensing/lidar/right_lower/pointcloud",
                    "/sensing/lidar/rear_lower/pointcloud",
                    "/sensing/lidar/rear_upper/pointcloud"
                  ]
      output_frame: base_link
      matching_strategy:
        type: advanced
        lidar_timestamp_offsets: [0.0, 0.0, 0.025, 0.028, 0.026, 0.05, 0.075, 0.076]
        lidar_timestamp_noise_window: [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]
# Terminal 1
ros2 launch autoware_pointcloud_preprocessor concatenate_and_time_sync_node.launch.xml

# Terminal 2
ros2 bag play rosbag2_2024_10_23-12_21_48_0.db3

image-20241023-030032-20241023-055053

Result

TIER4_INTERNAL_LINK

Time comparison (vehicle 1 bag)

From last arrived pointcloud to publish concatenate pointcloud (include publishing)

Before (move the toc to the beginning of the cloudcallback function)

Minimum Maximum Average
Time 6.82 90.986 13.70

before
Note that the huge latency is because a pointcloud is dropped.

After

Minimum Maximum Average
Time 0.82 73.004 11.25

after

By setting is_motion_compensated to false

Minimum Maximum Average
Time 0.69 70.614 9.16

image-20240730-070949

Notes for reviewers

locking logic (mutex) might be an important part of double-checking :)

Interface changes

None.

Effects on system behavior

None.

@vividf vividf self-assigned this Aug 1, 2024
@github-actions github-actions bot added type:documentation Creating or refining documentation. (auto-assigned) component:sensing Data acquisition from sensors, drivers, preprocessing. (auto-assigned) labels Aug 1, 2024
Copy link

github-actions bot commented Aug 1, 2024

Thank you for contributing to the Autoware project!

🚧 If your pull request is in progress, switch it to draft mode.

Please ensure:

@vividf vividf changed the title Feature/redesign concatenate and time sync node feature(autoware_pointcloud_preprocessor): redesign concatenate and time sync node Aug 1, 2024
@vividf vividf added the run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) label Aug 1, 2024
@vividf vividf changed the title feature(autoware_pointcloud_preprocessor): redesign concatenate and time sync node feat(autoware_pointcloud_preprocessor): redesign concatenate and time sync node Aug 1, 2024
Copy link

codecov bot commented Aug 1, 2024

Codecov Report

Attention: Patch coverage is 70.37037% with 136 lines in your changes missing coverage. Please review.

Project coverage is 29.97%. Comparing base (ddfacd3) to head (f278aea).
Report is 34 commits behind head on main.

Files with missing lines Patch % Lines
...oncatenate_data/concatenate_and_time_sync_node.cpp 73.26% 28 Missing and 26 partials ⚠️
...processor/src/concatenate_data/cloud_collector.cpp 49.15% 26 Missing and 4 partials ⚠️
...c/concatenate_data/collector_matching_strategy.cpp 46.80% 20 Missing and 5 partials ⚠️
...sor/src/concatenate_data/combine_cloud_handler.cpp 83.80% 11 Missing and 12 partials ⚠️
..._preprocessor/concatenate_data/cloud_collector.hpp 33.33% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8300      +/-   ##
==========================================
+ Coverage   29.76%   29.97%   +0.20%     
==========================================
  Files        1444     1453       +9     
  Lines      108686   108993     +307     
  Branches    42664    42878     +214     
==========================================
+ Hits        32352    32667     +315     
+ Misses      73150    73101      -49     
- Partials     3184     3225      +41     
Flag Coverage Δ *Carryforward flag
differential 26.69% <70.37%> (?)
differential-cuda 26.69% <70.37%> (?)
total 29.78% <ø> (+0.01%) ⬆️ Carriedforward from 2efabf4

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@vividf
Copy link
Contributor Author

vividf commented Aug 2, 2024

@kminoda
This PR includes the unit test and component test for the concatenate node.
Would be nice if you could check the component test (especially for the empty pointcloud case that caused the problem before)

@kminoda
Copy link
Contributor

kminoda commented Aug 8, 2024

@vividf Thanks. Let me do that later

@vividf
Copy link
Contributor Author

vividf commented Dec 9, 2024

@drwnz @mojomex
The naive approach is implemented in commit aa89a1f.

Result of unsynchronized LiDARs
Screenshot from 2024-12-09 11-34-50

@vividf vividf requested a review from mojomex December 9, 2024 03:11
Copy link
Contributor

@mojomex mojomex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the naive and "strict" approach are mutually exclusive, the code should be structured that only the parameters/variables of the enabled one exist.

Doing this leads to:

  • no unused parameters in the config --> users will make less config mistakes
  • less documentation needed
  • no god object --> easier to debug, less chance of bugs, more maintainable

Instead of having all parameters in a flat hierarchy and documenting which of them are ignored when, you can make parameters tree-structured instead:

In the case of the naive approach:

# common parameters
debug_mode: false
[...]
approach:
  type: naive
  # this approach has no other parameters

In the case of the strict approach:

# common parameters
debug_mode: false
[...]
approach:
  type: strict
  lidar_timestamp_offsets: [...]
  lidar_timestamp_noise_window: [...]

In ROS2, each hierarchy level becomes a . in the parameter name --> approach.type and so on.

When parsing these, you can use polymorphism

class SchemaError : public std::exception {
  const char * what() const noexcept override {
    return "JSON schema violation";
  }
};

class Approach {
  virtual std::optional<CloudCollector> match_cloud_to_collector(...) = 0;
};

struct ApproachNaive : public Approach {
  std::optional<CloudCollector> match_cloud_to_collector(...) override;
};

struct ApproachStrict : public Approach {
  std::vector<double> lidar_timestamp_offsets;
  std::vector<double> lidar_timestamp_noise_window;

  std::optional<CloudCollector> match_cloud_to_collector(...) override;
};

std::shared_ptr<Approach> parse_approach(rclcpp::Node & node) {
  auto type = node.declare_parameter<std::string>("approach.type");
  if (type == "naive") {
    return std::make_shared<ApproachNaive>();
  }
  if (type == "strict") {
    return std::make_shared<ApproachStrict>(
        node.declare_parameter<std::vector<double>>("approach.lidar_timestamp_offsets"),
        node.declare_parameter<std::vector<double>>("approach.lidar_timestamp_noise_window")
    );
  }

  throw SchemaError{};
}

to declare only the needed parameters. This is only demonstration code, I'm sure there are nicer ways to write this.

From then, instead of if/else in multiple places, you can use polymorphism so that the calling code can just call apporach.my_method without having to know which approach is used.

@vividf
Copy link
Contributor Author

vividf commented Dec 11, 2024

@mojomex
Thanks for your advice and detailed explanation!
They are fixed in fb9bebc

@vividf vividf requested a review from mojomex December 11, 2024 10:53
@mojomex mojomex added the run:deploy-docs Mark for deploy-docs action generation. (used-by-ci) label Dec 20, 2024
Comment on lines 64 to 67
double reference_timestamp_min_{0.0};
double reference_timestamp_max_{0.0};
double arrival_timestamp_{0.0}; // This is used for the naive approach
bool debug_mode_;
Copy link
Contributor

@mojomex mojomex Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code has the same problem I wanted to point out with my last comment:

The states of two separate and mutually exclusive approaches both end up here. At runtime, only one will be used but both exist, which is

  • hard for other developers to understand/maintain
  • bugprone if someone does not read this comment

Please refactor (maybe into CloudCollector (base class), CloudCollectorNaive and CloudCollectorAdvanced, or CloudCollector(const Strategy & strategy)) and have one common interface for both.
Same goes for set_arrival_timestamp and set_reference_timestamp and the related member functions above.
Personally I would probably go for the CloudCollector(const Strategy & strategy) approach since you already have the strategy classes.

Copy link
Contributor Author

@vividf vividf Dec 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mojomex
Thanks for your suggestion, I fixed them in another way in commit 1dd7fbc.

If you think this is not a good approach then let's have some discussion afterward!
Thank you!!!

Copy link
Contributor

@mojomex mojomex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes!

There's still a bit of unused state in the cloud_collector, but refactoring so it references the chosen strategy should be quite quick.

Also, I left some small fixes for documentation rendering. see docs here or run locally via:

cd src/universe/autoware.universe
mkdocs serve

For running locally, you might also need to do this setup first.

@vividf vividf requested a review from mojomex December 23, 2024 14:29
std::shared_ptr<CloudCollector> & collector, const MatchingParams & matching_params) = 0;

protected:
CollectorStrategyType strategy_type_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is redundant, please remove. Please make sure once more that you are implementing polymorphism properly:

The base class shall NEVER have any dependency on specific child classes/types/etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
commit fd7db92 fixed the problem.

{
double timestamp{0.0};
double noise_window{0.0};
CollectorStrategyType strategy_type{CollectorStrategyType::Naive};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not polymorphic - the problem discussed in my previous comment has just been moved into this struct:

noise_window is just a dead field when the strategy is Naive.
Please make sure all issues in the original comment are addressed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand! I was considering the cost of using dynamic_pointer_cast.
commit fd7db92 fixed the problem.
Thanks for the review!!!!


bool topic_miss = false;

int concatenated_cloud_status = 1; // Status of concatenated cloud, 1: success, 0: failure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a bool and named in such a way that no explanatory comment is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed in f278aea


int concatenated_cloud_status = 1; // Status of concatenated cloud, 1: success, 0: failure
for (const auto & topic : params_.input_topics) {
int cloud_status = 1; // Status of each lidar's pointcloud
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed in f278aea

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:sensing Data acquisition from sensors, drivers, preprocessing. (auto-assigned) run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) run:deploy-docs Mark for deploy-docs action generation. (used-by-ci) tag:require-cuda-build-and-test type:documentation Creating or refining documentation. (auto-assigned)
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

5 participants