Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Behavior difference for multiple TRANSIENT_LOCAL publishers #61

Closed
eliasdc opened this issue Jan 11, 2022 · 4 comments
Closed

Behavior difference for multiple TRANSIENT_LOCAL publishers #61

eliasdc opened this issue Jan 11, 2022 · 4 comments

Comments

@eliasdc
Copy link

eliasdc commented Jan 11, 2022

There is a difference between the behavior of vanilla ROS2 and using Zenoh for handling communication between hosts. With 2 or more publishers (e.g. /tf_static) only one message will be received by late subscribers when using Zenoh while with the vanilla you would receive one message from each publisher (example with /tf_static uses depth of one https://github.com/ros2/geometry2/blob/bd125c5af32e2c63c70b7e0ba88e57094714b9b8/tf2_ros/include/tf2_ros/qos.hpp#L66).

In my setup each physical device publishes its own static transforms which makes zenoh unusable without continuously republishing the static transforms.

logs filtered on /tf_static:

[2022-01-11T16:38:05Z DEBUG zplugin_dds::dds_mgt] Discovered DDS publication cbdc1001ecc47db2d8d6102c00001603 from Participant cbdc1001ecc47db2d8d6102c000001c1 on rt/tf_static with type tf2_msgs::msg::dds_::TFMessage_ (keyless: true)
[2022-01-11T16:38:05Z DEBUG zplugin_dds] Discovered DDS Writer on rt/tf_static: cbdc1001ecc47db2d8d6102c00001603
[2022-01-11T16:38:05Z DEBUG zenoh::net::routing::resource] Register resource /rt/tf_static
[2022-01-11T16:38:05Z DEBUG zplugin_dds] Caching publications for TRANSIENT_LOCAL Writer on resource /rt/tf_static with history 1
[2022-01-11T16:38:05Z DEBUG zenoh::net::routing::resource] Register resource /@dds_pub_cache/B6F8FB5759A44D39A2FB181081A96C63/rt/tf_static
[2022-01-11T16:38:05Z DEBUG zenoh::net::routing::queries] Register queryable /@dds_pub_cache/B6F8FB5759A44D39A2FB181081A96C63/rt/tf_static (face: Face{0, B6F8FB5759A44D39A2FB181081A96C63}, kind: 8)
[2022-01-11T16:38:05Z DEBUG zenoh::net::routing::queries] Register peer queryable /@dds_pub_cache/B6F8FB5759A44D39A2FB181081A96C63/rt/tf_static (peer: B6F8FB5759A44D39A2FB181081A96C63, kind: 8)
[2022-01-11T16:38:05Z DEBUG zplugin_dds::dds_mgt] Discovered DDS subscription ed3c1001c01efcbbeb2528b900001504 from Participant ed3c1001c01efcbbeb2528b9000001c1 on rt/tf_static with type tf2_msgs::msg::dds_::TFMessage_ (keyless: true)
[2022-01-11T16:38:05Z DEBUG zplugin_dds::dds_mgt] Ignoring discovery of ed3c1001c01efcbbeb2528b900001504 (rt/tf_static) from local participant
[2022-01-11T16:38:05Z INFO  zplugin_dds] New route: DDS 'rt/tf_static' => zenoh '/rt/tf_static' with type 'tf2_msgs::msg::dds_::TFMessage_'

[2022-01-11T16:38:19Z DEBUG zplugin_dds::dds_mgt] Discovered DDS publication f19810013ed8411c12de60fc00001603 from Participant f19810013ed8411c12de60fc000001c1 on rt/tf_static with type tf2_msgs::msg::dds_::TFMessage_ (keyless: true)
[2022-01-11T16:38:19Z DEBUG zplugin_dds] Discovered DDS Writer on rt/tf_static: f19810013ed8411c12de60fc00001603
[2022-01-11T16:38:19Z DEBUG zplugin_dds] Route from DDS to resource /rt/tf_static already exists -- ignoring

Expected result

A late subscriber to /tf_static should receive one message from all publishers.

@JEnoch
Copy link
Member

JEnoch commented Jan 12, 2022

For scalability reasons, the zenoh-bridge-dds creates only 1 dds-to-zenoh route (i.e. 1 DDS Reader re-publishing to zenoh) for all discovered DDS Writers on a same topic. That's the reason of this log:
[2022-01-11T16:38:19Z DEBUG zplugin_dds] Route from DDS to resource /rt/tf_static already exists -- ignoring

In case of TRANSIENT_LOCAL DDS writers, the route is associated to a PublicationCache that will cache the data that are re-published to zenoh. This cache size is set to the history length of the first discovered Writer (1 in your case). And the cache doesn't make distinction between data published by different DataWriter (i.e. only 1 publication is cached for all the Writers).

A solution would be to change the caching strategy to maintain 1 cache per-matching DDS Writer within the route. The QueryingSubscriber used by the remote bridge for the corresponding zenoh-to-dds route will at creation query and receive all cached data.

However, I have a question on your use case: as far as I understood, you're using 1 zenoh-dds-bridge for several devices that are publishing on /tf_static.
Don't your subscriber on this topic needs to distinguish each device's publication on this topic ? I don't see any key in TFMessage that would allow to make the distinction between 2 publications.
So did you consider to use 1 zenoh-dds-bridge per device , each with its own scope (--scope device_id option), leading to re-publication on different zenoh resources per device (/device_id/rt/tf_static) ?

@eliasdc
Copy link
Author

eliasdc commented Jan 13, 2022

I like the idea of having 1 cache per-matching DDS Writer as this closely resembles the behavior of ROS2.

My case is where I have a realsense camera mounted on a robot. The transforms of the robot's links are describe in the URDF file and published on /tf_static witht the robot_state_publisher node up until the camera_link. The realsense camera publishes its own internal transforms from its own node on the /tf_static link. To get the point clouds working in rviz2 I need all the transforms on a single /tf_static topic. I know the possibility of having multiple zenoh bridges with and without scope but this doesn't solve the problem. The TFMessage has a frame_id and a child_frame_id which is unique combination, resulting in a TF tree connecting all links to each other, knowing who is the publisher is not required. However, I did solve my own problem by including the camera's transforms in my URDF file as a work around (less accurate) and stop publishing the transforms from the camera's node. One downside of this workaround is that if I replace the camera I need to update the URDF itself.

I still believe that this should be solved with Zenoh as the goal is to be able to replace any communication link from ROS2 without needing to adjust ROS2 specific code.

@JEnoch
Copy link
Member

JEnoch commented Jan 14, 2022

I agree that multiple TRANSIENT_LOCAL publishers should be supported.
I created #64 for this purpose.

In the meantime, I'm glad you have a workaround.

@Mallets
Copy link
Member

Mallets commented Apr 4, 2022

Please refer to #64 for further discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants