[Bug] "long" actions still not returning result #367

Closed
li9i opened this issue Dec 20, 2024 · 2 comments
Labels
bug Something isn't working

Comments

li9i commented Dec 20, 2024

Describe the bug

See #210. From my investigation I think this may be related to eclipse-zenoh/zenoh#1409 and ros2/rmw_zenoh#173.

To reproduce

  1. Start two zenoh_bridge_ros2dds routers on two different machines.
  2. On the first machine:

export ROS_DOMAIN_ID=100
source /opt/ros/$ROS_DISTRO/setup.bash
ros2 run examples_rclcpp_minimal_action_server action_server_member_functions

     and on the second:

export ROS_DOMAIN_ID=200
source /opt/ros/$ROS_DISTRO/setup.bash
ros2 run examples_rclcpp_minimal_action_client action_client_member_functions

  3. If the fibonacci order is 1, the execution time is below the timeout threshold and the result returns from the action server to the action client. If the order is 10, the execution time exceeds the threshold and the result does not return. The client sees feedback either way.
  • Output of zenoh_bridge_ros2dds on branch fix_210
WARN                net-0 ThreadId(22) zenoh::net::routing::dispatcher::queries: Didn't receive final reply Face{2, 76a083bfcd61b967011025b6e1ded8bc}:3 from Face{1, 8f01344ee82211abd23a4b0ad673d0db}: Timeout(5s)!
WARN ThreadId(28) zenoh::net::routing::dispatcher::queries: Route reply Face{1, 8f01344ee82211abd23a4b0ad673d0db}:3 from Face{1, 8f01344ee82211abd23a4b0ad673d0db}: Query not found!
WARN ThreadId(28) zenoh::net::routing::dispatcher::queries: Route final reply Face{1, 8f01344ee82211abd23a4b0ad673d0db}:3 from Face{1, 8f01344ee82211abd23a4b0ad673d0db}: Query not found!
  • Output of zenoh_bridge_ros2dds on branch release/1.1.0
WARN                net-0 ThreadId(22) zenoh::net::routing::dispatcher::queries: Didn't receive final reply Face{1, e77ce76699699ce8c60eb08731c5f88d}:6 for Face{2, af5724c9827e036672269ac630c4475c}:6: Timeout(5s)!
WARN ThreadId(28) zenoh::net::routing::dispatcher::queries: Route reply Face{1, e77ce76699699ce8c60eb08731c5f88d}:6 from Face{1, e77ce76699699ce8c60eb08731c5f88d}: Query not found!
WARN ThreadId(28) zenoh::net::routing::dispatcher::queries: Route final reply Face{1, e77ce76699699ce8c60eb08731c5f88d}:6: Query not found!

System info

  • Platform: Ubuntu 22.04
  • ROS 2 version: humble
  • Zenoh version/commit:
    • from fix_210 (9fe9b507e0c937943af3a8c5929f4558ddfa8b4d)
    • to release/1.1.0 (2aa5b54f0fff9a6c6b027a68d535d64131cf1513)
li9i added the bug (Something isn't working) label Dec 20, 2024
Member

JEnoch commented Dec 27, 2024

In your 2 outputs, the issue is clearly a timeout on a query: Timeout(5s)!
The issue reported in #210 was not a timeout but a change of values in the send_goal request: "{a: 0, b: 0}" was received as "{a: 0, b: 65536}". So I confirm that #210 is indeed fixed in this respect.

Now, the timeout you get is the consequence of 2 things:

  • By default the bridge is configured with a timeout of 5 seconds for each Service call. See here.
  • Just after sending the goal, action_client_member_functions immediately calls the fibonacci/_action/get_result service. As the action server takes more than 5 seconds to execute the action, you get this timeout warning and the bridge drops the action.

In my opinion ROS should not directly call the get_result after sending the goal, but should wait for a SUCCEEDED status. Thus, no Service call would be very long. But that's not how it's implemented...
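
The sequence described above can be reproduced in miniature. This is a minimal Python sketch, not bridge code: get_result and bridged_get_result are hypothetical stand-ins for the action server and the bridge, and the timings are scaled down (0.2 s stands in for the 5 s default).

```python
import asyncio

# Assumption: 0.2 s stands in for the bridge's 5 s default timeout.
BRIDGE_TIMEOUT = 0.2


async def get_result(execution_time: float) -> str:
    """Hypothetical action server: replies only once the goal has finished."""
    await asyncio.sleep(execution_time)
    return "result"


async def bridged_get_result(execution_time: float, timeout: float):
    """Hypothetical bridge: forwards the query and gives up after `timeout`."""
    try:
        return await asyncio.wait_for(get_result(execution_time), timeout)
    except asyncio.TimeoutError:
        # The query is abandoned; a later reply has nothing left to match.
        return None


def run(execution_time: float, timeout: float = BRIDGE_TIMEOUT):
    return asyncio.run(bridged_get_result(execution_time, timeout))


if __name__ == "__main__":
    print("short goal:", run(0.01))
    print("long goal:", run(0.5))
    print("long goal, longer timeout:", run(0.5, timeout=1.0))
```

Under these assumptions the short goal returns its result, the long goal yields nothing, and raising the timeout above the execution time lets the long goal's result through.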

Anyway, as you can see in the configuration file, you can configure the timeout value for each service/action call. With the fibonacci action, which takes 10 seconds, the following configuration works:

// ...
queries_timeout: {
   actions: {
      get_result: ["fibonacci=12.0"],
   }
}
// ...

If you want to apply a 1 hour timeout to all actions, you can use this configuration:

// ...
queries_timeout: {
   actions: {
      get_result: [".*=3600.0"],
   }
}
// ...
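
To make the "<regex>=<seconds>" entries above more concrete, here is a rough Python sketch of how such a list could be resolved for a given action name. It is an illustration only, not the bridge's implementation: the helper names parse_entries and timeout_for are hypothetical, and it assumes that entries are tried in order, that the first matching regex wins, that matching is unanchored, and that the 5 second default applies when nothing matches.

```python
import re


def parse_entries(entries):
    """Split each "<regex>=<seconds>" string at its last '='."""
    parsed = []
    for entry in entries:
        pattern, _, seconds = entry.rpartition("=")
        parsed.append((re.compile(pattern), float(seconds)))
    return parsed


def timeout_for(name, entries, default=5.0):
    """Return the timeout of the first entry whose regex matches `name`.

    Assumptions: first match wins, unanchored search, 5.0 s fallback.
    """
    for pattern, seconds in parse_entries(entries):
        if pattern.search(name):
            return seconds
    return default


if __name__ == "__main__":
    entries = ["fibonacci=12.0", ".*=3600.0"]
    print(timeout_for("fibonacci", entries))
    print(timeout_for("navigate", entries))
```

With both entries from this thread listed in that order, fibonacci resolves to 12.0 and any other action name to 3600.0 under these assumptions.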

Now the question of what shall be the default timeout value remains. I will discuss this in #369.

JEnoch closed this as completed Dec 27, 2024
JEnoch changed the title from '[Bug] fix_210 branch did not fix issue #210: "long" actions still not returning result' to '[Bug] "long" actions still not returning result' Dec 27, 2024
Author

li9i commented Dec 27, 2024

Thank you for the quick response. I traced the issue to #210 because of their common output (Didn't receive final reply ... Timeout(5s)! etc.), so I assumed that fix_210 addressed the timeout component of #210.

Thank you very much for your fine work, it is greatly appreciated. Happy holidays to everyone.
