
MultiThreadedExecutor should use assigned CPUs in the process space if possible. #2037

Open
fujitatomoya opened this issue Oct 31, 2022 · 8 comments
Labels
backlog enhancement New feature or request

Comments

@fujitatomoya
Collaborator

Feature request

Feature description

MultiThreadedExecutor should honor the CPU affinity set by the user application for the process space, including inherited affinity.
ros2/rclpy#1031 adds this configuration to rclpy, so the same behavior should be brought to rclcpp.
Using a single thread with MultiThreadedExecutor will print a warning, since it is not recommended, to avoid the possible problem described in #2029.
But if the CPU affinity is set by the user application, the executor will honor that configuration.
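As a rough illustration of the requested behavior, here is a sketch in plain Linux C++ (not actual rclcpp code; `effective_thread_count` is a hypothetical helper name) of deriving the default thread count from the process's CPU affinity mask instead of the total core count:

```cpp
#include <sched.h>   // sched_getaffinity, CPU_ZERO, CPU_COUNT (Linux)
#include <thread>    // std::thread::hardware_concurrency

// Hypothetical helper: how many CPUs are actually assigned to this process.
// Falls back to the total core count when the affinity mask cannot be read.
static unsigned effective_thread_count() {
  cpu_set_t set;
  CPU_ZERO(&set);
  if (sched_getaffinity(0, sizeof(set), &set) == 0) {
    int n = CPU_COUNT(&set);
    if (n > 0) {
      return static_cast<unsigned>(n);
    }
  }
  unsigned hw = std::thread::hardware_concurrency();
  return hw > 0 ? hw : 1u;  // hardware_concurrency() may legally return 0
}
```

Run under `taskset -c 0`, this returns 1 even on a many-core machine, which is the number of threads the executor would then spawn.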

Implementation considerations

  • Implement an abstracted function in rcutils so that the platform-specific implementation can be concealed from the client libraries.

Related PRs

@smorita-esol

It would be nice if the new feature included not only the core number but also a configuration set of pinned cores and priorities for each thread.
The existing MultiThreadedExecutor does not care which core each thread is assigned to or what priority is set for each.
In a Linux environment, these thread attribute settings could partially be done using the CPU_SET or chrt features, but there might be some environments that do not have such features.
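For reference, on Linux both parts of such a per-thread configuration can be expressed through pthreads. This is only a sketch under that assumption, not a proposed rclcpp API; note that SCHED_FIFO typically requires CAP_SYS_NICE, so the priority step must be allowed to fail:

```cpp
#include <pthread.h>
#include <sched.h>
#include <thread>

// Pin a std::thread to a single core (the CPU_SET feature mentioned above).
static bool pin_to_core(std::thread & t, int cpu) {
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(cpu, &set);
  return pthread_setaffinity_np(t.native_handle(), sizeof(set), &set) == 0;
}

// Set a real-time priority (what `chrt` does from the command line).
// Usually requires CAP_SYS_NICE / root, so callers should tolerate failure.
static bool set_fifo_priority(std::thread & t, int priority) {
  sched_param sp{};
  sp.sched_priority = priority;
  return pthread_setschedparam(t.native_handle(), SCHED_FIFO, &sp) == 0;
}
```

Here `pin_to_core(t, 0)` corresponds to CPU_SET/taskset and `set_fifo_priority(t, 10)` to `chrt -f 10`; neither helper exists in rclcpp today.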

@fujitatomoya
Collaborator Author

It would be nice if the new feature included not only the core number but also a configuration set of pinned cores and priorities for each thread.

i am not sure about this. probably this is out of scope for this issue atm.

The existing MultiThreadedExecutor does not care which core each thread is assigned to or what priority is set for each.

So does SingleThreadedExecutor. For me, this can be applied to SingleThreadedExecutor, not only to MultiThreadedExecutor.
Even if we can set the priority for each thread in MultiThreadedExecutor, we cannot control which thread takes which event. I think it makes more sense to set the priority for the event in this MultiThreadedExecutor case.

i might be mistaken. if so, could you elaborate on the scenario with MultiThreadedExecutor?

@smorita-esol

i am not sure about this. probably this is out of scope for this issue atm.

Sorry for my misunderstanding. The title includes "Core affinity," so I thought you were also interested in implementing the core affinity settings (pinned core) for each executor thread, not only the number of cores.

So does SingleThreadedExecutor. For me, this can be applied to SingleThreadedExecutor, not only to MultiThreadedExecutor.
Even if we can set the priority for each thread in MultiThreadedExecutor, we cannot control which thread takes which event. I think it makes more sense to set the priority for the event in this MultiThreadedExecutor case.

Almost totally agree. However, I'm not sure yet whether the controlled unit should be an "event," an "event chain," or something else.
But even at this point, it seems beneficial for us to discuss how to control the scheduling parameters (including core affinity) for each thread belonging to the executor thread pool.

i might be mistaken. if so, could you elaborate on the scenario with MultiThreadedExecutor?

Thanks for the suggestion, but I have to investigate some more.
When done, I'll post it in another proper place (ROS Discourse?).

@fujitatomoya
Collaborator Author

sorry for the confusion. thanks for the information, that can be also discussed.

But even at this point, it seems beneficial for us to discuss how to control the scheduling parameters (including core affinity) for each thread belonging to the executor thread pool.

that could be useful for mission-critical applications. i believe those topics have been discussed in the micro-ROS community as well.

CC: @JanStaschulat @ralph-lange

@ralph-lange
Contributor

ralph-lange commented Dec 1, 2022

Even if we can set the priority for each thread in MultiThreadedExecutor, we cannot control which thread takes which event. I think it makes more sense to set the priority for the event in this MultiThreadedExecutor case.

Besides the fact that the current MultiThreadedExecutor interface does not allow assigning certain events (subscriptions, timers, etc.) to specific threads, priority inheritance on the wait_mutex_ of MultiThreadedExecutor would be required for querying new work from the middleware layer (rmw) using the wait_set.

This querying works as follows: the first thread of the MultiThreadedExecutor that detects that the current wait_set has been taken completely calls rcl_wait to populate the wait_set with new events from the middleware layer. When another thread of the MultiThreadedExecutor finishes its current work, it does not call rcl_wait again, but simply waits for the first thread to return from rcl_wait with the updated wait_set. If this second thread had a higher priority than the first, the first thread would have to inherit the second thread's priority to avoid being preempted by a thread whose priority lies between those of the first and second threads.
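A priority-inheritance mutex of the kind this would require is available through the POSIX API. The following is only a sketch of that mechanism; the current wait_mutex_ is an ordinary C++ mutex, so wiring this in would be an actual implementation change:

```cpp
#include <pthread.h>

// Initialize a mutex with the priority-inheritance protocol: a low-priority
// thread holding the lock temporarily inherits the priority of the highest-
// priority thread blocked on it. Returns 0 on success, an errno-style code
// otherwise.
static int init_pi_mutex(pthread_mutex_t * m) {
  pthread_mutexattr_t attr;
  int rc = pthread_mutexattr_init(&attr);
  if (rc != 0) {
    return rc;
  }
  rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
  if (rc == 0) {
    rc = pthread_mutex_init(m, &attr);
  }
  pthread_mutexattr_destroy(&attr);
  return rc;
}
```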

Note that since Galactic, callback groups of the same node may be distributed to multiple executor instances using add_callback_group, i.e., one may create multiple MultiThreadedExecutor and/or SingleThreadedExecutor instances, prioritize the thread(s) of each instance individually, and distribute the callback groups specifically to them.

@JanStaschulat

JanStaschulat commented Dec 1, 2022

It would be nice if the new feature included not only the core number but also a configuration set of pinned cores and priorities for each thread.

Alternative 1: Dispatcher Executor
In micro-ROS, we have designed a "Dispatcher Executor" to provide real-time multi-threaded scheduling, which works as follows:

Configuration:

At runtime:

  • The Executor runs in a dedicated thread and checks for new data in the DDS queue (with rcl_wait)
  • If a new message is available and the corresponding thread is ready (e.g. not currently processing some callback function), it takes the new message from the DDS queue (rcl_take) and notifies the corresponding worker thread via a condition variable. The message is not copied; only the pointer to it is passed to the worker thread.
  • The worker thread wakes up, executes the callback function, and goes to sleep again.

So, the rclc-Executor design differs from the rclcpp Executor in that not every thread in the thread pool calls rcl_wait/rcl_take; only one dedicated thread is responsible for that. The rclc-Executor currently provides a one-to-one correspondence between a worker thread with a (real-time) priority and a particular handle (subscription). This concept could be extended so that one thread is responsible for multiple handles.
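The dispatcher-to-worker handover described above can be sketched in plain C++, with std::thread standing in for the rclc machinery. `Worker`, `dispatch`, and `ready` are illustrative names, and an `int` pointer stands in for the message, which is handed over without being copied:

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// One worker thread bound to one "handle" (e.g. a subscription). The
// dispatcher hands over a pointer to the message; the payload is not copied.
struct Worker {
  std::mutex m;
  std::condition_variable cv;
  const int * msg = nullptr;  // pending message; nullptr == worker is ready
  bool stop = false;
  long sum = 0;               // stands in for the callback's side effects
  std::thread th;

  Worker() : th(&Worker::run, this) {}

  void run() {
    std::unique_lock<std::mutex> lk(m);
    while (true) {
      cv.wait(lk, [this] { return msg != nullptr || stop; });
      if (msg == nullptr) {   // stop requested and nothing pending
        return;
      }
      const int * taken = msg;
      lk.unlock();
      sum += *taken;          // "execute the callback function"
      lk.lock();
      msg = nullptr;          // worker is ready again
    }
  }

  // Dispatcher side: only called when the worker is ready (msg == nullptr),
  // mirroring the rcl_take-only-if-ready rule described above.
  void dispatch(const int * p) {
    { std::lock_guard<std::mutex> g(m); msg = p; }
    cv.notify_one();
  }

  bool ready() {
    std::lock_guard<std::mutex> g(m);
    return msg == nullptr;
  }

  void shutdown() {
    { std::lock_guard<std::mutex> g(m); stop = true; }
    cv.notify_one();
    th.join();
  }
};
```

In the real design the dispatcher's readiness check happens while it waits in rcl_wait, which is exactly where the detection delay listed under Limitations comes from.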

Limitations:

  • prioritization on the level of a handle
  • only subscription handle implemented
  • one-to-one correspondence between a handle and a worker thread; no thread-pool.
  • additional waiting-time delay in the Dispatcher Executor thread to detect that 1) new data is available and 2) a thread became ready. Background: a triggered guard_condition does not immediately wake up rcl_wait in micro-ROS. Therefore, while waiting for new messages, it is not possible to detect that a thread has finished its execution and is ready again. It might take some time (the rcl_wait timeout) until a ready worker thread is detected. (A message is only taken with rcl_take if the corresponding worker thread is ready.)

Further material

Alternative 2: API for callback-groups and real-time threads

As a second alternative, one could also design a real-time Executor in rclcpp using multiple callback groups. Each callback group would be executed by a dedicated (multi-threaded) Executor, which runs in a dedicated real-time thread. The user could then choose the callback group and thereby configure the priority of the thread in which the callback function will be processed.

Several people in the community have been requesting real-time configuration options in ROS 2. As a first step, we could set up an example with callback groups and then design a simple API, so that the user only has to specify the priority, and the organizational work (callback groups, creation of real-time threads) is done in the background.

@fujitatomoya fujitatomoya changed the title MultiThreadedExecutor CPU affinity aware MultiThreadedExecutor should use assigned CPUs in the process space if possible. Dec 1, 2022
@fujitatomoya
Collaborator Author

@ralph-lange @JanStaschulat

thank you so much for the detailed information and explanation. really informative.

currently, rclcpp and rclpy do not have any capability to control thread affinity, thread priority, or the matching of threads and events at all. they just process everything with single or multiple threads equally. (the order in which events are taken is statically implemented too.)
i am not really sure if the executor should be capable of doing this, but it is definitely worth discussing more for real-time use cases.

@smorita-esol hope this helps you. probably we can open the new issue for this discussion.

I would like to keep this issue dedicated to being the counterpart of ros2/rclpy#1031, and have adjusted the title a bit.

@smorita-esol

FYI

> i might be mistaken. if so, could you elaborate on the scenario with MultiThreadedExecutor?
>
> Thanks for the suggestion, but I have to investigate some more.
> When done, I'll post it in another proper place (ROS Discourse?).

I've created an issue and implemented proposal code to control the thread OS scheduling parameters, as below.
ros-realtime/ros-realtime.github.io#18

If you are interested in the feature, I'd appreciate it if you could make some comments on it.
