Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added realtime thread configuration support #406

Closed
wants to merge 9 commits into from

Conversation

WideAwakeTN
Copy link

@WideAwakeTN WideAwakeTN commented Jan 6, 2023

The changes in this pull request provide an OS independent method for setting a realtime thread priority and the CPU core affinity for a thread that has to be hard realtime and therefore is subject to FIFO scheduling as defined by the POSIX standard. In particular this pull request adds the new header thread.h which contains the function configure_native_realtime_thread.

This pull request is a prerequisite for the following pull request: ros2/rcpputils#174

Maybe someone can help making this pull request compile for QNX and VxWorks. I can just check the functionality for a Linux with RT patch. Since the content in this pull request focuses on realtime systems there is no support for Windows or Mac OS. On the latter two platforms the function configure_native_realtime_thread will simply return an error.

Outlook: it makes sense to also have a function in rcutils that allows setting the thread priority and CPU core affinity for fair/non-realtime scheduling use cases, but this functionality is not in the scope of this pull request which is part of a series of pull requests that ultimately aim to prevent priority inversions in ROS2 when using FIFO scheduling.

@fujitatomoya
Copy link
Collaborator

@WideAwakeTN thanks for opening PR.

The changes in this pull request provide an OS independent method for setting a realtime thread priority

I think if we stand with this motto, application should be consistent on any platform.

Since the content in this pull request focuses on realtime systems there is no support for Windows or Mac OS. On the latter two platforms the function configure_native_realtime_thread will simply return an error.

But actually it sometimes successes and sometimes fails based on the platform, as application perspective?

I am not against this approach, but it seems that PR only binds some system calls. that i think user application can do that w/o introducing specific APIs from ROS.
If i am not mistaken, i do not really see the benefit for ROS application to rely on these APIs, also including maintenance cost.

CC: @ros2/team @clalancette @ivanpauno @wjwwood

Signed-off-by: Martin Mayer <[email protected]>
Signed-off-by: Martin Mayer <[email protected]>
Signed-off-by: Martin Mayer <[email protected]>
Signed-off-by: Martin Mayer <[email protected]>
Signed-off-by: Martin Mayer <[email protected]>
Signed-off-by: Martin Mayer <[email protected]>
Signed-off-by: Martin Mayer <[email protected]>
Signed-off-by: Martin Mayer <[email protected]>
Signed-off-by: Martin Mayer <[email protected]>
@WideAwakeTN
Copy link
Author

WideAwakeTN commented Jan 6, 2023

@fujitatomoya thanks for the fast reply. I didn't expect a draft pull request to get immediate attention.

But actually it sometimes successes and sometimes fails based on the platform, as application perspective?

Yes, but it will not be flaky/sporadic when running it, i.e. the functionality consistently succeeds or fails on a specific platform. There are platforms that support FIFO/realtime scheduling and some that do not. In case an application developer wants FIFO/realtime scheduling he has to chose an appropriate OS. In Linux the new rcutils function configure_native_realtime_thread will require an installed realtime kernel patch to succeed.

Why put thread configuration functionality into rcutils? ROS2 creates threads and mutexes internally, which an application developer cannot configure (let's ignore DDS threads for now). This situation is not satisfactory for scenarios with high demands on determinism and performance. With mutexes the situation can be easily be improved by only using mutexes with priority inheritance enabled, because these will cooperate nicely with FIFO/realtime scheduling (i.e. avoid priority inversion). Regarding internal ROS2 threads I see the necessity of being able to configure them (their scheduling type, priority and core affinity), maybe through ROS2 parameter files one day. In anticipation of this I put the realtime thread configuration functionality inside rcutils, instead of just putting it into the new unit test for mutexes which needs that functionality for provoking a thread priority inversion (see pull request ros2/rcpputils#174 ).

Unfortunately C++ does not offer suitable thread configuration APIs which in turn leads to platform dependent code. Even POSIX support does not solve the problem completely since it's usually not implemented fully. This situation is indeed not good for maintenance cost. An alternative would be to have the application programmer provide threads to ROS2 through some kind of new factory interface. I only created draft pull requests since I assumed that a discussion might make sense.

@fujitatomoya
Copy link
Collaborator

thanks for sharing the detailed description.

I didn't expect a draft pull request to get immediate attention.

I am interested in this. the earlier, the better 😄

the functionality consistently succeeds or fails on a specific platform.

this is correct, there are things we can abstract or not with software implementation. maybe this is not the case.
and i think application can be responsible for API to check the return code to be consistent in application space.

ROS2 creates threads and mutexes internally, which an application developer cannot configure (let's ignore DDS threads for now).

So are you saying there would be a plan to add configuration options for ROS 2 Executor for example? based on these additional APIs?

@WideAwakeTN
Copy link
Author

WideAwakeTN commented Jan 7, 2023

So are you saying there would be a plan to add configuration options for ROS 2 Executor for example?

I am not having a concrete plan yet, but some ideas. Our discussion actually made me aware that there are at least two fundamental approaches for getting an internal ROS2 thread configuration implemented:

  1. Put the basic thread configuration functionality into rcl or rcutils and add some code to the executor classes.
  2. Add a thread creation factory interface to rcpputils or rclcpp which can be passed to ROS2 entities, like the executors, via an optional constructor parameter. It is up to the application programmer to implement the thread creation factory interface for his platform.

Not sure yet which approach is better. Solution 2 might be interesting because it keeps the OS dependent part out of ROS2 and allows the application programmer to create thread pools (this might not be necessary or useful, though?). Solution 1 is nice because the executors could read the thread configuration from a ros parameters file which could make thread configuration very convenient, but there might also be a way to achieve that with solution 2, e.g. by providing an example/template implementation. Thoughts, opinions, suggestions?

In case solution 2 is the preferred one I would close this pull request and adapt my rcpputils pull request accordingly (ros2/rcpputils#174).

@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/avoiding-priority-inversions-includes-draft-pull-requests/29219/1

@WideAwakeTN
Copy link
Author

A ROS Discourse article has been created for discussing this pull request:
https://discourse.ros.org/t/avoiding-priority-inversions-includes-draft-pull-requests/29219

@jrutgeer
Copy link

jrutgeer commented Jan 16, 2023

@WideAwakeTN Thanks for the PR, I think this is really valuable!

@fujitatomoya

I am not against this approach, but it seems that PR only binds some system calls. that i think user application can do that w/o introducing specific APIs from ROS.
If i am not mistaken, i do not really see the benefit for ROS application to rely on these APIs, also including maintenance cost.

Well, there are currently similar utility functions:

And there's probably others that I don't know of. That's a lot of duplicated effort.
Moreover, for an end-user like me (mechanical engineering background) these are not just "some system calls" but "yet another subject to master".

So whether or not this belongs in rcutils is open for debate, but in any case, given realtime support was one of the key use cases for ROS2, it surprises me that these kind of support functions (and probably also other OS abstractions) haven't been integrated in a core package since the beginning. Ideally I think above links should be merged into one default realtime utility repository (in the realtime support repository package?).

That being said, if no clear decision is taken wrt this PR, it should for sure not block ros2/rcpputils#174

@fujitatomoya
Copy link
Collaborator

@WideAwakeTN thanks for the comment.

given realtime support was one of the key use cases for ROS2

agree. again, i am not against this.

Ideally I think above links should be merged into one default realtime utility repository (in the realtime support repository package?).

i think that was the original idea for https://github.com/ros-realtime.

@carlossvg
Copy link

Ideally I think above links should be merged into one default realtime utility repository (in the realtime support repository package?).

I agree, both https://github.com/ros-realtime and https://github.com/ros2/realtime_support would be good places to implement this functionality.

We have a related issue open here: ros-realtime/community#14

@nightduck @shuhaowu @razr were interested in this topic so they may want to provide feedback or contribute

@WideAwakeTN
Copy link
Author

@carlossvg @fujitatomoya
Having a dedicated repo/package for realtime and OS specific code sounds like a good idea. Such a repo/package should be part of https://github.com/ros2 since other ROS2 core components might need access to that functionality. Thus https://github.com/ros2/realtime_support would be more suited than https://github.com/ros-realtime, I guess. The code of this PR, and PR ros2/rcpputils#174 , could certainly be transferred to https://github.com/ros2/realtime_support.

Another approach is to have all realtime and OS specific implementations hidden behind interfaces, from the point of view of core ROS2 components. It would then be up to the application programmer to provide actual implementations, but of course the community could provide packages with example implementations (one for each OS).

Not sure which of these two approaches is more advantageous. Thoughts, suggestions?

@fujitatomoya
Copy link
Collaborator

i am not sure if https://github.com/ros2/realtime_support is meant to be that scope? there seems to be tlsf test sample only.
for me, https://github.com/ros-realtime would be the best place for now, if this PR is dedicated to realtime capability.
i think this is the way it is for now, since https://github.com/ros-realtime provides many useful tools and features in realtime system.

Another approach is to have all realtime and OS specific implementations hidden behind interfaces, from the point of view of core ROS2 components. It would then be up to the application programmer to provide actual implementations, but of course the community could provide packages with example implementations (one for each OS).

this is ideal. user can plugin their own implementation at runtime.

IMO it would be nice to discuss on realtime support for ROS 2.
I understand that currently which is not in the mainline repo, https://github.com/ros2/ros2/blob/rolling/ros2.repos.
But realtime could be key value for ROS 2, besides significant structural change taking place right now, we probably want to discuss on this for future.

sorry for answering question with questions, i did not answer clearly, but i am open to hear more opinion.

@nightduck
Copy link

i am not sure if https://github.com/ros2/realtime_support is meant to be that scope? there seems to be tlsf test sample only.

In the RTWG meeting, we decided that the https://github.com/ros2/realtime_support repository would be best used as a directory for other realtime packages. Being in the base ros2 organization, it's more accessible to those who may not know exactly where to look.

There was talk of spinning out the tlsf sample into its own repo, and replace the contents of realtime_support with documentation and a repos file. Most of the repositories references would be hosted at https://github.com/ros-realtime.

Currently, we've been trying to get ahold of @methylDragon to gain admin access to the repo but have been unsuccessful

@methylDragon
Copy link
Contributor

i am not sure if https://github.com/ros2/realtime_support is meant to be that scope? there seems to be tlsf test sample only.

In the RTWG meeting, we decided that the https://github.com/ros2/realtime_support repository would be best used as a directory for other realtime packages. Being in the base ros2 organization, it's more accessible to those who may not know exactly where to look.

There was talk of spinning out the tlsf sample into its own repo, and replace the contents of realtime_support with documentation and a repos file. Most of the repositories references would be hosted at https://github.com/ros-realtime.

Currently, we've been trying to get ahold of @methylDragon to gain admin access to the repo but have been unsuccessful

Hey there! How did you reach out to me? I haven't received any communications :o

That aside, let me see what I can do about access

@nightduck
Copy link

Hey there! How did you reach out to me? I haven't received any communications :o

Oh hi! We reached out on your open robotics email a couple times a few weeks ago.

Since you're the maintainer of the tlsf_cpp and rttest, do you want to maintain ownership when we spin them out?

I made an issue over there where we can discuss more, so as not to hijack this thread.

@methylDragon
Copy link
Contributor

i am not sure if https://github.com/ros2/realtime_support is meant to be that scope? there seems to be tlsf test sample only.

In the RTWG meeting, we decided that the https://github.com/ros2/realtime_support repository would be best used as a directory for other realtime packages. Being in the base ros2 organization, it's more accessible to those who may not know exactly where to look.

There was talk of spinning out the tlsf sample into its own repo, and replace the contents of realtime_support with documentation and a repos file. Most of the repositories references would be hosted at https://github.com/ros-realtime.

Currently, we've been trying to get ahold of @methylDragon to gain admin access to the repo but have been unsuccessful

Hey @nightduck , after some internal discussions, I'm not sure that it is possible to give admin access to ros2 repositories to folks outside of the ros2 org. I think push interactions with this repo would then be the same with any other ros2 repo, where contributors fork and send PRs.

I'm not sure if this would change the decision to use this repo, or if another solution (like linking to the ros-realtime org in this repo's README would work better?

@methylDragon
Copy link
Contributor

Hey there! How did you reach out to me? I haven't received any communications :o

Oh hi! We reached out on your open robotics email a couple times a few weeks ago.

Since you're the maintainer of the tlsf_cpp and rttest, do you want to maintain ownership when we spin them out?

I made an issue over there where we can discuss more, so as not to hijack this thread.

Ah, I see an email that I missed, apologies for that!
I think there's no need for me to maintain ownership for any spun out packages if the packages aren't under the ros2 org, so don't worry about me!

@WideAwakeTN
Copy link
Author

@nightduck @fujitatomoya

In the RTWG meeting, we decided that the https://github.com/ros2/realtime_support repository would be best used as a directory for other realtime packages. Being in the base ros2 organization, it's more accessible to those who may not know exactly where to look.

Makes sense. I will close this draft pull request and already changed my related draft pull request ros2/rcpputils#174 accordingly (which adds mutex classes to ROS2 that have thread priority inheritance enabled).

Thanks everyone for participating in this discussion. Helped me getting a better feeling for ROS2 development. It makes sense to implement thread configuration functionality for ROS2 in a different manner than it was attempted here.

@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/avoiding-priority-inversions-includes-draft-pull-requests/29219/7

@WideAwakeTN
Copy link
Author

Another related question: should the code of my priority inheritance enabled mutex classes be moved from ros2/rcpputils to ros2/realtime_support ? What repo/package is more suitable?
(Draft PR link: ros2/rcpputils#174 )

@fujitatomoya
Copy link
Collaborator

@WideAwakeTN @nightduck @carlossvg

thank you so much for your time and discussion today.

this feature can only be enabled with realtime kernel patch, right? so most of the user cannot take advantage of this feature instead of building source code on top of realtime kernel? i mean that most of the user by here is that user installs packages w/o source code build.

if that is the case we have, probably it would be better to support this as https://github.com/ros-realtime to integrate all related code and procedure?

@jrutgeer
Copy link

@WideAwakeTN As ros2/realtime_support is rather an 'optional' package, I expect that a dependency of rclcpp on ros2/realtime_support is not desirable, so I'd say it's not an option to put ros2/rcpputils#174 in ros2/realtime_support.

@WideAwakeTN
Copy link
Author

WideAwakeTN commented Jan 31, 2023

@fujitatomoya

this feature can only be enabled with realtime kernel patch, right? so most of the user cannot take advantage of this feature instead of building source code on top of realtime kernel? i mean that most of the user by here is that user installs packages w/o source code build.

The new thread configuration functions of this draft PR require the RT patch for Linux during runtime but not during compile time. This means the compiled binaries would be the same, regardless if they were compiled on Linux with or without RT patch. The reason is that some enums/flags would have no effect without the RT patch or result in an error during a system function call. But indeed, the new function configure_native_realtime_thread would always fail when called on a Linux without RT patch.

I closed this draft PR since I guess some more discussions are needed regarding the whole thread configuration topic, i.e. how to do it and where to implement/store it, and my main concern for now is to get my new mutex classes reviewed and hopefully approved. I guess I will open a discussion topic on ROS discourse about realtime configuration some time in the coming weeks/months.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants