(NumberOfEntities) improve performance #1285

Merged on May 23, 2024 (2 commits)

MatthijsBurgh
Contributor

@MatthijsBurgh MatthijsBurgh commented May 17, 2024

The MultiThreadedExecutor has very poor performance, see #1223.
Refactoring it is not an easy task, so this PR starts with some low-hanging fruit: improving the performance of NumberOfEntities.

These changes are not compatible with classes inheriting from this class, but the __repr__ wasn't compatible with them either. I don't expect any class to inherit from this class any time soon.
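
To illustrate the subclass caveat, here is a minimal sketch. `Counts` and `ExtendedCounts` are hypothetical classes invented for this example (they are not part of rclpy), and the field `num_extras` is likewise an assumption. It shows that an `__add__` which names every attribute explicitly silently skips a field added by a child class:

```python
class Counts:
    """Hypothetical stand-in for a slotted counter class."""

    __slots__ = ["num_timers", "num_clients"]

    def __init__(self, num_timers=0, num_clients=0):
        self.num_timers = num_timers
        self.num_clients = num_clients

    def __add__(self, other):
        # Explicit attribute access: fast, but only knows these two fields.
        result = self.__class__()
        result.num_timers = self.num_timers + other.num_timers
        result.num_clients = self.num_clients + other.num_clients
        return result


class ExtendedCounts(Counts):
    """Hypothetical child class that adds one more counter."""

    __slots__ = ["num_extras"]

    def __init__(self, num_timers=0, num_clients=0, num_extras=0):
        super().__init__(num_timers, num_clients)
        self.num_extras = num_extras


total = ExtendedCounts(1, 2, 3) + ExtendedCounts(10, 20, 30)
# num_extras keeps its default because the inherited __add__ never touches it.
print(type(total).__name__, total.num_timers, total.num_clients, total.num_extras)
# prints: ExtendedCounts 11 22 0
```

A child class would therefore have to override `__add__` (and `__iadd__`) itself, which is the incompatibility described above.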

Signed-off-by: Matthijs van der Burgh <[email protected]>
This indeed does not work for child classes, but the __repr__ was also not working for child classes. Currently there are no child classes, so performance is more important. See ros2#1223

Signed-off-by: Matthijs van der Burgh <[email protected]>
@fujitatomoya
Collaborator

@MatthijsBurgh thanks for creating the PR.

Could you add a description here, such as which issue you are trying to address and some background information?

@MatthijsBurgh
Contributor Author

@fujitatomoya done

@clalancette
Contributor

> @fujitatomoya done

Can you please explain why this change helps the situation?

@MatthijsBurgh
Contributor Author

@clalancette please check the following performance example.

# Variant 1: __add__ and __iadd__ loop over __slots__ using getattr/setattr.
class NumberOfEntities:
    __slots__ = [
        "num_subscriptions",
        "num_guard_conditions",
        "num_timers",
        "num_clients",
        "num_services",
        "num_events",
    ]

    def __init__(
        self,
        num_subs=0,
        num_gcs=0,
        num_timers=0,
        num_clients=0,
        num_services=0,
        num_events=0,
    ):
        self.num_subscriptions = num_subs
        self.num_guard_conditions = num_gcs
        self.num_timers = num_timers
        self.num_clients = num_clients
        self.num_services = num_services
        self.num_events = num_events

    def __add__(self, other):
        result = self.__class__()
        for attr in result.__slots__:
            left = getattr(self, attr)
            right = getattr(other, attr)
            setattr(result, attr, left + right)
        return result

    def __iadd__(self, other):
        for attr in self.__slots__:
            left = getattr(self, attr)
            right = getattr(other, attr)
            setattr(self, attr, left + right)
        return self

    def __repr__(self):
        return "<{0}({1}, {2}, {3}, {4}, {5}, {6})>".format(
            self.__class__.__name__,
            self.num_subscriptions,
            self.num_guard_conditions,
            self.num_timers,
            self.num_clients,
            self.num_services,
            self.num_events,
        )


# Variant 2: __add__ and __iadd__ access every attribute explicitly.
class NumberOfEntities2:
    __slots__ = [
        "num_subscriptions",
        "num_guard_conditions",
        "num_timers",
        "num_clients",
        "num_services",
        "num_events",
    ]

    def __init__(
        self,
        num_subs=0,
        num_gcs=0,
        num_timers=0,
        num_clients=0,
        num_services=0,
        num_events=0,
    ):
        self.num_subscriptions = num_subs
        self.num_guard_conditions = num_gcs
        self.num_timers = num_timers
        self.num_clients = num_clients
        self.num_services = num_services
        self.num_events = num_events

    def __add__(self, other):
        result = self.__class__()
        result.num_subscriptions = self.num_subscriptions + other.num_subscriptions
        result.num_guard_conditions = self.num_guard_conditions + other.num_guard_conditions
        result.num_timers = self.num_timers + other.num_timers
        result.num_clients = self.num_clients + other.num_clients
        result.num_services = self.num_services + other.num_services
        result.num_events = self.num_events + other.num_events
        return result

    def __iadd__(self, other):
        self.num_subscriptions += other.num_subscriptions
        self.num_guard_conditions += other.num_guard_conditions
        self.num_timers += other.num_timers
        self.num_clients += other.num_clients
        self.num_services += other.num_services
        self.num_events += other.num_events
        return self

    def __repr__(self):
        return "<{0}({1}, {2}, {3}, {4}, {5}, {6})>".format(
            self.__class__.__name__,
            self.num_subscriptions,
            self.num_guard_conditions,
            self.num_timers,
            self.num_clients,
            self.num_services,
            self.num_events,
        )


# bla1/bla2 use the getattr/setattr loop; bla3/bla4 use explicit attribute access.
bla1 = NumberOfEntities(1, 2, 3, 4, 5, 6)
bla2 = NumberOfEntities(1, 2, 3, 4, 5, 6)
bla3 = NumberOfEntities2(1, 2, 3, 4, 5, 6)
bla4 = NumberOfEntities2(1, 2, 3, 4, 5, 6)

# Loop-based __add__
for _ in range(1000000):
    bla1 + bla2
real	0m0.753s
user	0m0.748s
sys	0m0.005s

# Loop-based __iadd__
for _ in range(1000000):
    bla1 += bla2
real	0m0.602s
user	0m0.601s
sys	0m0.000s

# Explicit __add__
for _ in range(1000000):
    bla3 + bla4
real	0m0.398s
user	0m0.393s
sys	0m0.004s

# Explicit __iadd__
for _ in range(1000000):
    bla3 += bla4
real	0m0.337s
user	0m0.333s
sys	0m0.005s
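
For anyone who wants to reproduce the comparison in a single script, the following is a minimal sketch using the standard-library `timeit` module. `LoopAdd`, `ExplicitAdd`, `FIELDS`, and `time_add` are hypothetical names chosen for this example, and the classes are cut down to three fields and to `__add__` only; they are stand-ins, not the rclpy classes. Absolute numbers will differ per machine and Python version.

```python
import timeit

FIELDS = ("num_subscriptions", "num_guard_conditions", "num_timers")


class LoopAdd:
    """Hypothetical stand-in: sums fields by iterating over __slots__."""

    __slots__ = FIELDS

    def __init__(self, num_subs=0, num_gcs=0, num_timers=0):
        self.num_subscriptions = num_subs
        self.num_guard_conditions = num_gcs
        self.num_timers = num_timers

    def __add__(self, other):
        result = self.__class__()
        for attr in result.__slots__:
            setattr(result, attr, getattr(self, attr) + getattr(other, attr))
        return result


class ExplicitAdd:
    """Hypothetical stand-in: sums each field with direct attribute access."""

    __slots__ = FIELDS

    def __init__(self, num_subs=0, num_gcs=0, num_timers=0):
        self.num_subscriptions = num_subs
        self.num_guard_conditions = num_gcs
        self.num_timers = num_timers

    def __add__(self, other):
        result = self.__class__()
        result.num_subscriptions = self.num_subscriptions + other.num_subscriptions
        result.num_guard_conditions = self.num_guard_conditions + other.num_guard_conditions
        result.num_timers = self.num_timers + other.num_timers
        return result


def time_add(cls, number=100_000):
    """Return the seconds taken by `number` evaluations of left + right."""
    left, right = cls(1, 2, 3), cls(1, 2, 3)
    return timeit.timeit(lambda: left + right, number=number)


if __name__ == "__main__":
    for cls in (LoopAdd, ExplicitAdd):
        print(f"{cls.__name__}: {time_add(cls):.3f} s")
```

Measuring with `timeit` times only the repeated statement, so interpreter startup and class definition are excluded from the result.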

Collaborator

@fujitatomoya fujitatomoya left a comment

According to the performance analysis in #1223 (comment), _wait_for_ready_callbacks is one of the major performance bottlenecks. So this fix is meant to speed up the processing at https://github.com/ros2/rclpy/blob/rolling/rclpy/rclpy/executors.py#L655, right?

The downside is that we need to update __add__ and __iadd__ whenever __slots__ changes, but I think __repr__ is already in that situation. As a trade-off for the performance gain, I think that is okay.
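
One way to limit that maintenance risk would be a small guard test that fails when a slot is added without updating the operators. The sketch below is only an illustration under stated assumptions: `Counts` and `check_add_covers_all_slots` are hypothetical names that do not exist in rclpy, and the check assumes the class can be constructed without arguments and stores integers in every slot.

```python
class Counts:
    """Hypothetical slotted counter with explicit __add__ and __iadd__."""

    __slots__ = ["num_timers", "num_clients", "num_services"]

    def __init__(self, num_timers=0, num_clients=0, num_services=0):
        self.num_timers = num_timers
        self.num_clients = num_clients
        self.num_services = num_services

    def __add__(self, other):
        result = self.__class__()
        result.num_timers = self.num_timers + other.num_timers
        result.num_clients = self.num_clients + other.num_clients
        result.num_services = self.num_services + other.num_services
        return result

    def __iadd__(self, other):
        self.num_timers += other.num_timers
        self.num_clients += other.num_clients
        self.num_services += other.num_services
        return self


def check_add_covers_all_slots(cls):
    """Raise AssertionError if __add__ or __iadd__ skips any slot of cls."""
    left, right = cls(), cls()
    expected = {}
    for index, name in enumerate(cls.__slots__, start=1):
        # Give every slot a distinct value so a skipped slot is detectable.
        setattr(left, name, index)
        setattr(right, name, 10 * index)
        expected[name] = 11 * index

    total = left + right
    left += right
    for name, value in expected.items():
        if getattr(total, name) != value:
            raise AssertionError(f"__add__ does not sum {name!r}")
        if getattr(left, name) != value:
            raise AssertionError(f"__iadd__ does not sum {name!r}")


check_add_covers_all_slots(Counts)  # passes silently for this class
```

Such a check keeps the explicit, faster operators while still catching a forgotten field in a test run.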

lgtm with green CI.

@clalancette @sloretz what do you think?

@MatthijsBurgh
Contributor Author

@fujitatomoya your conclusion is correct

@clalancette
Contributor

> @clalancette @sloretz what do you think?

Yep, I think this is entirely reasonable to get a bit more performance out of this.

Contributor

@clalancette clalancette left a comment

Looks good to me with green CI.

@fujitatomoya
Collaborator

fujitatomoya commented May 21, 2024

CI:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@clalancette clalancette merged commit 786c464 into ros2:rolling May 23, 2024
3 checks passed
@MatthijsBurgh MatthijsBurgh deleted the patch-1 branch May 24, 2024 06:22
jplapp pushed a commit to pixel-robotics/rclpy that referenced this pull request Dec 5, 2024
* (NumberOfEntities) add __iadd__

* (NumberOfEntities) improve __add__ performance

This indeed does not work for child classes, but the __repr__ was also not working for child classes. Currently there are no child classes, so performance is more important. See ros2#1223

Signed-off-by: Matthijs van der Burgh <[email protected]>
(cherry picked from commit 786c464)