[BUG]: Crash of RoutingManagerd version 3.4.10 #730

Open
akhzarj opened this issue Jun 26, 2024 · 7 comments

@akhzarj

akhzarj commented Jun 26, 2024

vSomeip Version

v.3.4.10

Boost Version

1.78.0

Environment

Target: Test bench with automated test running
OS: Embedded Linux

Describe the bug

During testing we observed several crashes (around 6) of routingmanagerd:

  • routingmanagerd core dumped with SIGSEGV, Segmentation fault
  • routingmanagerd core dumped with SIGABRT, Aborted

Details in provided back-traces.

Reproduction Steps

Run several hundred (300-400) test loops on the target, each running various applications.

Expected behaviour

routingmanagerd should not crash.

Logs and Screenshots

No response

@akhzarj akhzarj added the bug label Jun 26, 2024
@lutzbichler
Collaborator

I am very interested in reproducing this. Could you provide some more details about the "Reproduction Steps", especially which applications were used and what exactly the test loops look like?

@duartenfonseca
Collaborator

@akhzarj can you give some indications on how we could reproduce it?

@akhzarj
Author

akhzarj commented Oct 10, 2024

Hi @duartenfonseca, @lutzbichler
We have already found the root cause: it is related to dangling references in
implementation/endpoints/src/tcp_client_endpoint_impl.cpp
combined with the dual behavior of strand::dispatch():
https://www.boost.org/doc/libs/1_80_0/doc/html/boost_asio/reference/strand/dispatch.html

When the strand is busy, the passed function is scheduled and executes only after dispatch() and its calling function have returned, so the references it holds to the caller's local variables dangle.
To reproduce the crash, the relevant strands need to be stressed until they become busy.
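
To illustrate the two dispatch() paths described above, here is a minimal sketch that uses only the standard library. The toy_strand class, schedule_resize() and run_busy_strand_demo() are hypothetical names invented for this illustration; toy_strand is a simplified stand-in, not Boost.Asio's strand and not vsomeip code. It shows a handler that stays valid on the deferred path because it captures by value.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for a strand (NOT Boost.Asio's implementation):
// dispatch() runs the handler at once when idle and queues it when busy.
class toy_strand {
public:
    void dispatch(std::function<void()> handler) {
        if (busy_) {
            pending_.push(std::move(handler)); // deferred: runs after the caller returned
        } else {
            handler();                         // immediate: the caller's locals are still alive
        }
    }

    void set_busy(bool busy) { busy_ = busy; }

    // Runs every handler that was deferred while the strand was busy.
    void run_pending() {
        while (!pending_.empty()) {
            auto handler = std::move(pending_.front());
            pending_.pop();
            handler();
        }
    }

private:
    bool busy_ = false;
    std::queue<std::function<void()>> pending_;
};

using buffer_t = std::vector<unsigned char>;
using buffer_ptr_t = std::shared_ptr<buffer_t>;

// Safe pattern: the handler captures the shared_ptr and the size BY VALUE,
// so it owns what it touches even if it runs after this function returned.
// Capturing &buffer or &size instead would dangle on the deferred path.
void schedule_resize(toy_strand& strand, buffer_ptr_t buffer, std::size_t size) {
    strand.dispatch([buffer, size]() { buffer->resize(size); });
}

std::size_t run_busy_strand_demo() {
    toy_strand strand;
    auto buffer = std::make_shared<buffer_t>();

    strand.set_busy(true);
    schedule_resize(strand, buffer, 16); // handler is queued, not run
    assert(buffer->empty());

    strand.set_busy(false);
    strand.run_pending();                // handler runs after schedule_resize() returned
    return buffer->size();
}
```

In this sketch run_busy_strand_demo() returns 16: the resize happens only after schedule_resize() has returned, which is exactly the window in which a by-reference capture of its parameters would no longer be valid.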

The fix is removing references in:

Notes:

  • In the last one, the fix additionally replaces the lambda with std::bind() because of lambda immutability; alternatively, make the lambda mutable.
  • This has not been checked against the latest version of vsomeip, but you can easily check whether any new or updated strand::dispatch()
    calls contain the same issue.
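
The "lambda immutability" note presumably refers to the fact that by-value captures are const inside a lambda body unless the lambda is declared mutable. The sketch below contrasts the two options mentioned; append_marker(), make_mutable_lambda() and make_bound() are hypothetical helpers written for this illustration, not code from vsomeip or from the fix.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical handler that needs to modify the data it is given.
std::size_t append_marker(std::vector<int>& values) {
    values.push_back(42);
    return values.size();
}

// Option 1: by-value capture in a mutable lambda. Without `mutable`, the
// captured copy is const inside the body and append_marker(values) would
// not compile.
std::function<std::size_t()> make_mutable_lambda(std::vector<int> values) {
    return [values]() mutable { return append_marker(values); };
}

// Option 2: std::bind stores its own copy of `values` and passes that copy
// to the target as a non-const lvalue, so no `mutable` is needed.
std::function<std::size_t()> make_bound(std::vector<int> values) {
    return std::bind(&append_marker, values);
}
```

Both callables own their copy of the data, so they remain valid after the function that created them has returned. Each call modifies the stored copy, not the caller's original vector.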

@fcmonteiro
Collaborator

Hi @akhzarj,
So, if I remember correctly, this is the same issue we discussed some time ago in the monthly meeting.
The fix is not yet in master, but I asked @kheaactua to create a PR with the fix.
Can you have a look at #774 and update it? It seems it does not contain all the changes.
Thanks! :)

@kheaactua
Contributor

hmm, it looks like I am missing:

strand_.dispatch([self, &_recv_buffer, _recv_buffer_size, its_missing_capacity](){

I'll add that now.

kheaactua added a commit to kheaactua/vsomeip-sd that referenced this issue Oct 10, 2024
This reference was reported in COVESA#730

Co-authored-by: AramKh
@akhzarj
Author

akhzarj commented Oct 11, 2024

Hi @fcmonteiro
Yes, you remember correctly, and PR #774 contains the fix that we have,
with the last update from @kheaactua.

kheaactua added a commit to kheaactua/vsomeip-sd that referenced this issue Oct 11, 2024
These were reported in COVESA#730

Co-authored-by: Aram Khzarjyan
Co-authored-by: Sambit Patnaik