Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent deadlock in sensors system #2025

Merged
merged 1 commit into from
Jul 7, 2023
Merged

Prevent deadlock in sensors system #2025

merged 1 commit into from
Jul 7, 2023

Conversation

iche033
Copy link
Contributor

@iche033 iche033 commented Jun 28, 2023

🦟 Bug fix

Summary

The threading and lock mechanism in sensors system can potentially run into deadlock, especially after changes in #1938.

This deadlock is found by users of gz-sim that use a custom sim update loop. I haven't been able to reproduce it in gz-sim yet but I was able to reproduce it with the custom update loop when the system in under high load.

More details on the deadlock:

There are 2 threads in the sensors system that use a condition variable to unblock each other in order to do a soft lockstep between physics and sensors. Deadlock occurs when both threads are waiting on each other:

  • rendering thread's wait
  • main thread's wait

In my testing, I noticed that the main thread's notify_one call does not always unblock the rendering thread's wait immediately. So if the main thread continues to the next iteration and waits again before the rendering thread wakes up, deadlock occurs.

The workaround is to add a timeout to the wait call in the rendering thread so that it does not wait forever. The side effect of this change is that the rendering thread is now semi-polling for updates. If there are sensors that need update, there shouldn't be any noticeable change. However, if no sensors need to be updated, we'll see that the rendering thread would wake up every second but does nothing and goes back to waiting again.

Checklist

  • Signed all commits for DCO
  • Added tests
  • Updated documentation (as needed)
  • Updated migration guide (as needed)
  • Consider updating Python bindings (if the library has them)
  • codecheck passed (See contributing)
  • All tests passed (See test coverage)
  • While waiting for a review on your PR, please help review another open pull request to support the maintainers

Note to maintainers: Remember to use Squash-Merge and edit the commit message to match the pull request summary while retaining Signed-off-by messages.

@iche033 iche033 requested a review from mjcarroll June 28, 2023 22:39
@github-actions github-actions bot added the 🌱 garden Ignition Garden label Jun 28, 2023
@codecov
Copy link

codecov bot commented Jun 28, 2023

Codecov Report

Merging #2025 (ddf9bdb) into gz-sim7 (001dd9b) will increase coverage by 0.02%.
The diff coverage is 100.00%.

❗ Current head ddf9bdb differs from pull request most recent head 8743ce1. Consider uploading reports for the commit 8743ce1 to get more accurate results

@@             Coverage Diff             @@
##           gz-sim7    #2025      +/-   ##
===========================================
+ Coverage    64.99%   65.01%   +0.02%     
===========================================
  Files          353      353              
  Lines        28618    28620       +2     
===========================================
+ Hits         18600    18608       +8     
+ Misses       10018    10012       -6     
Impacted Files Coverage Δ
src/systems/sensors/Sensors.cc 62.67% <100.00%> (-0.08%) ⬇️

... and 2 files with indirect coverage changes

@mjcarroll mjcarroll merged commit b2c576b into gz-sim7 Jul 7, 2023
@mjcarroll mjcarroll deleted the sensors_wait branch July 7, 2023 20:44
iche033 added a commit that referenced this pull request Jul 27, 2023
iche033 added a commit that referenced this pull request Jul 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🌱 garden Ignition Garden
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants