
Reduce CPU usage when idle #775

Open · wants to merge 7 commits into master
Conversation

scottwey

Currently, the tight loop in Engine causes very high single-core CPU usage when idle. Worse, this is long-running blocking code running inside an async task, which ties up an async worker entirely. On systems with lower core counts, this will likely hurt performance noticeably.

From my testing, this change dramatically drops CPU usage with minimal impact to performance, although I have not had a chance to benchmark properly.


github-actions bot commented Sep 15, 2024

Code Metrics Report
  ===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 C Header                2           35           28            0            7
 Dockerfile              1           34           25            0            9
 Happy                   1          442          369            0           73
 JSON                   12          105          104            0            1
 Python                 52         2268         1930           69          269
 TOML                   20          625          559            2           64
 YAML                    2           21           19            2            0
-------------------------------------------------------------------------------
 Jupyter Notebooks       4            0            0            0            0
 |- Markdown             2           77           32           31           14
 |- Python               2          196          169            1           26
 (Total)                            273          201           32           40
-------------------------------------------------------------------------------
 Markdown               38         2765            0         2098          667
 |- BASH                 6          103          100            0            3
 |- JSON                 1           12           12            0            0
 |- Python               5           92           82            0           10
 |- Rust                 9          322          274            0           48
 |- TOML                 2           75           63            0           12
 (Total)                           3369          531         2098          740
-------------------------------------------------------------------------------
 Rust                  260        75712        68239         1548         5925
 |- Markdown           123         1217           25         1117           75
 (Total)                          76929        68264         2665         6000
===============================================================================
 Total                 393        82007        71273         3719         7015
===============================================================================


scottwey commented Sep 15, 2024

@EricLBuehler I switched to using yield_now only when nothing is scheduled instead of sleeping indiscriminately. I also did a bit of refactoring to hopefully make things a bit cleaner. I tested a bit and things are looking good. :)


self.scheduler.free_finished_sequence_groups();
if self.scheduler.waiting_len() == 0 {
    tokio::task::yield_now().await;
EricLBuehler (Owner)
I like the idea of the tokio::select! addition! I'm just a bit confused about how this function would implement what we want (that is, to not sit in a loop). If I understand correctly, it yields to tokio's runtime, which means we wait for the other arm of the tokio::select! (the request-receive arm) to match?

Perhaps you could add a comment here explaining what the logic/flow is?

scottwey (Author) commented Sep 17, 2024

You understand my proposed flow correctly. My understanding is that the reason we no longer sit in the loop is mostly the yield_now: it marks the task as Pending for one iteration of Tokio's runtime. On the next iteration, yield_now is marked Ready and the loop can continue. Yielding doesn't take much time, but it is enough to lower CPU usage dramatically.

scottwey (Author)

@EricLBuehler Thoughts on this? Happy to close this and rethink if this change doesn't make sense to you.

EricLBuehler (Owner) left a comment

@scottwey I think that after this change, and once you've done some testing of it, this should be good to merge!

Finally, if you could drop some rough metrics on CPU usage before vs after it would be great too.

mistralrs-core/src/engine/mod.rs (outdated review thread, resolved)

scottwey commented Oct 12, 2024

@EricLBuehler
Thanks for the patience here. I have had 0 time for anything besides work.
I finally had the time to leave this running and use it as a backend for a lot of personal inference and I don't see any obvious issues. Let me know if there are any specific things you want me to be on the lookout for.
In terms of CPU usage, it drops from a single core pegged at 100% down to ~1-2% on that core. Happy to gather more rigorous metrics if you would like. Just let me know what you would want to see.

Otherwise, I will get the conflicts sorted and await further review.

EricLBuehler (Owner)

@scottwey thank you for the testing! Great to see the 100% -> ~1% utilization :).

If you could resolve the conflicts, we can absolutely merge.
