WIP: add jobtap plugin for preemption #6580
Open
garlick wants to merge 2 commits into flux-framework:master
Problem: RFC 14 specifies that a job can set the preemptible-after system attribute in its jobspec to signal that it can be preempted, but flux-core and flux-sched currently ignore this attribute. The scheduler is the proper place to handle preemption, but in the meantime, provide a jobtap plugin that can do a sloppy job of it without knowledge of the schedule.
Problem: there are no tests for the killbot preemption plugin. Add a sharness test.
Codecov Report

@@            Coverage Diff             @@
##           master    #6580      +/-   ##
==========================================
+ Coverage   79.47%   79.49%   +0.01%
==========================================
  Files         531      532       +1
  Lines       88312    88607     +295
==========================================
+ Hits        70184    70434     +250
- Misses      18128    18173      +45
This is a first cut at a jobtap plugin that dispatches preemptible jobs when there is job pressure from non-preemptible jobs, as discussed in #6524. It handles preemptible-after > 0 and doesn't preempt jobs in queues that don't have pressure. That's about the extent of its smarts!
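To make the eligibility rule above concrete, here is a toy model in Python. It is not the plugin's code (the plugin is a jobtap plugin, not Python), and it rests on two assumptions of mine: that preemptible-after is a number of seconds a job must have been running before it may be preempted, and that "pressure" can be represented as a set of queue names with pending non-preemptible jobs. The `Job` class and `is_victim` function are hypothetical names for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Job:
    """Toy stand-in for a running job (hypothetical, not a flux-core type)."""
    jobid: int
    queue: str
    runtime: float                      # seconds since the job started running
    preemptible_after: Optional[float]  # None if the attribute was not set


def is_victim(job: Job, pressured_queues: Set[str]) -> bool:
    """Return True if the job may be preempted under the stated assumptions."""
    if job.preemptible_after is None:
        return False  # the job never opted in to preemption
    if job.queue not in pressured_queues:
        return False  # no non-preemptible demand in this job's queue
    # Assumed semantics: preemptible once it has run this many seconds.
    return job.runtime >= job.preemptible_after
```

In this model, a job in a queue without pressure is never a victim, no matter how long it has run.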
There's a comment at the top of the plugin that explains how it works, but the quick summary is that there are currently two "handlers" that are invoked periodically while there are potential victims and job pressure. If the overkill handler is selected, it kills all eligible victims in one go. If the onekill handler is selected, it kills one random victim, then waits a while and tries again if there is still pressure.

This is a WIP because we'll probably need another handler that's smart about node counts, and because testing doesn't cover queues yet. I was reaching a quitting point for the weekend and wanted to get this up.
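For readers skimming the two handlers described above, here is a rough sketch of the difference in Python. This is a toy simulation, not the plugin's implementation: the function names mirror the handler names, but `run_rounds`, the `kill` callback, and the `pressure` callback are hypothetical stand-ins, and the real plugin waits on a timer between invocations rather than looping.

```python
import random
from typing import Callable, List, Optional

Handler = Callable[[List[int], Callable[[int], None]], List[int]]


def overkill(victims: List[int], kill: Callable[[int], None]) -> List[int]:
    """Kill every eligible victim in one pass; no victims remain afterward."""
    for jobid in victims:
        kill(jobid)
    return []


def onekill(victims: List[int], kill: Callable[[int], None],
            rng: Optional[random.Random] = None) -> List[int]:
    """Kill one randomly chosen victim; return the victims that remain."""
    if not victims:
        return []
    if rng is None:
        rng = random.Random()
    chosen = rng.choice(victims)
    kill(chosen)
    return [jobid for jobid in victims if jobid != chosen]


def run_rounds(handler: Handler, victims: List[int],
               pressure: Callable[[], bool],
               kill: Callable[[int], None]) -> int:
    """Invoke the handler once per period while victims and pressure remain.

    Returns the number of periods used. The wait between periods is omitted.
    """
    rounds = 0
    while victims and pressure():
        victims = handler(victims, kill)
        rounds += 1
    return rounds
```

With four victims and pressure that clears after two kills, `onekill` takes two periods and leaves two jobs running, while `overkill` takes one period and kills all four. That gap is the motivation for a future handler that is smart about node counts.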