Batched job processing (opt-in) #474
Open
benjie wants to merge 149 commits into main from pool-centric
+2,813 −524
6 tasks
Things to check that I actually implemented (from my notes):
Also:
…ntext and share relevant types
… returning jobs fails)
…led a second time (e.g. from forcefulShutdown)
…when called a second time
Description
Replaces #99 and #470.
If you're at very high scale (e.g. you're running multiple Worker instances, and each instance has high concurrency) then the act of looking for and releasing jobs can start to dominate the load on the database. This PR makes it possible to configure Graphile Worker such that getJob (via localQueueSize), completeJob (via completeJobBatchDelay) and failJob (via failJobBatchDelay) can be batched, thereby reducing this database load (and improving job throughput). This is an opt-in feature, via the following settings:
- localQueueSize >= 1: pools become responsible for getting jobs; they grab the number of jobs that you specify up front and distribute these to workers on demand. This is done via a "Local Queue".
- completeJobBatchDelay >= 0 or failJobBatchDelay >= 0: pools also become responsible for completing or failing jobs (respectively); they wait the specified number of milliseconds after a completeJob or failJob call and batch any other calls made in the interim. All of these results are sent to the database at the same time, reducing the total number of transactions.
Note that enabling these features changes the behavior of Worker in a few ways:
Performance impact
If not enabled, impact is minimal.
If enabled, throughput improves at the cost of potentially increased latency.
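To make the throughput side of that trade-off concrete, here is a minimal sketch of the "Local Queue" idea from the description: one round trip fetches several jobs, and workers are then served from memory. It is an illustration only, not Graphile Worker's implementation; `LocalQueueSketch`, `FetchJobs` and `Job` are hypothetical names, and `fetchJobs` stands in for whatever query actually claims jobs from the database.

```typescript
// Illustrative sketch only; not Graphile Worker's actual code.
type Job = { id: number };

// Hypothetical stand-in for one database round trip that claims up to `limit` jobs.
type FetchJobs = (limit: number) => Promise<Job[]>;

class LocalQueueSketch {
  private buffer: Job[] = [];
  private refill: Promise<void> | null = null;
  private readonly fetchJobs: FetchJobs;
  private readonly size: number; // plays the role of localQueueSize

  constructor(fetchJobs: FetchJobs, size: number) {
    this.fetchJobs = fetchJobs;
    this.size = size;
  }

  // Called by a worker whenever it is ready for more work.
  async getJob(): Promise<Job | undefined> {
    if (this.buffer.length === 0) {
      // Concurrent callers share a single in-flight refill rather than
      // each issuing their own query.
      if (this.refill === null) {
        const clear = () => {
          this.refill = null;
        };
        this.refill = this.fetchJobs(this.size).then(
          (jobs) => {
            this.buffer.push(...jobs);
            clear();
          },
          (err) => {
            clear();
            throw err;
          },
        );
      }
      await this.refill;
    }
    // May be undefined if the refill returned fewer jobs than there were callers.
    return this.buffer.shift();
  }
}
```

With a size of 3, six sequential `getJob()` calls need only two fetches instead of six. The latency cost is that prefetched jobs sit in memory until a worker asks for them.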
The following results were produced with the following setup:
performance governor
Base performance:
Jobs per second: 16093.94
With localQueueSize: 500:
Jobs per second: 35177.47
Performance with localQueueSize: 500, completeJobBatchDelay: 0, failJobBatchDelay: 0 (note: even though the numbers are 0 this still enables batching, it is just limited to (roughly) a single JS event loop tick):
Jobs per second: 180684.70
You should note that the workload benchmarked here is a workload designed to put maximal stress on the database (i.e. the tasks are basically no-ops); YMMV with real-world loads.
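The note about a delay of 0 still enabling batching can be illustrated with a small timer-based batcher. This is a sketch under stated assumptions, not the PR's code: `BatcherSketch` and `flushBatch` are hypothetical names, and `flushBatch` stands in for the single database call that would complete or fail many jobs at once.

```typescript
// Illustrative sketch only; not Graphile Worker's actual code.
class BatcherSketch<T> {
  private pending: T[] = [];
  private timer: ReturnType<typeof setTimeout> | null = null;
  private readonly delayMs: number; // plays the role of completeJobBatchDelay / failJobBatchDelay
  private readonly flushBatch: (items: T[]) => void;

  constructor(delayMs: number, flushBatch: (items: T[]) => void) {
    this.delayMs = delayMs;
    this.flushBatch = flushBatch;
  }

  // Queue an item; the first item of a batch starts the timer.
  add(item: T): void {
    this.pending.push(item);
    if (this.timer === null) {
      this.timer = setTimeout(() => this.flush(), this.delayMs);
    }
  }

  // Hand everything collected so far to flushBatch in one call.
  private flush(): void {
    const items = this.pending;
    this.pending = [];
    this.timer = null;
    this.flushBatch(items);
  }
}
```

Even with `delayMs` set to 0, the timer callback cannot run until the currently executing synchronous code has finished, so every `add()` made before then lands in the same batch. A larger delay widens that window, which means fewer flushes but longer waits before each result is written.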
The CPUs were configured with this script:
Security impact
Not known.
Checklist
yarn lint:fix passes.
yarn test passes.
RELEASE_NOTES.md file (if one exists).
If this is a breaking change I've explained why.