-
Notifications
You must be signed in to change notification settings - Fork 1.6k
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve global concurrency limit management for deployment concurrency limiting #15375
Comments
Ok, I think we're mostly on the same page. I have a couple of notes. First, everything about the FK and create/update endpoints looks right to me. Second, we'll need to get more input on this from @cicdw in particular, but the constraint I want to make is that the UX that users interact with to create and edit deployments -- across the SDK, UI, CLI, etc. -- is the version we designed that starts simple and expands from there as the user needs. So it's something like: concurrency_limit: Union[int, ConcurrencyOptions] Where class ConcurrencyOptions:
# At a minimum: an int. However, we could expand this later
# to support the name of an existing GCL or a full GCL spec.
limit: Union[int, str, ConcurrencyLimitV2]
# How to handle runs blocked by a limit
collision_strategy: Literal["enqueue", "cancel-new"]
# ... other options, etc. Then the API response, which is really intended for our SDK and UI to use, includes the details we need during operations, so the full embedded GCL: {
# ... other deployment fields
concurrency_limit: Optional[ConcurrencyLimitV2] # embed GCL
concurrency_options: Optional[ConcurrencyOptions]
} So user-facing, we use the "simple" interface, but we always return the GCL embedded in API responses. The worker, runner, etc. will need the name of the limit, which, now that we're making its GCL visible to the user, can be renamed. So we need the name from the GCL record. For your FAQ, I agree with both points. |
Right, right.
that I put above is incorrect because to your point folks could rename them. Editing that out which adds to the potential desire to nest these objects into at least the response. So DeploymentResponse will be expanded as: {
# ... other deployment fields
concurrency_limit: Optional[ConcurrencyLimitV2] # embed GCL rather than only the limit int
concurrency_options: Optional[ConcurrencyOptions]
} And for now since we're not yet supporting user-supplied GCL's, the request schemas can remain as just: concurrency_limit: Optional[int]
concurrency_options: Optional[ConcurrencyOptions] Later we can figure out adjusting requests to support |
Update: The Thus the response should be updated as: {
# ... other deployment fields
concurrency_limit: Optional[int]
global_concurrency_limit: Optional[ConcurrencyLimitV2] # embed GCL rather than only the limit int
concurrency_options: Optional[ConcurrencyOptions]
} However for the smoothest compatibility story going forward, the |
Describe the current behavior
Deployments have a field called
concurrency_limit
which maps to an int column in the database on the deployment table.This limit is used to define and maintain an underlying Global Concurrency Limit (concurrency_limit_v2) object. The integer value maps directly to the underlying GCL's limit field and column in the database.
Having this
limit
in 2 different places presents a data integrity issue whereby we need to maintain bidirectional syncing of these two values so that updates to a deployment.concurrency_limit propagate back to concurrency_limit_v2.limit and vice versa. This is non-trivial and invasive across multiple APIs to accomplish and a shortcut was taken to address this problem - namely upserting the global concurrency limit object at runtime in worker/runners. This solves the provisioning and syncing problem by lazily syncing the deployment's value onto the GCL's at worker/runner pull time.However this approach was a shortcut taken to unblock a pending release and has a few issues still:
Describe the proposed behavior
DB changes:
Normalize the data model so that there aren't multiple sources of truth to keep in sync.
deployment.concurrency_limit: int
->deployment.concurrency_limit: FK-to-concurrency_limit_v2.id
For migration purposes, we'll introduce a new column called deployment.concurrency_limit_id which is a foreign key to concurrency_limit_v2.id. The old int column may remain for a period before then being removed.
Now modifying a deployment.concurrency_limit will behind-the-scenes modify the related concurrency_limit_v2.limit value in the database.
API changes:
Update deployments: PATCH /deployments/{id}
Delete deployment: DELETE /deployments/{id}
We should additionally consider nesting the related concurrency object into deployment api schemas:
concurrency_limit: int
value vs.concurrency_limit: { limit: int }
name: str
viaconcurrency_limit: { limit: int, name: str }
rather than expanding the top-level with anotherconcurrency_limit_name
fieldconcurrency_limit
object vs.concurrency_limit_id
the worker/runners don't need any info about the underlying limit at least atm since the limit name will be deterministic.Since users own the underlying GCL and could freely change its name, the name is no longer deterministic. Also if/once users could provide their own limit names, then the worker/runners would have to load those.Note
ensure that on deletion, concurrency_limit_v2 objects associated with deployments get their back refs updated correctly
Example Use
No response
Additional context
FAQ:
Should we extend support for user-provided limits/limit-names? There's a compelling use-case of being able to share a concurrency limit object across deployments (and/or other workflows as GCL's can be used outside of flows)
We may build this later. The current proposal should be extensible for this as the APIs would just need to shift to take in a
str
name as well and then the underlying data model doesn't need any changes, but rather the logic to create GCL's shifts to just reuse an existing one if one exists matching the provided nameShould we build to support having a deployment utilize multiple concurrency limit objects? The
concurrency
helper allows providing multiple limit names for use in acquiring slots. We could extend the current deployment concurrency limit interfaces to handle this too. Possibly worth considering here as it would affect the data model toward a many-to-many from a one-to-many.I think we should punt on this until there's a stronger use-case/desire for this as it introduces additional complexity in various places and risks getting in the way of other work like #15340.
The text was updated successfully, but these errors were encountered: