Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve global concurrency limit management for deployment concurrency limiting #15375

Open
collincchoy opened this issue Sep 13, 2024 · 3 comments · May be fixed by #15426
Open

Improve global concurrency limit management for deployment concurrency limiting #15375

collincchoy opened this issue Sep 13, 2024 · 3 comments · May be fixed by #15426
Assignees
Labels
enhancement An improvement of an existing feature

Comments

@collincchoy
Copy link
Contributor

collincchoy commented Sep 13, 2024

Describe the current behavior

Deployments have a field called concurrency_limit which maps to an int column in the database on the deployment table.

This limit is used to define and maintain an underlying Global Concurrency Limit (concurrency_limit_v2) object. The integer value maps directly to the underlying GCL's limit field and column in the database.

Having this limit in 2 different places presents a data integrity issue whereby we need to maintain bidirectional syncing of these two values so that updates to a deployment.concurrency_limit propagate back to concurrency_limit_v2.limit and vice versa. This is non-trivial and invasive across multiple APIs to accomplish and a shortcut was taken to address this problem - namely upserting the global concurrency limit object at runtime in worker/runners. This solves the provisioning and syncing problem by lazily syncing the deployment's value onto the GCL's at worker/runner pull time.

However this approach was a shortcut taken to unblock a pending release and has a few issues still:

  • in the other direction. If a user edits the underlying GCL directly, then their edit will be overwritten on the next worker/runner upsert which prioritizes the deployment.concurrency_limit value as the source of truth.
    • A prior alternative to solve this problem included making this special GCL objects read-only or not user-facing/editable, but we've decided against this as it adds additional complexity to the system to have a special type of GCL's in this way.
  • this causes the GCL to be lazily provisioned rather than created on deployment create/update which is likely less-intuitive and unexpected behavior

Describe the proposed behavior

DB changes:

Normalize the data model so that there aren't multiple sources of truth to keep in sync.
deployment.concurrency_limit: int -> deployment.concurrency_limit: FK-to-concurrency_limit_v2.id

For migration purposes, we'll introduce a new column called deployment.concurrency_limit_id which is a foreign key to concurrency_limit_v2.id. The old int column may remain for a period before then being removed.

Now modifying a deployment.concurrency_limit will behind-the-scenes modify the related concurrency_limit_v2.limit value in the database.

API changes:

  • Create (upsert) deployments: POST /deployments
    • if a deployment is created with a concurrency_limit value, then create a concurrency_limit_v2(name="deployment:{deployment_id}", limit=concurrency_limit, active=True)
    • Note: this route handles updating pre-existing by name/flow_id so need to handle updates as well
      • if a deployment is updated with a concurrency_limit value that is different, then handle updating the underlying concurrency_limit_v2 object's limit value
      • if a deployment is updated from having a concurrency_limit value to not having one, then delete the underlying concurrency_limit_v2 object.
  • Update deployments: PATCH /deployments/{id}

    • if a deployment is updated with a concurrency_limit value that is different, then handle updating the underlying concurrency_limit_v2 object's limit value
    • if a deployment is updated from having a concurrency_limit value to not having one, then delete the underlying concurrency_limit_v2 object.
  • Delete deployment: DELETE /deployments/{id}

    • if deployment.concurrency_limit_id is not None: delete_concurrency_limit_v2(deployment.concurrency_limit_id)

We should additionally consider nesting the related concurrency object into deployment api schemas:

  • Create/Update requests with a top-level concurrency_limit: int value vs. concurrency_limit: { limit: int }
    • this could be used to later extend to intake a name: str via concurrency_limit: { limit: int, name: str } rather than expanding the top-level with another concurrency_limit_name field
    • should api clients be able to edit the underlying concurrency object directly via the deployments apis?
  • DeploymentResponse with a nested concurrency_limit object vs. concurrency_limit_id
    • do we want to eager load these for responses or have clients handle fetching the related limit object
    • consider both the UI and the worker/runners.
      • the worker/runners don't need any info about the underlying limit at least atm since the limit name will be deterministic. Since users own the underlying GCL and could freely change its name, the name is no longer deterministic. Also if/once users could provide their own limit names, then the worker/runners would have to load those.
      • the UI could load and manage the GCL separately via concurrency_limit_v2 APIs
    • Should consider deduplicating N+1 queries when loading multiple deployments

Note

ensure that on deletion, concurrency_limit_v2 objects associated with deployments get their back refs updated correctly

Example Use

No response

Additional context

FAQ:

Should we extend support for user-provided limits/limit-names? There's a compelling use-case of being able to share a concurrency limit object across deployments (and/or other workflows as GCL's can be used outside of flows)

We may build this later. The current proposal should be extensible for this as the APIs would just need to shift to take in a str name as well and then the underlying data model doesn't need any changes, but rather the logic to create GCL's shifts to just reuse an existing one if one exists matching the provided name

Should we build to support having a deployment utilize multiple concurrency limit objects? The concurrency helper allows providing multiple limit names for use in acquiring slots. We could extend the current deployment concurrency limit interfaces to handle this too. Possibly worth considering here as it would affect the data model toward a many-to-many from a one-to-many.

I think we should punt on this until there's a stronger use-case/desire for this as it introduces additional complexity in various places and risks getting in the way of other work like #15340.

@collincchoy collincchoy added the enhancement An improvement of an existing feature label Sep 13, 2024
@collincchoy collincchoy self-assigned this Sep 13, 2024
@abrookins
Copy link
Collaborator

abrookins commented Sep 13, 2024

Ok, I think we're mostly on the same page. I have a couple of notes.

First, everything about the FK and create/update endpoints looks right to me.

Second, we'll need to get more input on this from @cicdw in particular, but the constraint I want to make is that the UX that users interact with to create and edit deployments -- across the SDK, UI, CLI, etc. -- is the version we designed that starts simple and expands from there as the user needs.

So it's something like:

    concurrency_limit: Union[int, ConcurrencyOptions]

Where ConcurrencyOptions is the expanded options object

class ConcurrencyOptions:
    # At a minimum: an int. However, we could expand this later
    # to support the name of an existing GCL or a full GCL spec.
    limit: Union[int, str, ConcurrencyLimitV2]

    # How to handle runs blocked by a limit
    collision_strategy: Literal["enqueue", "cancel-new"]

    # ... other options, etc.

Then the API response, which is really intended for our SDK and UI to use, includes the details we need during operations, so the full embedded GCL:

{
    # ... other deployment fields
    concurrency_limit: Optional[ConcurrencyLimitV2]  # embed GCL
    concurrency_options: Optional[ConcurrencyOptions]
}

So user-facing, we use the "simple" interface, but we always return the GCL embedded in API responses. The worker, runner, etc. will need the name of the limit, which, now that we're making its GCL visible to the user, can be renamed. So we need the name from the GCL record.

For your FAQ, I agree with both points.

@collincchoy
Copy link
Contributor Author

now that we're making its GCL visible to the user, can be renamed.

Right, right.

the worker/runners don't need any info about the underlying limit at least atm since the limit name will be deterministic.

that I put above is incorrect because to your point folks could rename them. Editing that out which adds to the potential desire to nest these objects into at least the response.

So DeploymentResponse will be expanded as:

{
    # ... other deployment fields
    concurrency_limit: Optional[ConcurrencyLimitV2]  # embed GCL rather than only the limit int
    concurrency_options: Optional[ConcurrencyOptions]
}

And for now since we're not yet supporting user-supplied GCL's, the request schemas can remain as just:

concurrency_limit: Optional[int]
concurrency_options: Optional[ConcurrencyOptions]

Later we can figure out adjusting requests to support str limit names and/or uuid limit ids. 👍

@collincchoy
Copy link
Contributor Author

collincchoy commented Sep 18, 2024

Update:

The DeploymentResponse should preserve the concurrency_limit: Optional[int] field for backwards-compatibility - particular for workers that previously/currently read that and expect it as an int.

Thus the response should be updated as:

{
    # ... other deployment fields
    concurrency_limit: Optional[int]  
    global_concurrency_limit: Optional[ConcurrencyLimitV2] # embed GCL rather than only the limit int
    concurrency_options: Optional[ConcurrencyOptions]
}

However for the smoothest compatibility story going forward, the concurrency_limit will always return None. This is to maintain compatibility without also having adverse effects in certain edge cases. For example, prior to these outlined changes, a user could update a backing GCL's name which would result in the worker/runner creating a new GCL to use at runtime. Rather than support this behavior, affected worker versions will see concurrency_limit: None effectively nullifying deployment-based concurrency limiting. This should only be 1 affected version and users can upgrade to utilize the better proposal outlined here.

@collincchoy collincchoy linked a pull request Sep 18, 2024 that will close this issue
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement An improvement of an existing feature
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants