Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: DIA-953: Stream results from adala inference server into LSE #75

Merged
merged 21 commits into from
Apr 23, 2024

Conversation

matt-bernstein
Copy link
Contributor

@matt-bernstein matt-bernstein commented Mar 27, 2024

  • add Label Studio SDK as a dependency (should be an optional dependency for the server only, but there's no clean way to do this in poetry)
  • switch ResultHandlers from an enum whitelist of functions to a registry of Pydantic models, same implementation as Environment/Skill/Runtime
  • add LSEHandler that pushes predictions to an LS project using an LS SDK Client
  • change celery worker functions to use pickle serialization, removing manual step of de/serializing objects on each worker call from an API endpoint
  • change utils and handlers import paths to be usable from both main app and celery workers
  • make AsyncKafkaEnvironment instantiable as prep for post-streaming refactoring

@robot-ci-heartex
Copy link
Collaborator

@codecov-commenter
Copy link

codecov-commenter commented Apr 9, 2024

Codecov Report

Attention: Patch coverage is 5.74713% with 82 lines in your changes are missing coverage. Please review.

Project coverage is 46.28%. Comparing base (d668350) to head (64e174a).
Report is 12 commits behind head on master.

Files Patch % Lines
server/handlers/result_handlers.py 0.00% 39 Missing ⚠️
server/app.py 0.00% 19 Missing ⚠️
server/tasks/process_file.py 0.00% 18 Missing ⚠️
adala/environments/kafka.py 50.00% 5 Missing ⚠️
server/utils.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #75      +/-   ##
==========================================
- Coverage   50.28%   46.28%   -4.01%     
==========================================
  Files          37       39       +2     
  Lines        1587     1735     +148     
==========================================
+ Hits          798      803       +5     
- Misses        789      932     +143     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@robot-ci-heartex
Copy link
Collaborator

@matt-bernstein matt-bernstein force-pushed the fb-dia-953/stream-results branch from 5b4e508 to d31318c Compare April 10, 2024 14:50
@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex
Copy link
Collaborator

@matt-bernstein matt-bernstein marked this pull request as ready for review April 19, 2024 16:41
@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex robot-ci-heartex marked this pull request as draft April 20, 2024 07:03
@robot-ci-heartex
Copy link
Collaborator

@matt-bernstein matt-bernstein marked this pull request as ready for review April 22, 2024 18:14
@robot-ci-heartex
Copy link
Collaborator

Copy link
Contributor

@pakelley pakelley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got a little sidetracked, but posting what I have so far. Will circle back in a bit to finish up here! (sorry)

adala/environments/kafka.py Show resolved Hide resolved
pyproject.toml Outdated
@@ -35,6 +35,7 @@ fastapi = "^0.104.1"
celery = {version = "^5.3.6", extras = ["redis"]}
uvicorn = "*"
pydantic-settings = "^2.2.1"
label-studio-sdk = "^0.0.32"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did we land on making this an optional dependency?
I know poetry doesn't directly support this, but imo it's probably better to document that this dep is necessary if you want to use the LSE result handler (and then we can include it in our deployment, but not force users to have it to use adala)

Copy link
Contributor Author

@matt-bernstein matt-bernstein Apr 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from comment above:

# these are for the server
# they would be installed as `extras` if poetry supported version strings for extras, but it doesn't
# https://github.com/python-poetry/poetry/issues/834
# they also can't be installed as a `group`, because those are for dev dependencies only and could not be included if this package was pip-installed

Are we ok with everyone (including external users/contributors) needing to clone this repo in order to use the server in stead of pip installing adala? I could see it going either way.

Copy link
Contributor

@pakelley pakelley Apr 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that the server dependencies are questionable here too, and our deps are in a questionable state no matte what, but this one at least seems to be stemming from a different issue:

  • the server deps should only be required by users of the server. These be in their own section (or repo/package), and the fix for that would eventually be breaking the server out into it's own repo/package
  • the LSE SDK dep should be an optional dep even for users of the server, and even if/when we break that into its own repo, we'll still want it to be separate. (Granted, we'd still want to use extras from poetry, which we won't be able to. But since it's currently only one additional dep to install for the LSE handler, I think it's preferable to just document that you need to have label-studio-sdk installed to use the handler than it is to force anyone who wants the server to have label-studio-sdk installed)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see what you mean - can do something like put the label_studio_sdk import inside this class, catch and surface the error from it not being available, and update our deployment setup to manually add the dep.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, not sure what needs to be done to install adala like this in deployment env:
poetry install --with label-studio

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to tell plate about it probably as well update our own dockerfile

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added to dockerfile

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@farioas we're adding the label studio SDK as an optional dep to Adala, is there anything that needs to be done manually here for our dev instance of adala to keep working? The dockerfile is already updated in this branch.

server/app.py Outdated Show resolved Hide resolved
server/app.py Show resolved Hide resolved
server/app.py Show resolved Hide resolved
server/app.py Outdated Show resolved Hide resolved
server/app.py Show resolved Hide resolved
server/handlers/result_handlers.py Show resolved Hide resolved
server/handlers/result_handlers.py Show resolved Hide resolved
server/handlers/result_handlers.py Show resolved Hide resolved
server/handlers/result_handlers.py Show resolved Hide resolved
server/handlers/result_handlers.py Outdated Show resolved Hide resolved
server/tasks/process_file.py Outdated Show resolved Hide resolved
server/tasks/process_file.py Show resolved Hide resolved
server/tasks/process_file.py Outdated Show resolved Hide resolved
server/utils.py Outdated Show resolved Hide resolved
@robot-ci-heartex robot-ci-heartex marked this pull request as draft April 23, 2024 05:03
@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex robot-ci-heartex marked this pull request as ready for review April 23, 2024 15:01
@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex
Copy link
Collaborator

@robot-ci-heartex
Copy link
Collaborator

@matt-bernstein matt-bernstein merged commit 589f868 into master Apr 23, 2024
13 of 14 checks passed
@matt-bernstein matt-bernstein deleted the fb-dia-953/stream-results branch April 23, 2024 23:46
@robot-ci-heartex
Copy link
Collaborator

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants