Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to wait for an environment #1957

Open
wants to merge 80 commits into
base: main
Choose a base branch
from
Open

Add ability to wait for an environment #1957

wants to merge 80 commits into from

Conversation

8W9aG
Copy link
Contributor

@8W9aG 8W9aG commented Sep 16, 2024

  • Add a wait.py containing utilities that allow for waiting for specific files to appear, and for loading imports. These are defined by the environment variables COG_WAIT_FILE and COG_EAGER_IMPORTS (CSV). If neither of these environment variables exist this is a no-op.
  • Separate out BasePredictor and BaseInput into their own class files now that they are accessed by multiple consumers (config and predictor).
  • Create a Config class which is an abstraction around cog.yaml allowing for environment variables to be used as a substitute for certain attributes of the cog.yaml, in theory allowing us to load the cog environment without relying on this file provided we have set the right environment variables.
  • Change create_app to take a Config class rather that a dictionary.
  • Add an env_property function decorator for reading from an environment variable before dropping into its wrapped function. This prevents the need for loading cog.yaml if the property is guarded by this.
  • Add a Mode enumeration that delineates between the different modes of predict and train in a type safe way.
  • Wait for the environment to setup right before the worker process calls setup.
  • Change the test stubs to use the Config class while injecting in some mock values if applicable.
  • Add some tests for code_xform to make sense of its inputs and outputs.
  • Increase test timeout to 20s, this is due to the wait tests using up the previous time allotment of 10s.

This allows cog to begin running and do as much work as possible with as little information as possible while it waits for a file system to appear around it. Currently cog requires cog.yaml and the prediction/training python to appear before setting up the HTTP client, in this case we can setup the HTTP client while we load these files in the background and get a signal that the file loading is finished and continue loading the predictors setup, in addition to this we perform any pre-emptive work we can while we are waiting for these files to load (such as eagerly importing modules).

* This allows us not to read the config for the
app threads until absolutely necessary.
* Delay loading the cog yaml until it is read from
in the code.
* Allow environment variables to control whether
we wait for a file to appear before further
processing
* This allows the python interpreter to boot up
while we wait for other files to become available
* Allow waiting for a general environment by
importing select python packages while the system
boots up around cog.
* I shouldn’t need to do this but want to see if
it relieves the errors on linux.
* A small test to make sense of what the code
stripper is doing.
* Create a class for accessing cog config
* Only access variables from the config by
properties
* Gate those properties with environment variable
function decorators to allow fetching the config
from the environment rather than a file.
* This allows the environment to begin running
without needing the /src
* While waiting for the environment to boot, load
The designated imports to speed up interpreter
Time.
 * In strip model source code we use an AST
function that isn’t available prior to 3.9
* We aren’t waiting for these imports, we are
loading them eagerly while we wait.
* We have so many that we need to increase this.
* Fix an issue with env_property where it could
not handle Optional or Union
* Create these functions because they don’t exist
in python 3.7
* Do not wait for a signal, just use a while loop
with a 10 ms interval to check for presence of
file.
@8W9aG 8W9aG mentioned this pull request Oct 15, 2024
* Ensure that train/predictor stripped code is
kept separate in the event a user uses separate
files to refer to these different modes.
* Old ones used to be many overloaded concepts.
* We can’t just update the system path, we also
need to change the PYTHONPATH environment variable
* This allows forked processes to see this
environment variable and act accordingly in user
code.
* Some tests previously checked the debug logging
* Make sure we conform to the same debug logging
* Confirm that this behaviour is consistent with
functions.
* Allows better debugging of stderr et al
* Currently predict assumes that PredictionResponse
is the serialisation target.
* This isn’t the case if we are calling _predict
from a training endpoint
* Allow response_type to be fed into the _predict
method to inform it of what kind of response it
should expect.
* Use the proper endpoints to call training
functions.
* Log the command properly to the user.
@8W9aG 8W9aG mentioned this pull request Oct 23, 2024
@8W9aG 8W9aG force-pushed the add-waiting-env branch 2 times, most recently from 5cf3ace to 2840fc7 Compare November 1, 2024 19:24
@8W9aG 8W9aG mentioned this pull request Nov 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants