OptiServe is a system for jointly optimizing cost, latency, and accuracy in serverless applications with machine learning workloads. It supports complex application workflows composed of multiple functions, each with different performance and accuracy characteristics, and finds configurations that satisfy application-level constraints.
Serverless computing simplifies deployment, but makes it harder to tune performance. OptiServe tackles this challenge by:
- Modeling latency and cost for both ML and non-ML functions.
- Capturing the impact of model accuracy on end-to-end workflow behavior.
- Solving tri-objective optimization problems using graph-based heuristics.
- Automatically identifying optimal memory and model choices for each function in a workflow.
Key features:
- Tri-objective optimization of serverless workflows (cost, latency, accuracy).
- Performance modeling through lightweight profiling.
- Search space reduction using critical paths and benefit-cost heuristics.
- Support for workflows with branching, parallelism, cycles, and self-loops.
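To make the optimization problem concrete, here is a minimal sketch of choosing one configuration per function so that the workflow meets a latency bound and an accuracy bound at the lowest cost. It is not OptiServe's algorithm: it uses plain exhaustive search instead of the critical-path and benefit-cost heuristics, handles only an acyclic workflow, and assumes that end-to-end accuracy is the product of per-function accuracies. The function names, candidate configurations, and all latency, cost, and accuracy numbers are hypothetical.

```python
from itertools import product

# Hypothetical candidates per function: (label, latency_s, cost_usd, accuracy).
# All values are made up for illustration; real values would come from profiling.
CANDIDATES = {
    "preprocess": [("512MB", 0.40, 0.000004, 1.00), ("1024MB", 0.25, 0.000005, 1.00)],
    "detect": [("small-model", 0.60, 0.000010, 0.85), ("large-model", 1.20, 0.000030, 0.95)],
    "classify": [("small-model", 0.30, 0.000006, 0.90), ("large-model", 0.70, 0.000015, 0.97)],
    "postprocess": [("512MB", 0.20, 0.000002, 1.00)],
}

# Workflow DAG: preprocess fans out to detect and classify, which join at postprocess.
EDGES = {
    "preprocess": ["detect", "classify"],
    "detect": ["postprocess"],
    "classify": ["postprocess"],
    "postprocess": [],
}
ORDER = ["preprocess", "detect", "classify", "postprocess"]  # a topological order


def critical_path_latency(latency):
    """Return the longest-path latency through the DAG (parallel branches overlap)."""
    finish = {}
    for node in ORDER:
        preds = [p for p in ORDER if node in EDGES[p]]
        start = max((finish[p] for p in preds), default=0.0)
        finish[node] = start + latency[node]
    return max(finish.values())


def best_configuration(max_latency, min_accuracy):
    """Exhaustively find the cheapest assignment that meets both constraints."""
    best = None
    for combo in product(*(CANDIDATES[f] for f in ORDER)):
        choice = dict(zip(ORDER, combo))
        latency = critical_path_latency({f: c[1] for f, c in choice.items()})
        cost = sum(c[2] for c in choice.values())
        accuracy = 1.0
        for c in choice.values():
            accuracy *= c[3]  # assumption: accuracies compose multiplicatively
        if latency <= max_latency and accuracy >= min_accuracy:
            if best is None or cost < best["cost"]:
                best = {
                    "config": {f: c[0] for f, c in choice.items()},
                    "latency": latency,
                    "cost": cost,
                    "accuracy": accuracy,
                }
    return best  # None when no assignment satisfies the constraints


if __name__ == "__main__":
    print(best_configuration(max_latency=1.5, min_accuracy=0.8))
```

With these made-up numbers, a 1.5 s latency bound rules out the large detection model, so the search keeps the small detector and spends on the large classifier to clear the 0.8 accuracy bound. Exhaustive search grows exponentially with the number of functions, which is why search space reduction matters for larger workflows.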
Create a .env file in the root directory based on the .env.example template and enter your AWS credentials:
cp .env.example .env
vi .env
We used Python 3.11.13 to develop and test OptiServe. You can install the dependencies using either conda or pip. Make sure you're using Python 3.11 if installing manually with pip.
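If you install with pip, a quick check like the following can confirm that the active interpreter is a 3.11 release before you install anything. This is a small illustrative helper, not part of OptiServe; the function name `is_supported_python` is hypothetical.

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True for any Python 3.11.x interpreter."""
    return tuple(version_info[:2]) == (3, 11)


if __name__ == "__main__":
    if is_supported_python():
        print("Python 3.11 detected.")
    else:
        # Only warn; other versions are untested rather than known to fail.
        print(f"Warning: running Python {sys.version.split()[0]}, expected 3.11.x.")
```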
- Clone the project and move into the root directory:
git clone https://github.com/pacslab/optiserve.git
cd optiserve
- Install dependencies:
Option A: Using Conda
conda env create -f environment.yml
conda activate optiserve
Option B: Using pip
python -m pip install -r requirements.txt
To see how OptiServe works and how to apply it to your own workflows, please check the Jupyter notebooks in the experiments directory.
