A serverless cluster manager built by the Systems Group at ETH Zürich
Dirigent is a lightweight cluster manager for FaaS that aims to solve the performance issues of existing FaaS platforms. It is a clean-slate system architecture for FaaS orchestration built on three key principles. First, Dirigent optimizes internal cluster manager abstractions to simplify state management. Second, it eliminates persistent state updates on the critical path of function invocations, leveraging the fact that FaaS abstracts sandboxes away from users to relax exact state reconstruction guarantees. Finally, Dirigent runs monolithic control and data planes to minimize internal communication overheads and maximize throughput. The architecture of Dirigent is shown in the figure below. Our performance study reveals that Dirigent reduces 99th percentile per-function scheduling latency for a production workload by 2.79x compared to AWS Lambda and can spin up 2500 sandboxes per second at low latency, 1250x more than Knative.
See the README.md to get started with the code.
The folder structure is as follows:
- api - proto files for Dirigent components
- artifact_evaluation - instructions and material for SOSP'24 artifact evaluation
- cmd - Dirigent components main methods
- configs - configuration files for external dependencies
- internal/master_node - control plane source code
- internal/data_plane - data plane source code
- internal/worker_node - worker node source code
- pkg - common packages of Dirigent components
- scripts - auxiliary scripts
- workload - workload we used for evaluation
You can download a copy of all the files by cloning the git repository:
git clone https://github.com/eth-easl/dirigent
To run the cluster manager locally, the following setting must be enabled:
sudo sysctl -w net.ipv4.conf.all.route_localnet=1
Install HAProxy
sudo apt update && sudo apt install -y haproxy
sudo cp configs/haproxy.cfg /etc/haproxy/haproxy.cfg
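Optionally, you can validate the copied configuration and restart HAProxy so it picks up the new file (standard HAProxy/systemd commands, not part of the original setup steps):
sudo haproxy -c -f /etc/haproxy/haproxy.cfg
sudo systemctl restart haproxy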
kubernetes-cni must be installed
curl -L -o cni-plugins.tgz https://github.com/containernetworking/plugins/releases/download/v0.8.1/cni-plugins-linux-amd64-v0.8.1.tgz
sudo mkdir -p /opt/cni/bin
sudo tar -C /opt/cni/bin -xzf cni-plugins.tgz
If you want to install the plugins to a custom path:
INSTALL_PATH='your/path/here'
curl -L -o cni-plugins.tgz https://github.com/containernetworking/plugins/releases/download/v0.8.1/cni-plugins-linux-amd64-v0.8.1.tgz
sudo mkdir -p "$INSTALL_PATH"
sudo tar -C "$INSTALL_PATH" -xzf cni-plugins.tgz
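As an optional sanity check, list the target directory to confirm the plugins were extracted:
ls "$INSTALL_PATH"   # or /opt/cni/bin for the default location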
Prepare a CloudLab cluster of at least 5 nodes. We tested our setup on xl170 and d430 nodes. Clone the repository locally, configure scripts/setup.cfg, and run the following script to deploy the cluster. The load generator will be deployed on node0, the control plane with Redis on node1, the data plane on node2, and the remaining nodes will be used as worker nodes.
./scripts/remote_install.sh user@node0 user@node1 user@node2 user@node ...
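For example, on a 5-node xl170 cluster the invocation could look as follows (the username and hostnames below are placeholders; substitute your CloudLab username and experiment hostnames):
./scripts/remote_install.sh [email protected] [email protected] [email protected] [email protected] [email protected]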
After this setup, run the following script to (re)start the cluster.
./scripts/remote_start_cluster.sh user@node0 user@node1 user@node2 user@node ...
We recommend using the Invitro load generator on the rps_mode branch for running experiments with the Dirigent cluster manager. The load generator is automatically cloned on node0 when you run scripts/remote_install.sh.
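If you need to set the load generator up by hand instead, a sketch along the following lines should work (the repository URL is our assumption; adjust it to your fork if needed):
git clone https://github.com/vhive-serverless/invitro.git
cd invitro
git checkout rps_mode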
Start a Redis DB instance
sudo docker-compose up
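If you prefer to keep Redis in the background, you can detach and then verify that the container is up (plain Docker commands, nothing Dirigent-specific):
sudo docker-compose up -d
sudo docker ps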
Start the master node
cd cmd/master_node; sudo /usr/local/go/bin/go run main.go --config cmd/config.yaml
Start the data plane
cd cmd/data_plane; go run main.go --config cmd/config.yaml
Start the worker node
cd cmd/worker_node; sudo /usr/local/go/bin/go run main.go --config cmd/config.yaml
In case you get a timeout, try running the following commands and then repeat the experiment.
# For local readiness probes
sudo sysctl -w net.ipv4.conf.all.route_localnet=1
# For reachability of sandboxes from other cluster nodes
sudo sysctl -w net.ipv4.ip_forward=1
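Both settings are lost on reboot. If you want them to persist (optional; not part of the original scripts, and the drop-in file name below is our choice), you can write them to a sysctl drop-in:
printf 'net.ipv4.conf.all.route_localnet=1\nnet.ipv4.ip_forward=1\n' | sudo tee /etc/sysctl.d/99-dirigent.conf
sudo sysctl --system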
- Install Firecracker
ARCH="$(uname -m)"
release_url="https://github.com/firecracker-microvm/firecracker/releases"
latest=$(basename $(curl -fsSLI -o /dev/null -w %{url_effective} ${release_url}/latest))
curl -L ${release_url}/download/${latest}/firecracker-${latest}-${ARCH}.tgz \
| tar -xz
sudo mv release-${latest}-$(uname -m) /usr/local/bin/firecracker
sudo mv /usr/local/bin/firecracker/firecracker-${latest}-${ARCH} /usr/local/bin/firecracker/firecracker
sudo sh -c "echo 'export PATH=\$PATH:/usr/local/bin/firecracker' >> /etc/profile"
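To confirm the binary is reachable, open a new shell (or source /etc/profile) and query the version:
source /etc/profile
firecracker --version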
- Install tun-tap
git clone https://github.com/awslabs/tc-redirect-tap.git || true
make -C tc-redirect-tap
sudo cp tc-redirect-tap/tc-redirect-tap /opt/cni/bin
- Install ARP
sudo apt-get update && sudo apt-get install net-tools
- Download Kernel
sudo apt-get update && sudo apt-get install git-lfs
git lfs fetch
git lfs checkout
git lfs pull
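If Git LFS has never been initialized for your user on this machine, it is worth running the one-time setup first (a general Git LFS step, not specific to Dirigent):
git lfs install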
- Run the control plane and data plane processes. Run the worker daemon with sudo and with the PATH environment variable extended to include the directory where Firecracker is located.
sudo env "PATH=$PATH:/usr/local/bin/firecracker" /usr/local/go/bin/go run cmd/worker_node/main.go
If networking misbehaves, you can flush stale NAT rules left over from previous runs:
sudo iptables -t nat -F
Distributed under the MIT License. See LICENSE for more information.
Lazar Cvetković - [email protected]
François Costa - [email protected]
Ana Klimovic - [email protected]
First, you have to install the protobuf compiler:
make install_golang_proto_compiler
Then you can compile the proto types using the following command:
make proto
First, you have to install the mockgen library:
make install_mockgen
Then you can generate the mock files with the following command:
make generate_mock_files
sudo go test -v ./...
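To test a single component, you can narrow the package pattern, e.g. only the control plane code (path taken from the folder structure listed above):
sudo go test -v ./internal/master_node/...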
golangci-lint run --fix
or with verbose output:
golangci-lint run -v --timeout 5m0s
@inproceedings{10.1145/3694715.3695966,
author = {Cvetkovi\'{c}, Lazar and Costa, Fran\c{c}ois and Djokic, Mihajlo and Friedman, Michal and Klimovic, Ana},
title = {Dirigent: Lightweight Serverless Orchestration},
year = {2024},
isbn = {9798400712517},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3694715.3695966},
doi = {10.1145/3694715.3695966},
booktitle = {Proceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles},
pages = {369–384},
numpages = {16},
location = {Austin, TX, USA},
series = {SOSP '24}
}