A dependency-aware Docker container orchestrator for Saltbox, written in Go.
Saltbox Docker Controller manages Docker container startup and shutdown based on dependency labels. It ensures containers start in the correct order, waits for health checks, and handles graceful shutdown in reverse dependency order.
Features:
- Dependency-aware orchestration using Docker labels
- Topological sort for optimal startup/shutdown order
- Parallel execution of independent containers
- Health check polling (60 second timeout per container)
- Job-based API for async operations
- Block/unblock operations for maintenance windows
- REST API server mode
- Helper mode for Docker daemon lifecycle integration
- Graceful shutdown handling
- Comprehensive test suite (80 tests)
Build using the Makefile:
    make build
The binary will be created at build/sdc.
For all available targets:
    make help
Run all tests:
    make test
Run tests with coverage:
    make test-coverage
Start the REST API server:
    ./build/sdc server --host 127.0.0.1 --port 3377
Or use the Makefile:
    make run-server
Run the helper daemon for automatic lifecycle management:
    ./build/sdc helper --controller-url http://127.0.0.1:3377
Or use the Makefile:
    make run-helper
Check the version:
    ./build/sdc --version
Project structure:
sdc/
├── cmd/controller/ # Main entry point (server/helper commands)
├── internal/
│ ├── api/ # HTTP handlers, middleware, and router
│ ├── client/ # HTTP client for helper mode
│ ├── config/ # Configuration management
│ ├── docker/ # Docker client wrapper and label parsing
│ ├── graph/ # Dependency graph and topological sort
│ ├── jobs/ # Job manager with worker pool
│ └── orchestrator/ # Container orchestration engine
└── pkg/logger/ # Structured logging (Zap)
- Label-based dependencies: Containers declare dependencies via com.github.saltbox.depends_on labels
- Graph building: Saltbox Docker Controller builds a dependency graph from all running containers
- Topological sort: Determines optimal startup/shutdown order
- Batch execution: Independent containers in each batch start/stop in parallel
- Health checking: Polls container health status before proceeding to dependents
- Job tracking: All operations are tracked as jobs with UUID and status
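The graph-building, topological-sort, and batch-execution steps above can be sketched with Kahn's algorithm. This is a minimal illustration, not the controller's actual code: the function name `startupBatches` and the plain `map[string][]string` input are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// startupBatches groups containers into batches using Kahn's algorithm.
// deps maps a container name to the names it depends on.
// Every container in a batch depends only on containers in earlier batches,
// so the members of one batch can be started in parallel.
// (Hypothetical helper; the real graph package may be structured differently.)
func startupBatches(deps map[string][]string) ([][]string, error) {
	indegree := make(map[string]int)        // unmet dependencies per container
	dependents := make(map[string][]string) // reverse edges: dependency -> dependents
	for name, ds := range deps {
		if _, ok := indegree[name]; !ok {
			indegree[name] = 0
		}
		for _, d := range ds {
			if _, ok := indegree[d]; !ok {
				indegree[d] = 0
			}
			indegree[name]++
			dependents[d] = append(dependents[d], name)
		}
	}

	// Containers without dependencies form the first batch.
	var ready []string
	for name, n := range indegree {
		if n == 0 {
			ready = append(ready, name)
		}
	}

	var batches [][]string
	placed := 0
	for len(ready) > 0 {
		sort.Strings(ready) // deterministic order within a batch
		batches = append(batches, ready)
		placed += len(ready)
		var next []string
		for _, name := range ready {
			for _, dep := range dependents[name] {
				indegree[dep]--
				if indegree[dep] == 0 {
					next = append(next, dep)
				}
			}
		}
		ready = next
	}
	if placed != len(indegree) {
		return nil, errors.New("dependency cycle detected")
	}
	return batches, nil
}

func main() {
	batches, err := startupBatches(map[string][]string{
		"postgres": nil,
		"redis":    nil,
		"app":      {"postgres", "redis"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(batches) // prints [[postgres redis] [app]]
}
```

Iterating over the returned batches in reverse gives a shutdown order in which dependents stop before the containers they depend on.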
Saltbox Docker Controller uses Saltbox Docker labels to define container management and dependencies:
labels:
  com.github.saltbox.saltbox_managed: "true"          # Required: Enable SDC management
  com.github.saltbox.saltbox_controller: "true"       # Optional: Enable/disable controller (default: true)
  com.github.saltbox.depends_on: "postgres,redis"     # Optional: Comma-separated dependencies
  com.github.saltbox.depends_on.delay: "5"            # Optional: Startup delay in seconds
  com.github.saltbox.depends_on.healthchecks: "true"  # Optional: Wait for healthchecks (default: false)
Example docker-compose.yml:
services:
  postgres:
    image: postgres:15
    labels:
      com.github.saltbox.saltbox_managed: "true"
      com.github.saltbox.depends_on.delay: "10"
  redis:
    image: redis:7
    labels:
      com.github.saltbox.saltbox_managed: "true"
      com.github.saltbox.depends_on.delay: "5"
  app:
    image: myapp:latest
    labels:
      com.github.saltbox.saltbox_managed: "true"
      com.github.saltbox.depends_on: "postgres,redis"
      com.github.saltbox.depends_on.delay: "2"
      com.github.saltbox.depends_on.healthchecks: "true"
Saltbox Docker Controller will ensure postgres and redis start first (in parallel), wait for health checks and startup delays, then start app.
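As an illustration of how the labels above might be turned into a typed configuration, here is a minimal sketch. The `containerConfig` struct, its field names, and the `parseLabels` function are hypothetical. How the real controller treats malformed values or different capitalizations of "true" is not documented here, so the strict handling below is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

const labelPrefix = "com.github.saltbox."

// containerConfig is a hypothetical struct; the field names are illustrative.
type containerConfig struct {
	Managed      bool          // saltbox_managed
	Controller   bool          // saltbox_controller (documented default: true)
	DependsOn    []string      // depends_on, split on commas
	Delay        time.Duration // depends_on.delay, given in seconds
	Healthchecks bool          // depends_on.healthchecks (documented default: false)
}

// parseLabels reads the documented labels from a container's label map.
// Assumption: only the exact string "true" enables a boolean label.
func parseLabels(labels map[string]string) (containerConfig, error) {
	cfg := containerConfig{Controller: true}
	cfg.Managed = labels[labelPrefix+"saltbox_managed"] == "true"
	if v, ok := labels[labelPrefix+"saltbox_controller"]; ok {
		cfg.Controller = v == "true"
	}
	cfg.Healthchecks = labels[labelPrefix+"depends_on.healthchecks"] == "true"

	// Split the comma-separated dependency list, ignoring empty entries.
	for _, name := range strings.Split(labels[labelPrefix+"depends_on"], ",") {
		if name = strings.TrimSpace(name); name != "" {
			cfg.DependsOn = append(cfg.DependsOn, name)
		}
	}

	if v, ok := labels[labelPrefix+"depends_on.delay"]; ok {
		secs, err := strconv.Atoi(strings.TrimSpace(v))
		if err != nil || secs < 0 {
			return cfg, fmt.Errorf("invalid depends_on.delay %q", v)
		}
		cfg.Delay = time.Duration(secs) * time.Second
	}
	return cfg, nil
}

func main() {
	cfg, err := parseLabels(map[string]string{
		labelPrefix + "saltbox_managed":         "true",
		labelPrefix + "depends_on":              "postgres,redis",
		labelPrefix + "depends_on.delay":        "2",
		labelPrefix + "depends_on.healthchecks": "true",
	})
	fmt.Printf("%+v %v\n", cfg, err)
}
```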
- POST /start - Start containers in dependency order
  - Query params: ?timeout=600 (optional, default: 600)
  - Response: {"job_id": "uuid"}
  - Returns HTTP 503 if operations are blocked
- POST /stop - Stop containers in reverse dependency order
  - Query params: ?timeout=300&ignore=container1&ignore=container2 (optional, default timeout: 300)
  - Response: {"job_id": "uuid"}
  - Returns HTTP 503 if operations are blocked
- POST /block/{duration} - Block start/stop operations temporarily
  - duration parameter in minutes (default: 10)
  - Response: {"message": "Operations are now blocked for N minutes"}
  - Auto-unblocks after the specified duration
- POST /unblock - Manually unblock operations
  - Response: {"message": "Operations are now unblocked"}
- GET /job_status/{job_id} - Get job details and status
  - Response: Full job object with status, results, and timing information
  - Returns {"status": "not_found"} with HTTP 404 if job doesn't exist
- GET /ping - Health check endpoint
  - Response: {"status": "healthy"}
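A client for the job endpoints could look roughly like the sketch below. It builds a /stop URL with the documented query parameters, submits the job, and maps HTTP 503 to a "blocked" error. The names `stopURL`, `submitJob`, and `errBlocked` are hypothetical, and the stub server in `main` only stands in for a real controller so the example runs on its own. Since the success status code is not documented above, any 2xx response is accepted.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strconv"
)

// errBlocked signals the documented HTTP 503 "operations are blocked" case.
var errBlocked = errors.New("operations are blocked")

// stopURL builds the POST /stop URL with the documented query parameters.
func stopURL(base string, timeoutSecs int, ignore []string) string {
	q := url.Values{}
	q.Set("timeout", strconv.Itoa(timeoutSecs))
	for _, name := range ignore {
		q.Add("ignore", name) // repeated once per ignored container
	}
	return base + "/stop?" + q.Encode()
}

// submitJob POSTs to a start or stop URL and returns the job ID.
func submitJob(client *http.Client, target string) (string, error) {
	resp, err := client.Post(target, "application/json", nil)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	if resp.StatusCode == http.StatusServiceUnavailable {
		return "", errBlocked
	}
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return "", fmt.Errorf("unexpected status %s", resp.Status)
	}
	var body struct {
		JobID string `json:"job_id"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return "", err
	}
	return body.JobID, nil
}

func main() {
	// Stub standing in for the controller so the sketch runs without one.
	stub := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"job_id": "example-job-id"}`)
	}))
	defer stub.Close()

	id, err := submitJob(stub.Client(), stopURL(stub.URL, 300, []string{"container1"}))
	fmt.Println(id, err)
}
```

Against a running controller, the returned ID would then be passed to GET /job_status/{job_id} to follow the job.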
The helper mode is designed to run as a systemd service for automatic container lifecycle management:
Lifecycle:
- Wait for controller server to become ready (60 second timeout)
- Apply configured startup delay (default: 5 seconds)
- Submit start job for all managed containers
- Wait for job completion
- Run until receiving SIGTERM/SIGINT
- Submit stop job for all containers
- Wait for stop completion and exit gracefully
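The first lifecycle step, waiting for the controller to become ready within a timeout, can be sketched as a generic polling loop. This is an illustration under assumptions: `waitReady` and `simulateStartup` are hypothetical names, and the check function here fakes a controller that becomes reachable on the third attempt. In a real helper the check would presumably issue GET /ping.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitReady calls check immediately and then once per interval until it
// succeeds or the timeout elapses. The last check error is wrapped in the
// returned error so the caller can see why readiness was never reached.
func waitReady(ctx context.Context, timeout, interval time.Duration, check func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		err := check(ctx)
		if err == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("not ready after %s: %w", timeout, err)
		case <-ticker.C:
			// Next attempt.
		}
	}
}

// simulateStartup runs waitReady against a fake check that fails twice
// before succeeding, and reports how many attempts were made.
func simulateStartup() (int, error) {
	attempts := 0
	err := waitReady(context.Background(), time.Second, 10*time.Millisecond, func(context.Context) error {
		attempts++
		if attempts < 3 {
			return errors.New("connection refused")
		}
		return nil
	})
	return attempts, err
}

func main() {
	attempts, err := simulateStartup()
	fmt.Println(attempts, err) // prints 3 <nil>
}
```

The same loop shape would fit the later "wait for job completion" steps, with the check function querying the job status instead.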
Blocked Operations Handling: When start/stop operations are blocked (HTTP 503 response), the helper will:
- Log an INFO message: "Container start/stop operation is currently blocked, skipping"
- Continue running without failing
- This allows the helper to gracefully handle maintenance windows
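The blocked state that the helper runs into can be modeled on the server side as an expiry timestamp, which gives both manual unblocking and automatic unblocking after the requested number of minutes. The sketch below is only one possible design. The `blockGate` type and its methods are hypothetical and are not taken from the controller's source.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// blockGate records until when start/stop operations are blocked.
// (Hypothetical type for illustration.)
type blockGate struct {
	mu    sync.Mutex
	until time.Time // zero value means "not blocked"
}

// Block blocks operations for the given number of minutes, counted from now.
func (g *blockGate) Block(now time.Time, minutes int) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.until = now.Add(time.Duration(minutes) * time.Minute)
}

// Unblock lifts the block immediately.
func (g *blockGate) Unblock() {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.until = time.Time{}
}

// Blocked reports whether a request arriving at now should get HTTP 503.
// Expiry needs no timer: the block simply stops applying once now >= until.
func (g *blockGate) Blocked(now time.Time) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	return now.Before(g.until)
}

// demo walks the gate through an active block, expiry, and a manual unblock.
func demo() []bool {
	start := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	g := &blockGate{}

	g.Block(start, 10)
	during := g.Blocked(start.Add(5 * time.Minute)) // inside the window
	after := g.Blocked(start.Add(10 * time.Minute)) // window has expired

	g.Block(start, 10)
	g.Unblock()
	manual := g.Blocked(start.Add(time.Minute)) // unblocked by hand

	return []bool{during, after, manual}
}

func main() {
	fmt.Println(demo()) // prints [true false false]
}
```

Passing the current time in explicitly keeps the gate easy to test without sleeping.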
Options:
    ./build/sdc helper \
      --controller-url http://127.0.0.1:3377 \
      --startup-delay 5s \
      --timeout 600 \
      --poll-interval 5s
- --controller-url: Controller API URL (default: http://127.0.0.1:3377)
- --startup-delay: Delay before starting containers (default: 5s)
- --timeout: Job timeout in seconds (default: 600)
- --poll-interval: Status polling interval (default: 5s)
License: GNU General Public License v3.0