server side
The OtterTune server is responsible for processing and storing tuning data, scheduling jobs to compute OtterTune’s ML models and make configuration recommendations, and visualizing the results from each tuning session in its front-end web interface. The tuning manager is written in Python using the Django web framework, with MySQL as Django’s back-end database. We use Celery to schedule and execute the tasks that create OtterTune’s ML models and recommend new configurations. Celery is a task queue and scheduler that is easy to integrate with web frameworks like Django. We implemented all of OtterTune’s ML models using Python’s scikit-learn and Google’s TensorFlow.
The server receives a tuning task together with the collected data (such as knobs and metrics) from clients. It parses the data and stores it in the data repository. Celery then schedules and runs ML tasks to recommend a new configuration for the client.
OtterTune first passes the data to the Workload Characterization component. This component identifies a smaller set of DBMS metrics that best capture the variability in performance and the distinguishing characteristics for different workloads. Next, the Knob Identification component generates a ranked list of the knobs that most affect the DBMS’s performance. OtterTune then feeds all of this information to the Automatic Tuner. This last component maps the target DBMS’s workload to the most similar workload in its data repository, and reuses this workload data to generate better configurations.
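The workload-mapping idea in the last step can be illustrated with a small nearest-neighbor lookup. This is only a sketch under stated assumptions, not OtterTune's actual implementation: the function names, the workload ids, the metric vectors, and the choice of plain Euclidean distance are all hypothetical and chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length metric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def map_workload(target_metrics, repository):
    """Return the id of the stored workload whose metrics are closest to the target."""
    return min(repository, key=lambda wid: euclidean(target_metrics, repository[wid]))


# Hypothetical repository: workload id -> vector of pruned, normalized metrics.
repository = {
    "tpcc": [0.9, 0.1, 0.4],
    "ycsb": [0.2, 0.8, 0.5],
    "tpch": [0.1, 0.2, 0.9],
}

# Hypothetical metrics observed for the target DBMS.
target = [0.25, 0.7, 0.5]

best_match = map_workload(target, repository)
print(best_match)
```

In this toy setup, the data stored for the matched workload would then be reused as training data when generating a configuration for the target.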
Workload Characterization and Knob Identification are periodic tasks that run at a fixed interval (e.g., every 20 minutes). Their results (non-redundant metrics and ranked knobs) may change from run to run because the training data can grow between runs. The Automatic Recommendation step, in contrast, runs each time the server receives a tuning task and recommends a configuration for the client.
To run these ML tasks, start the message broker that Celery requires (such as RabbitMQ), then start a Celery worker.
sudo rabbitmq-server -detached
python3 manage.py celery worker --loglevel=info --pool=threads
We use Celery beat to run the periodic tasks.
python3 manage.py celerybeat --verbosity=2 --loglevel=info
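For readers unfamiliar with Celery beat, a schedule entry for a task that runs every 20 minutes might look roughly like the following Django settings fragment. This is a sketch, not OtterTune's actual configuration: the entry name and the dotted task path are hypothetical. `CELERYBEAT_SCHEDULE` is the old-style setting name used by Celery 3.x with django-celery, which matches the `manage.py celerybeat` command above; Celery 4 and later call the same setting `beat_schedule`.

```python
from datetime import timedelta

# Old-style (Celery 3.x / django-celery) beat schedule setting.
CELERYBEAT_SCHEDULE = {
    # Hypothetical entry name.
    "periodic-ml-tasks": {
        # Hypothetical dotted path to the task function; replace with the real one.
        "task": "myapp.tasks.run_periodic_ml_tasks",
        # Run once every 20 minutes, as in the example interval mentioned earlier.
        "schedule": timedelta(minutes=20),
    },
}
```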
Celery schedules an automatic recommendation task whenever the server receives a tuning task. The Automatic Recommendation step needs the ranked knobs and non-redundant metrics produced by Knob Identification and Workload Characterization, so make sure these periodic tasks have already run, using Celery beat.
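The dependency described above can be pictured as a readiness check performed before a recommendation is attempted. The sketch below is purely illustrative: the function name, the dictionary keys, and the sample knob and metric names are assumptions, and the real server stores these results in its data repository rather than in a dictionary.

```python
def ready_for_recommendation(latest_results):
    """Return True only if both periodic tasks have produced a non-empty result."""
    required = ("ranked_knobs", "pruned_metrics")  # hypothetical key names
    return all(latest_results.get(name) for name in required)


# Before the periodic tasks have ever run, nothing is available.
before = {"ranked_knobs": None, "pruned_metrics": None}

# After one run of each task (hypothetical knob and metric names).
after = {"ranked_knobs": ["knob_a", "knob_b"], "pruned_metrics": ["metric_x"]}

print(ready_for_recommendation(before))
print(ready_for_recommendation(after))
```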