Tools for performing hyperparameter search with Scikit-Learn and Dask.
- Drop-in replacement for Scikit-Learn's GridSearchCV and RandomizedSearchCV.
- Hyperparameter optimization can be done in parallel using threads, processes, or distributed across a cluster.
- Works well with Dask collections. Dask arrays, dataframes, and delayed objects can be passed to fit.
- Candidate estimators with identical parameters and inputs will only be fit once. For composite estimators such as Pipeline, this can be significantly more efficient because it avoids expensive repeated computations.
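As a minimal sketch of the Pipeline case, the search below tunes a two-step pipeline on the digits dataset. The step names (pca, svc) and the small grid are illustrative choices, not taken from the project. The import falls back to scikit-learn's own search classes when dask-searchcv cannot be imported; this is an assumption made for portability and works only because the two share the same interface. With the fallback, the shared-fit optimization described above does not apply.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Assumption for portability: fall back to scikit-learn's search classes
# when dask-searchcv cannot be imported. This relies only on the two
# libraries exposing the same interface.
try:
    import dask_searchcv as dcv
except ImportError:
    import sklearn.model_selection as dcv

digits = load_digits()

# Two-step pipeline: dimensionality reduction followed by a classifier.
pipe = Pipeline([('pca', PCA()),
                 ('svc', SVC(kernel='rbf'))])

# Pipeline parameters are addressed as <step name>__<parameter>.
# Candidates that share 'pca__n_components' need the same PCA fit on each
# training fold, so that step is a candidate for being fit only once.
param_grid = {'pca__n_components': [16, 32],
              'svc__C': [1, 10]}

search = dcv.GridSearchCV(pipe, param_grid, cv=3)
search.fit(digits.data, digits.target)

print(search.best_params_)
print(search.best_score_)
```

Here the two values of svc__C reuse the same PCA configuration, so each distinct PCA step only needs fitting once per fold rather than once per candidate.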
For more information, check out the documentation.
Dask-searchcv is available via conda or pip:

# Install with conda
$ conda install dask-searchcv -c conda-forge

# Install with pip
$ pip install dask-searchcv
from sklearn.datasets import load_digits
from sklearn.svm import SVC
import dask_searchcv as dcv
import numpy as np

digits = load_digits()

# Grid of 9 x 9 x 2 = 162 candidate parameter settings
param_space = {'C': np.logspace(-4, 4, 9),
               'gamma': np.logspace(-4, 4, 9),
               'class_weight': [None, 'balanced']}

model = SVC(kernel='rbf')

# Exhaustive search over the grid with 3-fold cross-validation
search = dcv.GridSearchCV(model, param_space, cv=3)
search.fit(digits.data, digits.target)
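The same parameter space can also be sampled rather than searched exhaustively. The sketch below uses RandomizedSearchCV with the arguments known from Scikit-Learn (n_iter, random_state) and then reads the fitted attributes best_params_, best_score_, and cv_results_. The values n_iter=5 and random_state=0 are arbitrary choices for illustration, and the import fallback to scikit-learn is an assumption made so the snippet also runs where dask-searchcv is not installed.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import SVC

# Assumption for portability: use scikit-learn's search classes when
# dask-searchcv cannot be imported; both expose the same interface.
try:
    import dask_searchcv as dcv
except ImportError:
    import sklearn.model_selection as dcv

digits = load_digits()

param_space = {'C': np.logspace(-4, 4, 9),
               'gamma': np.logspace(-4, 4, 9),
               'class_weight': [None, 'balanced']}

# Sample 5 of the 162 possible settings instead of fitting all of them.
search = dcv.RandomizedSearchCV(SVC(kernel='rbf'), param_space,
                                n_iter=5, cv=3, random_state=0)
search.fit(digits.data, digits.target)

# Fitted attributes follow the Scikit-Learn conventions.
print(search.best_params_)                # best sampled setting
print(search.best_score_)                 # its mean cross-validated score
print(len(search.cv_results_['params']))  # number of settings evaluated
```

Random sampling trades coverage of the grid for a fixed, much smaller number of fits, which can be a reasonable first pass before a full grid search.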