Tuning Automation

To address the unfairness that can arise from the tuning procedure, we implemented a tuning automation in DeepOBS. Here, we describe how to use it. We also provide some basic functionality to monitor the tuning process; this is not explained here, but can be found in the API section of the Tuner. We further describe a comparative and fair usage of the tuning automation in the Suggested Protocol.

We provide three different Tuner classes: GridSearch, RandomSearch and GP (a Bayesian optimization method with a Gaussian process surrogate). You can find detailed information about them in the API section of the Tuner. All examples in this section use the PyTorch framework.
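
For instance, a simple grid search over SGD's learning rate could be set up as sketched below. This is a minimal sketch, assuming the GridSearch constructor mirrors the GP example in the next subsection but takes a discretized grid instead of bounds; the grid values and the resource budget are purely illustrative.

from deepobs.tuner import GridSearch
from torch.optim import SGD
from deepobs.pytorch.runners import StandardRunner
import numpy as np

optimizer_class = SGD
hyperparams = {"lr": {"type": float}}

# Discretized grid of candidate learning rates (illustrative values).
grid = {'lr': np.logspace(-5, 2, 36)}

# One evaluation per grid point, hence 36 runs in total.
tuner = GridSearch(optimizer_class, hyperparams, grid,
                   runner=StandardRunner, ressources=36)
tuner.tune('quadratic_deep', num_epochs=2, output_dir='./grid_search')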

Bayesian Optimization (GP)

The Bayesian optimization method with a Gaussian process surrogate is more complex. First, you have to specify the bounds of the suggestions. Additionally, you can set transformations of the search space. In combination with the bounds, these can be used to rescale the kernel or to optimize discrete values:

from deepobs.tuner import GP
from torch.optim import SGD
from sklearn.gaussian_process.kernels import Matern
from deepobs import config
from deepobs.pytorch.runners import StandardRunner

optimizer_class = SGD
hyperparams = {"lr": {"type": float},
               "momentum": {"type": float},
               "nesterov": {"type": bool}}

# The bounds for the suggestions (the lr bounds are log10 exponents,
# see the transformation below).
bounds = {'lr': (-5, 2),
          'momentum': (0.5, 1),
          'nesterov': (0, 1)}


# Corresponds to rescaling the kernel in log space.
def lr_transform(lr):
    return 10**lr


# Nesterov is discrete but will be suggested as a continuous value.
def nesterov_transform(nesterov):
    return bool(round(nesterov))


# The transformations of the search space. Momentum does not need a transformation.
transformations = {'lr': lr_transform,
                   'nesterov': nesterov_transform}

tuner = GP(optimizer_class, hyperparams, bounds, runner=StandardRunner,
           ressources=36, transformations=transformations)

# Tune with a Matern kernel and rerun the best setting with 10 different seeds.
tuner.tune('quadratic_deep', kernel=Matern(nu=2.5), rerun_best_setting=True,
           num_epochs=2, output_dir='./gp_tuner')

You can download this example and use it as a template. Since Bayesian optimization is inherently sequential, we do not offer a parallelized version of it.