Learning Rate Schedule Runner

class deepobs.pytorch.runners.LearningRateScheduleRunner(optimizer_class, hyperparameter_names)

Bases: deepobs.pytorch.runners.runner.PTRunner

A runner for learning rate schedules. It runs a standard training loop either with fixed hyperparameters or with a learning rate schedule. It is intended as a template for implementing custom runners.

__init__(optimizer_class, hyperparameter_names)

Creates a new Runner instance.

Parameters:
  • optimizer_class -- The class of the optimizer that is run on the testproblems. For PyTorch, this must be a subclass of torch.optim.Optimizer. For TensorFlow, it must be a subclass of tf.train.Optimizer.
  • hyperparameter_names -- A nested dictionary that lists all hyperparameters of the optimizer, their type and their default values (if they have any).
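
The nested layout described above can be illustrated with a small helper that merges user-supplied values with the declared defaults. This is only a sketch of the idea, not the runner's actual code: resolve_hyperparams is a hypothetical name, and casting each value to its declared type is an assumption.

```python
def resolve_hyperparams(spec, given):
    """Merge user-supplied values with the defaults declared in ``spec``.

    ``spec`` has the nested layout {name: {'type': <callable>, 'default': <value>}},
    where the 'default' entry is optional. (Hypothetical helper, not a DeepOBS API.)
    """
    unknown = set(given) - set(spec)
    if unknown:
        raise KeyError(f"Unknown hyperparameters: {sorted(unknown)}")

    resolved = {}
    for name, info in spec.items():
        if name in given:
            value = given[name]
        elif "default" in info:
            value = info["default"]
        else:
            raise ValueError(f"Hyperparameter '{name}' has no default and was not given")
        # Assumption: values are cast to the declared type, so an int 1 becomes 1.0.
        resolved[name] = info["type"](value)
    return resolved


spec = {
    "lr": {"type": float},
    "momentum": {"type": float, "default": 0.99},
    "uses_nesterov": {"type": bool, "default": False},
}
resolved = resolve_hyperparams(spec, {"lr": 0.01})
```

Here 'lr' has no default and therefore has to be supplied, while 'momentum' and 'uses_nesterov' fall back to their declared defaults.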

Example

>>> optimizer_class = tf.train.MomentumOptimizer
>>> hyperparams = {'lr': {'type': float},
...                'momentum': {'type': float, 'default': 0.99},
...                'uses_nesterov': {'type': bool, 'default': False}}
>>> runner = StandardRunner(optimizer_class, hyperparams)

static create_testproblem(testproblem, batch_size, weight_decay, random_seed)

Sets up the deepobs.pytorch.testproblems.testproblem instance.

Parameters:
  • testproblem (str) -- The name of the testproblem.
  • batch_size (int) -- Batch size that is used for training.
  • weight_decay (float) -- Regularization factor.
  • random_seed (int) -- The random seed of the framework.
Returns:

An instance of deepobs.pytorch.testproblems.testproblem

Return type:

deepobs.pytorch.testproblems.testproblem

static evaluate(tproblem, phase)

Evaluates the performance of the current state of the testproblem's model. Has to be called at the beginning of every epoch within the training method. Returns the loss and the accuracy.

Parameters:
  • tproblem (testproblem) -- The testproblem instance to evaluate
  • phase (str) -- The phase of the evaluation. Must be one of 'TRAIN', 'VALID' or 'TEST'
Returns:

The loss of the current state, followed by the accuracy of the current state.

Return type:

float, float

parse_args(testproblem, hyperparams, batch_size, num_epochs, random_seed, data_dir, output_dir, weight_decay, no_logs, train_log_interval, print_train_iter, tb_log, tb_log_dir, training_params)

Constructs an argparse.ArgumentParser and parses the arguments from command line.

Parameters:
  • testproblem (str) -- Name of the testproblem.
  • hyperparams (dict) -- The explicit values of the optimizer hyperparameters that are used for training.
  • batch_size (int) -- Mini-batch size for the training data.
  • num_epochs (int) -- The number of training epochs.
  • random_seed (int) -- The torch random seed.
  • data_dir (str) -- The path where the data is stored.
  • output_dir (str) -- Path of the folder where the results are written to.
  • weight_decay (float) -- Regularization factor for the testproblem.
  • no_logs (bool) -- Whether to skip writing the output (True disables writing).
  • train_log_interval (int) -- Mini-batch interval for logging.
  • print_train_iter (bool) -- Whether to print the training progress at each train_log_interval.
  • tb_log (bool) -- Whether to use tensorboard logging or not.
  • tb_log_dir (str) -- The path where to save tensorboard events.
  • training_params (dict) -- Kwargs for the training method.
Returns:

A dictionary of all arguments.

Return type:

dict
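
One way to combine values passed in Python with values parsed from the command line is sketched below. It assumes that only settings left as None are exposed as command line options. That assumption, the helper name parse_missing_args and the option names are illustrative and not taken from the runner's source.

```python
import argparse


def parse_missing_args(argv, batch_size=None, num_epochs=None, random_seed=None):
    """Parse from ``argv`` only those settings that were not passed in Python."""
    preset = {
        "batch_size": batch_size,
        "num_epochs": num_epochs,
        "random_seed": random_seed,
    }
    parser = argparse.ArgumentParser(description="Hypothetical runner CLI")
    for name, value in preset.items():
        if value is None:
            # Not fixed in code, so it has to come from the command line.
            parser.add_argument(f"--{name}", type=int, required=True)
    parsed = vars(parser.parse_args(argv))
    # Values fixed in code take precedence; parsed values fill the gaps.
    return {
        name: parsed[name] if value is None else value
        for name, value in preset.items()
    }


args = parse_missing_args(["--num_epochs", "5", "--random_seed", "42"], batch_size=128)
```

In this sketch batch_size is fixed in code, so only num_epochs and random_seed are read from the argument list.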

run(testproblem=None, hyperparams=None, batch_size=None, num_epochs=None, random_seed=None, data_dir=None, output_dir=None, weight_decay=None, no_logs=None, train_log_interval=None, print_train_iter=None, tb_log=None, tb_log_dir=None, skip_if_exists=False, **training_params)

Runs a testproblem with the optimizer_class. It performs the following tasks:
  1. Set up the testproblem.
  2. Run the training (must be implemented by the subclass).
  3. Merge and write the output.

Parameters:
  • testproblem (str) -- Name of the testproblem.
  • hyperparams (dict) -- The explicit values of the optimizer hyperparameters that are used for training.
  • batch_size (int) -- Mini-batch size for the training data.
  • num_epochs (int) -- The number of training epochs.
  • random_seed (int) -- The torch random seed.
  • data_dir (str) -- The path where the data is stored.
  • output_dir (str) -- Path of the folder where the results are written to.
  • weight_decay (float) -- Regularization factor for the testproblem.
  • no_logs (bool) -- Whether to skip writing the output (True disables writing).
  • train_log_interval (int) -- Mini-batch interval for logging.
  • print_train_iter (bool) -- Whether to print the training progress at each train_log_interval.
  • tb_log (bool) -- Whether to use tensorboard logging or not.
  • tb_log_dir (str) -- The path where to save tensorboard events.
  • skip_if_exists (bool) -- Skip training if the output already exists.
  • training_params (dict) -- Kwargs for the training method.
Returns:

A dictionary of the form {<...meta data...>, 'test_losses': test_losses, 'valid_losses': valid_losses, 'train_losses': train_losses, 'test_accuracies': test_accuracies, 'valid_accuracies': valid_accuracies, 'train_accuracies': train_accuracies}, where <...meta data...> stores the run args.

Return type:

dict
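
As a sketch of how the returned dictionary might be consumed, the hypothetical helper below picks the index with the lowest validation loss. The numbers are made-up placeholders that only show the layout; they are not results of any run, and the meta data entries are omitted.

```python
def best_index(output, metric="valid_losses"):
    """Return the position of the smallest value in output[metric]."""
    values = output[metric]
    return min(range(len(values)), key=values.__getitem__)


# Placeholder values for illustration only (not real results).
output = {
    "train_losses": [2.0, 1.2, 0.9],
    "valid_losses": [2.1, 1.3, 1.4],
    "test_losses": [2.2, 1.35, 1.45],
    "train_accuracies": [0.30, 0.55, 0.70],
    "valid_accuracies": [0.28, 0.52, 0.50],
    "test_accuracies": [0.27, 0.51, 0.49],
}

idx = best_index(output)
test_accuracy_at_best = output["test_accuracies"][idx]
```

Selecting on the validation metric and then reading the test metric at the same index keeps the test data out of the selection step.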

run_exists(testproblem=None, hyperparams=None, batch_size=None, num_epochs=None, random_seed=None, data_dir=None, output_dir=None, weight_decay=None, no_logs=None, train_log_interval=None, print_train_iter=None, tb_log=None, tb_log_dir=None, **training_params)

Returns whether the output file for this run already exists.

Parameters:
  • See the run method.
Returns:

The first return value is True if the .json output file already exists, else False. The second is a list of the paths to the files that match the run.

Return type:

bool, list(str)

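
A check of this kind can be sketched with the standard glob module. The helper name matching_outputs, the folder layout and the file naming below are hypothetical and only illustrate the (bool, list) return shape.

```python
import glob
import os
import tempfile


def matching_outputs(run_directory, file_stem):
    """Return (exists, paths) for JSON files in run_directory starting with file_stem."""
    pattern = os.path.join(
        glob.escape(run_directory), glob.escape(file_stem) + "*.json"
    )
    paths = sorted(glob.glob(pattern))
    return bool(paths), paths


with tempfile.TemporaryDirectory() as run_directory:
    # Nothing has been written yet, so no file matches.
    before = matching_outputs(run_directory, "random_seed__42")
    # Create an empty file with a hypothetical output name.
    open(os.path.join(run_directory, "random_seed__42__run.json"), "w").close()
    after = matching_outputs(run_directory, "random_seed__42")
```
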
training(tproblem, hyperparams, num_epochs, print_train_iter, train_log_interval, tb_log, tb_log_dir, lr_sched_epochs=None, lr_sched_factors=None)

Performs the training and stores the metrics.

Parameters:
  • tproblem (deepobs.[tensorflow/pytorch].testproblems.testproblem) -- The testproblem instance to train on.
  • hyperparams (dict) -- The optimizer hyperparameters to use for the training.
  • num_epochs (int) -- The number of training epochs.
  • print_train_iter (bool) -- Whether to print the training progress at every train_log_interval.
  • train_log_interval (int) -- Mini-batch interval for logging.
  • tb_log (bool) -- Whether to use tensorboard logging or not.
  • tb_log_dir (str) -- The path where to save tensorboard events.
  • lr_sched_epochs (list) -- The epochs where to adjust the learning rate.
  • lr_sched_factors (list) -- The corresponding factors by which to adjust the learning rate.
Returns:

The logged metrics, of the form {'test_losses': [...], 'valid_losses': [...], 'train_losses': [...], 'test_accuracies': [...], 'valid_accuracies': [...], 'train_accuracies': [...]}, where each value is a list that was filled during training.

Return type:

dict
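
The interplay of lr_sched_epochs and lr_sched_factors can be sketched as a piecewise-constant schedule. The sketch assumes that each factor multiplies the initial learning rate (not the previous value) and takes effect from its listed epoch onward; both points are assumptions, and scheduled_lr is a hypothetical helper.

```python
from bisect import bisect_right


def scheduled_lr(base_lr, epoch, lr_sched_epochs=None, lr_sched_factors=None):
    """Learning rate used in ``epoch`` under a piecewise-constant schedule."""
    if lr_sched_epochs is None and lr_sched_factors is None:
        # No schedule given: train with a fixed learning rate.
        return base_lr
    if (
        lr_sched_epochs is None
        or lr_sched_factors is None
        or len(lr_sched_epochs) != len(lr_sched_factors)
    ):
        raise ValueError("lr_sched_epochs and lr_sched_factors must have equal length")
    if list(lr_sched_epochs) != sorted(lr_sched_epochs):
        raise ValueError("lr_sched_epochs must be sorted in ascending order")

    # Number of schedule points that have already been reached.
    reached = bisect_right(lr_sched_epochs, epoch)
    if reached == 0:
        return base_lr
    # Assumption: the factor scales the initial learning rate, not the current one.
    return base_lr * lr_sched_factors[reached - 1]


rates = [scheduled_lr(0.5, e, [50, 100], [0.1, 0.01]) for e in (0, 49, 50, 100)]
```

With an initial learning rate of 0.5, the learning rate stays at 0.5 before epoch 50, drops to 0.5 * 0.1 from epoch 50, and to 0.5 * 0.01 from epoch 100.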

static write_output(output, run_folder_name, file_name)

Writes the JSON output.

Parameters:
  • output (dict) -- Output of the training loop of the runner.
  • run_folder_name (str) -- The name of the output folder.
  • file_name (str) -- The file name where the output is written to.
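
Writing such an output dictionary to disk can be sketched with the standard json module. The helper name write_json_output, the automatic '.json' suffix and the folder names below are assumptions made for illustration, and the values are placeholders.

```python
import json
import os
import tempfile


def write_json_output(output, run_folder_name, file_name):
    """Write ``output`` as JSON to <run_folder_name>/<file_name>.json and return the path."""
    # Create the run folder (and any missing parents) if it does not exist yet.
    os.makedirs(run_folder_name, exist_ok=True)
    # Assumption: file_name is given without extension and '.json' is appended.
    path = os.path.join(run_folder_name, file_name + ".json")
    with open(path, "w") as f:
        json.dump(output, f, indent=4)
    return path


with tempfile.TemporaryDirectory() as tmp:
    run_folder = os.path.join(tmp, "results", "hypothetical_run")
    # Placeholder values, not real results.
    path = write_json_output({"train_losses": [1.0, 0.5]}, run_folder, "random_seed__42")
    with open(path) as f:
        restored = json.load(f)
```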