Learning Rate Schedule Runner

Using the learning rate schedule runner adds two more training parameters to the training loop: the epochs at which the learning rate is adjusted and the factors by which it is adjusted. The example below shows how to use them in a run file; both parameters are also automatically added as command line arguments.

import tensorflow as tf
import deepobs.tensorflow as tfobs

optimizer_class = tf.train.MomentumOptimizer
hyperparams = {'learning_rate': {'type': float},
               'momentum': {'type': float, 'default': 0.99},
               'use_nesterov': {'type': bool, 'default': False}}

runner = tfobs.runners.LearningRateScheduleRunner(optimizer_class, hyperparams)

# Step schedule: start at 1e-2, multiply the base learning rate by 0.1
# after epoch 2 and by 0.01 after epoch 4.
runner.run(testproblem='quadratic_deep',
           hyperparams={'learning_rate': 1e-2},
           num_epochs=10,
           lr_sched_epochs=[2, 4],
           lr_sched_factors=[0.1, 0.01])
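For intuition, the schedule above trains epochs 0-1 with the base learning rate of 1e-2, epochs 2-3 with 1e-3, and epochs 4-9 with 1e-4. The helper below is a minimal illustration of this step-schedule semantics, not part of DeepOBS; it assumes each factor multiplies the base learning rate rather than the current one.

def step_schedule_lrs(base_lr, sched_epochs, sched_factors, num_epochs):
    """Learning rate used in each epoch under a step schedule."""
    lrs, lr = [], base_lr
    for epoch in range(num_epochs):
        if epoch in sched_epochs:
            # Each factor is applied to the base learning rate.
            lr = base_lr * sched_factors[sched_epochs.index(epoch)]
        lrs.append(lr)
    return lrs

print(step_schedule_lrs(1e-2, [2, 4], [0.1, 0.01], 10))
# [0.01, 0.01, 0.001, 0.001, 0.0001, 0.0001, 0.0001, 0.0001, 0.0001, 0.0001]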
class deepobs.tensorflow.runners.LearningRateScheduleRunner(optimizer_class, hyperparameter_names)[source]

Bases: deepobs.tensorflow.runners.runner.TFRunner

__init__(optimizer_class, hyperparameter_names)[source]

Creates a new runner instance.

Parameters:
  • optimizer_class -- The optimizer class of the optimizer that is run on the testproblems. For PyTorch this must be a subclass of torch.optim.Optimizer. For TensorFlow a subclass of tf.train.Optimizer.
  • hyperparameter_names -- A nested dictionary that lists all hyperparameters of the optimizer, their type and their default values (if they have any).

Example

>>> optimizer_class = tf.train.MomentumOptimizer
>>> hyperparams = {'learning_rate': {'type': float},
>>>    'momentum': {'type': float, 'default': 0.99},
>>>    'use_nesterov': {'type': bool, 'default': False}}
>>> runner = LearningRateScheduleRunner(optimizer_class, hyperparams)
static create_testproblem(testproblem, batch_size, l2_reg, random_seed)

Sets up the deepobs.tensorflow.testproblems.testproblem instance.

Parameters:
  • testproblem (str) -- The name of the testproblem.
  • batch_size (int) -- Batch size that is used for training.
  • l2_reg (float) -- The L2 regularization factor.
  • random_seed (int) -- The random seed of the framework.
Returns:

An instance of deepobs.tensorflow.testproblems.testproblem

Return type:

deepobs.tensorflow.testproblems.testproblem
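A minimal usage sketch follows; the argument values are illustrative, and treating l2_reg=None as "use the testproblem's default regularization" is an assumption.

from deepobs.tensorflow.runners import LearningRateScheduleRunner

# Set up the test problem once, the way run() does internally.
tproblem = LearningRateScheduleRunner.create_testproblem(
    testproblem='quadratic_deep', batch_size=128, l2_reg=None,
    random_seed=42)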

static evaluate(tproblem, sess, loss, phase)

Computes the average loss and accuracy in the evaluation phase.

Parameters:
  • tproblem (deepobs.tensorflow.testproblems.testproblem) -- The testproblem instance.
  • sess (tensorflow.Session) -- The current TensorFlow session.
  • loss -- The TensorFlow operation that computes the loss.
  • phase (str) -- The phase of the evaluation. Must be one of 'TRAIN', 'VALID' or 'TEST'.

static init_summary(loss, learning_rate_var, batch_size, tb_log_dir)

Initializes the TensorBoard summaries.

parse_args(testproblem, hyperparams, batch_size, num_epochs, random_seed, data_dir, output_dir, l2_reg, no_logs, train_log_interval, print_train_iter, tb_log, tb_log_dir, training_params)

Constructs an argparse.ArgumentParser and parses the arguments from the command line.

Parameters:
  • testproblem (str) -- Name of the testproblem.
  • hyperparams (dict) -- The explicit values of the hyperparameters of the optimizer that are used for training.
  • batch_size (int) -- Mini-batch size for the training data.
  • num_epochs (int) -- The number of training epochs.
  • random_seed (int) -- The random seed of the framework.
  • data_dir (str) -- The path where the data is stored.
  • output_dir (str) -- Path of the folder where the results are written to.
  • l2_reg (float) -- Regularization factor for the testproblem.
  • no_logs (bool) -- Whether to write the output or not.
  • train_log_interval (int) -- Mini-batch interval for logging.
  • print_train_iter (bool) -- Whether to print the training progress at each train_log_interval.
  • tb_log (bool) -- Whether to use TensorBoard logging or not.
  • tb_log_dir (str) -- The path where to save TensorBoard events.
  • training_params (dict) -- Kwargs for the training method.
Returns:

A dictionary of all arguments.

Return type:

dict
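Since parse_args fills in every argument that is not passed to run() explicitly, a run script can delegate its entire configuration to the command line. Below is a sketch of such a script; the file name momentum_runner.py and the exact flag spellings are assumptions derived from the parameter names above.

# momentum_runner.py (hypothetical file name)
import tensorflow as tf
import deepobs.tensorflow as tfobs

runner = tfobs.runners.LearningRateScheduleRunner(
    tf.train.MomentumOptimizer,
    {'learning_rate': {'type': float},
     'momentum': {'type': float, 'default': 0.99}})

# All unspecified arguments are parsed from the command line, e.g.:
#   python momentum_runner.py quadratic_deep --learning_rate 1e-2 \
#       --num_epochs 10 --lr_sched_epochs 2 4 --lr_sched_factors 0.1 0.01
runner.run()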

run(testproblem=None, hyperparams=None, batch_size=None, num_epochs=None, random_seed=None, data_dir=None, output_dir=None, l2_reg=None, no_logs=None, train_log_interval=None, print_train_iter=None, tb_log=None, tb_log_dir=None, skip_if_exists=False, **training_params)
Runs a testproblem with the optimizer_class. Has the following tasks:
  1. setup testproblem
  2. run the training (must be implemented by subclass)
  3. merge and write output
Parameters:
  • testproblem (str) -- Name of the testproblem.
  • hyperparams (dict) -- The explicit values of the hyperparameters of the optimizer that are used for training.
  • batch_size (int) -- Mini-batch size for the training data.
  • num_epochs (int) -- The number of training epochs.
  • random_seed (int) -- The random seed of the framework.
  • data_dir (str) -- The path where the data is stored.
  • output_dir (str) -- Path of the folder where the results are written to.
  • l2_reg (float) -- Regularization factor for the testproblem.
  • no_logs (bool) -- Whether to write the output or not.
  • train_log_interval (int) -- Mini-batch interval for logging.
  • print_train_iter (bool) -- Whether to print the training progress at each train_log_interval.
  • tb_log (bool) -- Whether to use TensorBoard logging or not.
  • tb_log_dir (str) -- The path where to save TensorBoard events.
  • skip_if_exists (bool) -- Skip training if the output already exists.
  • training_params (dict) -- Kwargs for the training method.
Returns:

{<...meta data...>, 'test_losses': test_losses, 'valid_losses': valid_losses, 'train_losses': train_losses, 'test_accuracies': test_accuracies, 'valid_accuracies': valid_accuracies, 'train_accuracies': train_accuracies}, where <...meta data...> stores the run args.

Return type:

dict
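A hedged sketch of consuming the returned dictionary, reusing the runner from the example at the top of this page; the metric keys follow the Returns description above.

output = runner.run(testproblem='quadratic_deep',
                    hyperparams={'learning_rate': 1e-2},
                    num_epochs=10,
                    lr_sched_epochs=[2, 4],
                    lr_sched_factors=[0.1, 0.01])

# One entry per evaluation epoch.
print(output['test_losses'])
print(output['valid_accuracies'])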

run_exists(testproblem=None, hyperparams=None, batch_size=None, num_epochs=None, random_seed=None, data_dir=None, output_dir=None, l2_reg=None, no_logs=None, train_log_interval=None, print_train_iter=None, tb_log=None, tb_log_dir=None, **training_params)

Return whether output file for this run already exists.

Parameters: See the run method.
Returns: A boolean that is True if the .json output file already exists, else False, and a list of the paths to the files that match the run.
Return type: bool, list(str)
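A sketch of combining run_exists with skip_if_exists to avoid redundant training; the arguments mirror the run example above.

exists, matching_files = runner.run_exists(
    testproblem='quadratic_deep',
    hyperparams={'learning_rate': 1e-2},
    num_epochs=10)

if exists:
    print('Output already written to:', matching_files)
else:
    runner.run(testproblem='quadratic_deep',
               hyperparams={'learning_rate': 1e-2},
               num_epochs=10, skip_if_exists=True)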
training(tproblem, hyperparams, num_epochs, print_train_iter, train_log_interval, tb_log, tb_log_dir, lr_sched_epochs=None, lr_sched_factors=None)[source]

Performs the training and stores the metrics.

Parameters:
  • tproblem (deepobs.[tensorflow/pytorch]testproblems.testproblem) -- The testproblem instance to train on.
  • hyperparams (dict) -- The optimizer hyperparameters to use for the training.
  • num_epochs (int) -- The number of training epochs.
  • print_train_iter (bool) -- Whether to print the training progress at every train_log_interval.
  • train_log_interval (int) -- Mini-batch interval for logging.
  • tb_log (bool) -- Whether to use TensorBoard logging or not.
  • tb_log_dir (str) -- The path where to save TensorBoard events.
  • lr_sched_epochs (list) -- The epochs at which to adjust the learning rate.
  • lr_sched_factors (list) -- The corresponding factors by which to adjust the learning rate.
Returns:

The logged metrics. Is of the form: {'test_losses': [...], 'valid_losses': [...], 'train_losses': [...], 'test_accuracies': [...], 'valid_accuracies': [...], 'train_accuracies': [...]} where the metric values are lists that were filled during training.

Return type:

dict
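For a quick look at these metric lists, a minimal plotting sketch; it assumes matplotlib is installed and reuses the output dictionary from the run example above.

import matplotlib.pyplot as plt

# Each list holds one value per epoch, as described in Returns.
plt.plot(output['train_losses'], label='train')
plt.plot(output['valid_losses'], label='valid')
plt.plot(output['test_losses'], label='test')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.show()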

write_output(output, run_folder_name, file_name)

Writes the JSON output.

Parameters:
  • output (dict) -- Output of the training loop of the runner.
  • run_folder_name (str) -- The name of the output folder.
  • file_name (str) -- The file name where the output is written to.
static write_per_epoch_summary(sess, loss_, acc_, current_step, per_epoch_summaries, summary_writer, phase)

Writes the TensorBoard epoch summary.

static write_per_iter_summary(sess, per_iter_summaries, summary_writer, current_step)

Writes the TensorBoard iteration summary.