TF Runner

The base class for all TensorFlow runners.

class deepobs.tensorflow.runners.TFRunner(optimizer_class, hyperparameter_names)[source]

Bases: deepobs.abstract_runner.abstract_runner.Runner

__init__(optimizer_class, hyperparameter_names)[source]

Creates a new Runner instance.

Parameters:
  • optimizer_class -- The optimizer class of the optimizer that is run on the testproblems. For PyTorch this must be a subclass of torch.optim.Optimizer. For TensorFlow a subclass of tf.train.Optimizer.
  • hyperparameter_names -- A nested dictionary that lists all hyperparameters of the optimizer, their type and their default values (if they have any).
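
The nested dictionary described above can be read as a small specification: each hyperparameter has a type and, optionally, a default. The following is a minimal sketch of how such a specification might be combined with user-supplied values. The function name resolve_hyperparams is hypothetical and not part of DeepOBS; it only illustrates the data structure.

```python
def resolve_hyperparams(hyperparameter_names, user_values):
    """Combine a hyperparameter spec with user-supplied values.

    Hypothetical helper (not a DeepOBS function): casts every supplied value
    to the declared type, falls back to the declared default, and raises if a
    hyperparameter without a default is missing.
    """
    unknown = set(user_values) - set(hyperparameter_names)
    if unknown:
        raise ValueError("unknown hyperparameters: " + ", ".join(sorted(unknown)))

    resolved = {}
    for name, spec in hyperparameter_names.items():
        if name in user_values:
            # Cast to the declared type, e.g. the string '0.1' becomes 0.1.
            resolved[name] = spec["type"](user_values[name])
        elif "default" in spec:
            resolved[name] = spec["default"]
        else:
            raise ValueError("missing required hyperparameter: " + name)
    return resolved


spec = {
    "learning_rate": {"type": float},
    "momentum": {"type": float, "default": 0.99},
    "use_nesterov": {"type": bool, "default": False},
}
resolved = resolve_hyperparams(spec, {"learning_rate": "0.1"})
```

Here learning_rate has no default, so it must be supplied, while momentum and use_nesterov fall back to their defaults.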

Example

>>> import tensorflow as tf
>>> from deepobs.tensorflow.runners import StandardRunner
>>> optimizer_class = tf.train.MomentumOptimizer
>>> hyperparms = {'learning_rate': {'type': float},
...    'momentum': {'type': float, 'default': 0.99},
...    'use_nesterov': {'type': bool, 'default': False}}
>>> runner = StandardRunner(optimizer_class, hyperparms)

static create_testproblem(testproblem, batch_size, weight_decay, random_seed)[source]

Sets up the deepobs.tensorflow.testproblems.testproblem instance.

Parameters:
  • testproblem (str) -- The name of the testproblem.
  • batch_size (int) -- Batch size that is used for training
  • weight_decay (float) -- Regularization factor
  • random_seed (int) -- The random seed of the framework
Returns:

An instance of deepobs.tensorflow.testproblems.testproblem

Return type:

deepobs.tensorflow.testproblems.testproblem

static evaluate(tproblem, sess, loss, phase)[source]

Computes average loss and accuracy in the evaluation phase.

Parameters:
  • tproblem (deepobs.tensorflow.testproblems.testproblem) -- The testproblem instance.
  • sess (tensorflow.Session) -- The current TensorFlow Session.
  • loss -- The TensorFlow operation that computes the loss.
  • phase (str) -- The phase of the evaluation. Must be one of 'TRAIN', 'VALID' or 'TEST'.
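
As a rough illustration of what averaging over an evaluation phase means, the sketch below averages per-batch loss and accuracy values. It is framework-free on purpose: the function name evaluate_sketch and the batch_results argument are assumptions that stand in for repeated session runs, and this is not the DeepOBS implementation.

```python
VALID_PHASES = ("TRAIN", "VALID", "TEST")


def evaluate_sketch(batch_results, phase):
    """Average (loss, accuracy) pairs collected over one evaluation pass.

    `batch_results` is a hypothetical iterable with one (loss, accuracy)
    tuple per mini-batch of the chosen phase.
    """
    if phase not in VALID_PHASES:
        raise ValueError("phase must be one of " + ", ".join(VALID_PHASES))

    total_loss = 0.0
    total_acc = 0.0
    num_batches = 0
    for loss, acc in batch_results:
        total_loss += loss
        total_acc += acc
        num_batches += 1

    if num_batches == 0:
        raise ValueError("no batches were evaluated")
    return total_loss / num_batches, total_acc / num_batches


mean_loss, mean_acc = evaluate_sketch([(1.0, 0.5), (3.0, 1.0)], "TEST")
```

Note that a plain mean over batches weights every batch equally, which is only exact when all batches have the same size.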

static init_summary(loss, learning_rate_var, batch_size, tb_log_dir)[source]

Initializes the TensorBoard summaries.

parse_args(testproblem, hyperparams, batch_size, num_epochs, random_seed, data_dir, output_dir, weight_decay, no_logs, train_log_interval, print_train_iter, tb_log, tb_log_dir, training_params)

Constructs an argparse.ArgumentParser and parses the arguments from command line.

Parameters:
  • testproblem (str) -- Name of the testproblem.
  • hyperparams (dict) -- The explicit values of the optimizer's hyperparameters that are used for training.
  • batch_size (int) -- Mini-batch size for the training data.
  • num_epochs (int) -- The number of training epochs.
  • random_seed (int) -- The random seed of the framework.
  • data_dir (str) -- The path where the data is stored.
  • output_dir (str) -- Path of the folder where the results are written to.
  • weight_decay (float) -- Regularization factor for the testproblem.
  • no_logs (bool) -- Whether to suppress writing the output.
  • train_log_interval (int) -- Mini-batch interval for logging.
  • print_train_iter (bool) -- Whether to print the training progress at each train_log_interval.
  • tb_log (bool) -- Whether to use tensorboard logging or not
  • tb_log_dir (str) -- The path where to save tensorboard events.
  • training_params (dict) -- Kwargs for the training method.
Returns:

A dictionary of all arguments.

Return type:

dict
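
To make the idea of building a command-line parser from the hyperparameter specification concrete, here is a minimal sketch using the standard argparse module. The function build_parser, the selection of flags, and the boolean converter are assumptions made for illustration; the actual set of command-line options in DeepOBS may differ.

```python
import argparse


def _to_bool(text):
    """Parse a boolean flag value; plain bool('False') would be True."""
    lowered = text.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got " + repr(text))


def build_parser(hyperparameter_names):
    """Build a parser with one option per hyperparameter (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Run an optimizer on a testproblem.")
    parser.add_argument("testproblem", type=str, help="Name of the testproblem.")
    parser.add_argument("--batch_size", type=int, help="Mini-batch size.")
    parser.add_argument("--num_epochs", type=int, help="Number of training epochs.")

    for name, spec in hyperparameter_names.items():
        converter = _to_bool if spec["type"] is bool else spec["type"]
        if "default" in spec:
            parser.add_argument("--" + name, type=converter, default=spec["default"])
        else:
            # Hyperparameters without a default must be given explicitly.
            parser.add_argument("--" + name, type=converter, required=True)
    return parser


spec = {
    "learning_rate": {"type": float},
    "momentum": {"type": float, "default": 0.99},
    "use_nesterov": {"type": bool, "default": False},
}
parser = build_parser(spec)
args = vars(parser.parse_args(
    ["quadratic_deep", "--learning_rate", "0.01", "--batch_size", "128"]
))
```

Calling vars on the parsed namespace yields a plain dictionary of all arguments, which matches the documented return type.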

run(testproblem=None, hyperparams=None, batch_size=None, num_epochs=None, random_seed=None, data_dir=None, output_dir=None, weight_decay=None, no_logs=None, train_log_interval=None, print_train_iter=None, tb_log=None, tb_log_dir=None, skip_if_exists=False, **training_params)

Runs a testproblem with the optimizer_class. It performs the following tasks:
  1. setup testproblem
  2. run the training (must be implemented by subclass)
  3. merge and write output
Parameters:
  • testproblem (str) -- Name of the testproblem.
  • hyperparams (dict) -- The explicit values of the optimizer's hyperparameters that are used for training.
  • batch_size (int) -- Mini-batch size for the training data.
  • num_epochs (int) -- The number of training epochs.
  • random_seed (int) -- The random seed of the framework.
  • data_dir (str) -- The path where the data is stored.
  • output_dir (str) -- Path of the folder where the results are written to.
  • weight_decay (float) -- Regularization factor for the testproblem.
  • no_logs (bool) -- Whether to suppress writing the output.
  • train_log_interval (int) -- Mini-batch interval for logging.
  • print_train_iter (bool) -- Whether to print the training progress at each train_log_interval.
  • tb_log (bool) -- Whether to use tensorboard logging or not
  • tb_log_dir (str) -- The path where to save tensorboard events.
  • skip_if_exists (bool) -- Skip training if the output already exists.
  • training_params (dict) -- Kwargs for the training method.
Returns:

{<...meta data...>, 'test_losses': test_losses, 'valid_losses': valid_losses, 'train_losses': train_losses, 'test_accuracies': test_accuracies, 'valid_accuracies': valid_accuracies, 'train_accuracies': train_accuracies}, where <...meta data...> stores the run args.

Return type:

dict
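
The merge step (task 3) can be pictured as combining the run arguments with the metrics returned by the training method. The sketch below shows one plausible way to do this; merge_output is a hypothetical name, and the metadata keys in the example are chosen only for illustration.

```python
def merge_output(run_args, metrics):
    """Merge run metadata and training metrics into one output dict.

    Hypothetical helper: refuses to merge when a key appears in both inputs,
    so that a metric can never silently overwrite a run argument.
    """
    overlap = set(run_args) & set(metrics)
    if overlap:
        raise ValueError("keys present in both inputs: " + ", ".join(sorted(overlap)))
    output = dict(run_args)  # copy, so the caller's dict is left untouched
    output.update(metrics)
    return output


run_args = {"testproblem": "quadratic_deep", "batch_size": 128, "num_epochs": 1}
metrics = {
    "test_losses": [2.0, 1.5],
    "valid_losses": [2.1, 1.6],
    "train_losses": [2.2, 1.4],
    "test_accuracies": [0.1, 0.4],
    "valid_accuracies": [0.1, 0.4],
    "train_accuracies": [0.1, 0.5],
}
output = merge_output(run_args, metrics)
```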

run_exists(testproblem=None, hyperparams=None, batch_size=None, num_epochs=None, random_seed=None, data_dir=None, output_dir=None, weight_decay=None, no_logs=None, train_log_interval=None, print_train_iter=None, tb_log=None, tb_log_dir=None, **training_params)

Returns whether the output file for this run already exists.

Parameters: See the run method.

Returns:

The first return value is True if the .json output file already exists, else False. The second is a list containing the paths to the files that match the run.

Return type:

bool, list(str)

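
A check of this kind could be sketched with the standard glob module, as below. The function name find_matching_runs and the idea that matching files share a common file-name prefix are assumptions for illustration; how DeepOBS actually names and locates its output files is not shown here.

```python
import glob
import os


def find_matching_runs(run_directory, file_prefix):
    """Return (exists, paths) for .json files whose name starts with file_prefix.

    Hypothetical helper: both inputs are escaped so that characters such as
    '[' or '*' in a folder or file name are matched literally.
    """
    pattern = os.path.join(
        glob.escape(run_directory), glob.escape(file_prefix) + "*.json"
    )
    matches = sorted(glob.glob(pattern))
    return len(matches) > 0, matches
```

A caller could use the boolean to skip a run and the list to report which files were found.
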
training(tproblem, hyperparams, num_epochs, print_train_iter, train_log_interval, tb_log, tb_log_dir, **training_params)[source]

Performs the training and stores the metrics.

Parameters:
  • tproblem (deepobs.[tensorflow/pytorch].testproblems.testproblem) -- The testproblem instance to train on.
  • hyperparams (dict) -- The optimizer hyperparameters to use for the training.
  • num_epochs (int) -- The number of training epochs.
  • print_train_iter (bool) -- Whether to print the training progress at every train_log_interval
  • train_log_interval (int) -- Mini-batch interval for logging.
  • tb_log (bool) -- Whether to use tensorboard logging or not
  • tb_log_dir (str) -- The path where to save tensorboard events.
  • **training_params (dict) -- Kwargs for additional training parameters that are implemented by the subclass.
Returns:

The logged metrics. It is of the form: {'test_losses': [...], 'valid_losses': [...], 'train_losses': [...], 'test_accuracies': [...], 'valid_accuracies': [...], 'train_accuracies': [...]}, where the metric values are lists that were filled during training.

Return type:

dict
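
The shape of the returned dictionary can be illustrated with a framework-free skeleton of a training loop. Everything here is an assumption for illustration: training_sketch, the two callbacks, and in particular the choice to evaluate once before training and once after every epoch. A real subclass would run TensorFlow operations instead.

```python
def training_sketch(num_epochs, train_one_epoch, evaluate):
    """Fill the six metric lists while training for num_epochs epochs.

    `train_one_epoch()` and `evaluate(phase)` are hypothetical callbacks;
    `evaluate` must return a (loss, accuracy) tuple for the given phase.
    """
    phases = (("TRAIN", "train"), ("VALID", "valid"), ("TEST", "test"))
    metrics = {}
    for _, prefix in phases:
        metrics[prefix + "_losses"] = []
        metrics[prefix + "_accuracies"] = []

    for epoch in range(num_epochs + 1):
        # Assumed schedule: evaluate at the start and after each epoch.
        for phase, prefix in phases:
            loss, accuracy = evaluate(phase)
            metrics[prefix + "_losses"].append(loss)
            metrics[prefix + "_accuracies"].append(accuracy)
        if epoch < num_epochs:
            train_one_epoch()
    return metrics


calls = {"train": 0}


def fake_train_one_epoch():
    calls["train"] += 1


def fake_evaluate(phase):
    # Pretend the loss halves with every completed epoch.
    return 1.0 / (2 ** calls["train"]), 0.5


metrics = training_sketch(2, fake_train_one_epoch, fake_evaluate)
```

Under this assumed schedule, each list ends up with num_epochs + 1 entries.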

static write_output(output, run_folder_name, file_name)

Writes the JSON output.

Parameters:
  • output (dict) -- Output of the training loop of the runner.
  • run_folder_name (str) -- The name of the output folder.
  • file_name (str) -- The file name where the output is written to.

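
Writing the output as JSON could look roughly like the sketch below, which uses only the standard json and os modules. The function name write_json_output, the automatic creation of the folder, and the appended .json extension are assumptions; the source does not say whether file_name already carries an extension.

```python
import json
import os


def write_json_output(output, run_folder_name, file_name):
    """Write `output` to <run_folder_name>/<file_name>.json and return the path.

    Hypothetical helper: creates the folder if needed and assumes that
    file_name is given without an extension.
    """
    os.makedirs(run_folder_name, exist_ok=True)
    path = os.path.join(run_folder_name, file_name + ".json")
    with open(path, "w") as handle:
        json.dump(output, handle, indent=4)
    return path
```

The values in the output dictionary must be JSON-serializable, so framework-specific number types may need converting to plain Python floats first.
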
static write_per_epoch_summary(sess, loss_, acc_, current_step, per_epoch_summaries, summary_writer, phase)[source]

Writes the TensorBoard epoch summary.

static write_per_iter_summary(sess, per_iter_summaries, summary_writer, current_step)[source]

Writes the TensorBoard iteration summary.