Tolstoi Data Set

class deepobs.tensorflow.datasets.tolstoi.tolstoi(batch_size, seq_length=50, train_eval_size=653237)[source]

DeepOBS data set class for character prediction on War and Peace by Leo Tolstoi.

Parameters:
  • batch_size (int) -- The mini-batch size to use. Note that if batch_size is not a divisor of the dataset size, the remainder is dropped in each epoch (after shuffling).
  • seq_length (int) -- Sequence length to be modeled in each step. Defaults to 50.
  • train_eval_size (int) -- Size of the train eval dataset. Defaults to 653237, the size of the test set.
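
The effect of dropping the remainder can be sketched with plain integer arithmetic. This is only an illustration of the rule stated for batch_size, not the library's code: the helper name batches_per_epoch and the example sizes (1000 items, batch size 64) are hypothetical.

```python
def batches_per_epoch(num_items, batch_size):
    """Return (full_batches, dropped_items) when the remainder is dropped.

    Hypothetical helper for illustration; num_items stands for however
    many items the data set yields before batching.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    full_batches = num_items // batch_size   # only complete batches are kept
    dropped_items = num_items % batch_size   # leftover items are discarded
    return full_batches, dropped_items


full, dropped = batches_per_epoch(num_items=1000, batch_size=64)
print(full, dropped)
```

Because shuffling happens before the remainder is dropped, a different set of items would be discarded in each epoch.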

batch

A tuple (x, y) of tensors yielding batches of Tolstoi data. x has shape (batch_size, seq_length); y has the same shape and is x shifted by one character. Executing these tensors raises a tf.errors.OutOfRangeError after one epoch.
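
The relation between x and y can be illustrated on a short string. The sketch below assumes the text is cut into non-overlapping windows of seq_length characters; the helper make_input_target_pairs is hypothetical and does not reproduce how the class builds its tensors.

```python
def make_input_target_pairs(token_ids, seq_length):
    """Split a sequence into (x, y) windows where y is x shifted by one.

    Hypothetical helper: assumes non-overlapping windows, which may
    differ from the actual data pipeline.
    """
    # One extra token is needed so the last window still has a target.
    num_windows = (len(token_ids) - 1) // seq_length
    pairs = []
    for i in range(num_windows):
        start = i * seq_length
        x = token_ids[start:start + seq_length]
        y = token_ids[start + 1:start + seq_length + 1]
        pairs.append((x, y))
    return pairs


text = "war and peace"
ids = [ord(c) for c in text]          # toy character encoding
pairs = make_input_target_pairs(ids, seq_length=4)

for x, y in pairs:
    print(repr("".join(map(chr, x))), "->", repr("".join(map(chr, y))))
```

At every position, the target y[t] is the character that follows x[t] in the text, which is what makes this a next-character prediction task.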

train_init_op

A TensorFlow operation initializing the dataset for the training phase.

train_eval_init_op

A TensorFlow operation initializing the dataset for evaluating on training data.

test_init_op

A TensorFlow operation initializing the dataset for evaluating on test data.
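
Taken together, the initialization operations and the end-of-epoch error suggest a loop of the following shape. This is a hedged sketch rather than the library's own training loop: count_batches is a hypothetical helper, and the session and error type are passed in as arguments so the function does not depend on any particular TensorFlow setup.

```python
def count_batches(sess, dataset, init_op, end_of_data_error):
    """Run one pass over a phase and count the batches it yields.

    Hypothetical helper. Assumed arguments:
      sess              -- an object with a run() method, e.g. a tf.Session
      dataset           -- an object exposing a `batch` attribute
      init_op           -- e.g. dataset.train_init_op or dataset.test_init_op
      end_of_data_error -- e.g. tf.errors.OutOfRangeError
    """
    sess.run(init_op)                 # select and (re)initialize the phase
    num_batches = 0
    while True:
        try:
            sess.run(dataset.batch)   # fetch the next (x, y) batch
        except end_of_data_error:     # raised once the epoch is exhausted
            break
        num_batches += 1
    return num_batches
```

With TensorFlow 1.x, a call might look like count_batches(sess, data, data.test_init_op, tf.errors.OutOfRangeError), where data is an instance of this class and sess is an open tf.Session.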

phase

A string-valued tf.Variable that is set to "train", "train_eval" or "test", depending on the current phase. Test problems can use it to adapt their behavior to the phase.