scalarstop.model

Wrappers that specify models trained on specific DataBlob instances.

Creating and training models

The purpose of a Model subclass instance, such as KerasModel, is to join together a DataBlob instance and a ModelTemplate instance into a trained model.

It also manages saving and loading models to/from the filesystem and saves hyperparameters and training metrics to the TrainStore.

The ScalarStop Tutorial demonstrates how to use ScalarStop when training real models on real data. Below is a brief sketch of how to load, save, and train models.

First, we subclass DataBlob and create an instance. This is where we store our training, validation, and test sets.

>>> import tensorflow as tf
>>> import scalarstop as sp
>>>
>>> class MyDataBlob(sp.DataBlob):
...
...     @sp.dataclass
...     class Hyperparams(sp.HyperparamsType):
...         cols: int
...
...     def _data(self):
...         x = tf.random.uniform(shape=(10, self.hyperparams.cols))
...         y = tf.round(tf.random.uniform(shape=(10, 1)))
...         return tf.data.Dataset.zip((
...             tf.data.Dataset.from_tensor_slices(x),
...             tf.data.Dataset.from_tensor_slices(y),
...         ))
...
...     def set_training(self):
...         return self._data()
...
...     def set_validation(self):
...         return self._data()
...
...     def set_test(self):
...         return self._data()

When we create an instance of our DataBlob subclass, we should batch it if we plan to train a model with it.

>>> datablob = MyDataBlob(hyperparams=dict(cols=3)).batch(2)

Then, we define the architecture of the model we want to train by subclassing ModelTemplate and creating an instance.

>>> class MyModelTemplate(sp.ModelTemplate):
...
...     @sp.dataclass
...     class Hyperparams(sp.HyperparamsType):
...         hidden_units: int
...         optimizer: str = "adam"
...
...     def new_model(self):
...         model = tf.keras.Sequential(
...             layers=[
...                 tf.keras.layers.Dense(
...                     units=self.hyperparams.hidden_units,
...                     activation="relu",
...                 ),
...                 tf.keras.layers.Dense(
...                     units=1,
...                     activation="sigmoid",
...                 ),
...             ],
...             name=self.name,
...         )
...         model.compile(
...             optimizer=self.hyperparams.optimizer,
...             loss="binary_crossentropy",
...             metrics=["accuracy"],
...         )
...         return model
>>> model_template = MyModelTemplate(hyperparams=dict(hidden_units=5))

Now we create a KerasModel instance that bridges together our DataBlob and ModelTemplate instances.

We’ll also pass a directory to models_directory. If we have a model saved in a subdirectory of models_directory, we’ll load that model instead of starting from scratch.

>>> import os
>>> import tempfile
>>> tempdir = tempfile.TemporaryDirectory()
>>>
>>> model = sp.KerasModel.from_filesystem_or_new(
...    datablob=datablob,
...    model_template=model_template,
...    models_directory=tempdir.name,
... )

Then you can call KerasModel.fit() to fit your new model using your DataBlob's training and validation sets. We pass models_directory here again, this time to save our model in a subdirectory.

>>> history = model.fit(final_epoch=3, verbose=0, models_directory=tempdir.name)

You can call KerasModel.evaluate() to evaluate your model against your DataBlob's test set, or against another tf.data.Dataset of your choosing.

>>> test_set_metrics = model.evaluate(verbose=0)

(And now we clean up the temporary directory from our example.)

>>> tempdir.cleanup()

Using the TrainStore

If you pass a TrainStore to KerasModel.fit(), then the metrics generated while training will be saved to the TrainStore's database, along with the DataBlob and ModelTemplate names and hyperparameters.

Module Contents

Classes

Model

Abstract parent class for all ScalarStop models.

KerasModel

Trains tf.keras machine learning models generated by a ModelTemplate on the training and validation sets in a DataBlob.

Functions

latest_epoch_on_filesystem(model_path: str) → int

Returns the latest saved epoch number on the filesystem.

latest_epoch_on_filesystem(model_path: str) int

Returns the latest saved epoch number on the filesystem.

Raises

ModelNotFoundError – Raised when we cannot find the model. If you intend to override from_filesystem(), make sure to raise this exception when you cannot find the model.
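As a rough illustration of what such a lookup could involve, the sketch below scans a model directory for saved epochs. It assumes a hypothetical layout in which every epoch is saved in a subdirectory named after its epoch number; ScalarStop's actual on-disk layout may differ. The function find_latest_epoch and the local ModelNotFoundError class are stand-ins written for this example, not ScalarStop's implementation.

```python
import os
import tempfile


class ModelNotFoundError(Exception):
    """Stand-in for ScalarStop's exception of the same name."""


def find_latest_epoch(model_path: str) -> int:
    """Return the largest epoch number saved under model_path.

    Assumes (hypothetically) one subdirectory per epoch, named by its number.
    """
    try:
        entries = os.listdir(model_path)
    except (FileNotFoundError, NotADirectoryError) as exc:
        # Translate filesystem errors into the library-level exception.
        raise ModelNotFoundError(model_path) from exc
    epochs = [
        int(entry)
        for entry in entries
        if entry.isdecimal() and os.path.isdir(os.path.join(model_path, entry))
    ]
    if not epochs:
        raise ModelNotFoundError(model_path)
    return max(epochs)


with tempfile.TemporaryDirectory() as model_path:
    for epoch in (1, 2, 10):
        os.mkdir(os.path.join(model_path, str(epoch)))
    # Epoch names are compared as integers, so 10 ranks above 2.
    latest = find_latest_epoch(model_path)
```

Converting the directory names to integers before comparing them avoids the lexicographic ordering in which "10" would sort before "2".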

class Model(*, datablob: Union[scalarstop.datablob.DataBlob, scalarstop.datablob.DistributedDataBlob], model_template: scalarstop.model_template.ModelTemplate, model: Optional[Any] = None)

Abstract parent class for all ScalarStop models.

classmethod from_filesystem(cls, *, datablob: Union[scalarstop.datablob.DataBlob, scalarstop.datablob.DistributedDataBlob], model_template: scalarstop.model_template.ModelTemplate, models_directory: str, epoch_num: Optional[int] = None) Model

Load an already-trained model from the filesystem.

Parameters
  • datablob – The DataBlob or DistributedDataBlob used to train the model that we are looking for.

  • model_template – The ModelTemplate used to create the model that we are looking for.

  • models_directory – The directory where you store all of your pretrained models. This is the parent directory of a single pretrained model.

  • epoch_num – The saved epoch number to load. By default, we load the latest epoch.

Returns

A Model with weights and configuration from the filesystem.

Raises

ModelNotFoundError – Raised when we cannot find the model. If you intend to override from_filesystem(), make sure to raise this exception when you cannot find the model.

classmethod from_filesystem_or_new(cls, *, datablob: Union[scalarstop.datablob.DataBlob, scalarstop.datablob.DistributedDataBlob], model_template: scalarstop.model_template.ModelTemplate, models_directory: str, epoch_num: Optional[int] = None) Model

Load a saved model from the filesystem. If we can’t find one, create a new one with the supplied ModelTemplate.

Parameters
  • datablob – The DataBlob or DistributedDataBlob that we will use to train the model.

  • model_template – The ModelTemplate that we will use to create the model.

  • models_directory – The directory where you store all of your pretrained models. This is the parent directory of a single pretrained model.

  • epoch_num – The saved epoch number to load. By default, we load the latest epoch.

Returns

A Model instance.
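The load-or-create behavior described above can be sketched with a toy class. This is only an illustration of the control flow, under the assumption that from_filesystem() signals a missing model with ModelNotFoundError as documented; ToyModel, its name argument, and its in-memory _saved dictionary (standing in for the filesystem) are hypothetical and not part of ScalarStop.

```python
class ModelNotFoundError(Exception):
    """Stand-in for ScalarStop's exception of the same name."""


class ToyModel:
    """Hypothetical model class; _saved plays the role of the filesystem."""

    _saved = {}  # maps (models_directory, name) -> number of epochs trained

    def __init__(self, name, current_epoch=0):
        self.name = name
        self.current_epoch = current_epoch

    @classmethod
    def from_filesystem(cls, *, name, models_directory):
        try:
            epoch = cls._saved[(models_directory, name)]
        except KeyError as exc:
            raise ModelNotFoundError(name) from exc
        return cls(name, current_epoch=epoch)

    @classmethod
    def from_filesystem_or_new(cls, *, name, models_directory):
        try:
            return cls.from_filesystem(name=name, models_directory=models_directory)
        except ModelNotFoundError:
            # Nothing saved yet, so start from an untrained model.
            return cls(name)


fresh = ToyModel.from_filesystem_or_new(name="m", models_directory="models")
ToyModel._saved[("models", "m")] = 3  # pretend three epochs were saved
resumed = ToyModel.from_filesystem_or_new(name="m", models_directory="models")
```

Here fresh starts at epoch 0 because nothing was saved, while resumed picks up the saved epoch count.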

static calculate_name(model_template_name: str, datablob_name: str) str

Create a model name from a ModelTemplate name and a DataBlob name.

property name(self) str

This model’s name.

If you intend on overriding this method, you should make sure that two Models trained on the same DataBlob and ModelTemplate have the same name.

property datablob(self) Union[scalarstop.datablob.DataBlob, scalarstop.datablob.DistributedDataBlob]

Returns the DataBlob or the DistributedDataBlob used to create this model.

property model_template(self) scalarstop.model_template.ModelTemplate

Returns the ModelTemplate used to create this model.

property model(self) Any

The model object from the underlying machine learning framework.

static load(model_path: str, epoch_num: Optional[int] = None) Any

Loads a model.

Parameters
  • model_path – The filesystem directory for this specific model. (e.g. models_directory/model_name)

  • epoch_num – The saved epoch number to load. By default, we load the latest epoch.

Raises

ModelNotFoundError – Raised when a saved copy of the model cannot be found at the given directory. If you are overriding this method, you should make sure to catch any exceptions your code generates, such as FileNotFoundError, and re-raise them as ModelNotFoundError.
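If you override load(), the catch-and-re-raise advice above might look like the following sketch. The loader load_json_model and its model.json file are hypothetical, chosen only so the example runs with the standard library, and ModelNotFoundError is a local stand-in for ScalarStop's exception.

```python
import json
import os
import tempfile


class ModelNotFoundError(Exception):
    """Stand-in for ScalarStop's exception of the same name."""


def load_json_model(model_path: str) -> dict:
    """Hypothetical loader that reads model.json from model_path."""
    try:
        with open(os.path.join(model_path, "model.json"), encoding="utf-8") as handle:
            return json.load(handle)
    except (FileNotFoundError, NotADirectoryError) as exc:
        # Re-raise as the library-level exception, keeping the original cause.
        raise ModelNotFoundError(model_path) from exc


with tempfile.TemporaryDirectory() as empty_dir:
    try:
        load_json_model(empty_dir)
    except ModelNotFoundError as error:
        original_cause = error.__cause__
```

Using raise ... from exc keeps the original FileNotFoundError available as __cause__, which helps when debugging a failed load.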

property history(self) Mapping[str, Sequence[float]]

Returns the per-epoch history for training and validation metrics.

property current_epoch(self) int

Returns how many epochs the current model has been trained.

save(self, models_directory: str) None

Saves a model to the given directory.

fit(self, *, final_epoch: int, **kwargs) Mapping[str, Sequence[float]]

Fits the given model to the given DataBlob.

predict(self, dataset: tf.data.Dataset)

Runs predictions with the dataset on the model.

evaluate(self, dataset: Optional[tf.data.Dataset] = None) Sequence[float]

Evaluate the model on a dataset.

class KerasModel(*, datablob: Union[scalarstop.datablob.DataBlob, scalarstop.datablob.DistributedDataBlob], model_template: scalarstop.model_template.ModelTemplate, model: Optional[Any] = None, history: Optional[_KERAS_HISTORY_TYPE] = None)

Bases: Model

Trains tf.keras machine learning models generated by a ModelTemplate on the training and validation sets in a DataBlob.

classmethod from_filesystem(cls, *, datablob: Union[scalarstop.datablob.DataBlob, scalarstop.datablob.DistributedDataBlob], model_template: scalarstop.model_template.ModelTemplate, models_directory: str, epoch_num: Optional[int] = None) KerasModel

Load an already-trained model from the filesystem.

Parameters
  • datablob – The DataBlob or DistributedDataBlob used to train the model that we are looking for.

  • model_template – The ModelTemplate used to create the model that we are looking for.

  • models_directory – The directory where you store all of your pretrained models. This is the parent directory of a single pretrained model.

  • epoch_num – The saved epoch number to load. By default, we load the latest epoch.

Returns

A Model with weights and configuration from the filesystem.

Raises

ModelNotFoundError – Raised when we cannot find the model. If you intend to override from_filesystem(), make sure to raise this exception when you cannot find the model.

property history(self) Mapping[str, Sequence[float]]

Returns the history for the Keras model.
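A history mapping of type Mapping[str, Sequence[float]] can be inspected with plain Python. The sketch below assumes the keys follow the usual Keras naming ("loss" for training loss and "val_loss" for validation loss); the helper best_epoch is hypothetical and the numbers are made up purely for illustration.

```python
from typing import Mapping, Sequence


def best_epoch(history: Mapping[str, Sequence[float]], metric: str = "val_loss") -> int:
    """Return the 1-based epoch at which metric was lowest."""
    values = history[metric]
    if not values:
        raise ValueError(f"no recorded values for {metric!r}")
    best_index = min(range(len(values)), key=lambda i: values[i])
    return best_index + 1


# Made-up values for illustration only; one entry per trained epoch.
example_history = {
    "loss": [0.71, 0.65, 0.60],
    "val_loss": [0.70, 0.66, 0.68],
}

epochs_recorded = len(example_history["loss"])
lowest_val_loss_epoch = best_epoch(example_history)
```

Each sequence holds one value per epoch, so the length of any metric's sequence is the number of epochs recorded.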

property current_epoch(self) int

Returns how many epochs the current model has been trained.

save(self, models_directory: str) None

Saves a model to the given directory.

fit(self, *, final_epoch: int, verbose: Optional[int] = None, models_directory: Optional[str] = None, log_batches: bool = False, log_epochs: bool = False, logger: Optional[Any] = None, train_store: Optional[scalarstop.train_store.TrainStore] = None, tensorboard_directory: Optional[str] = None, profile_batch: Union[int, Tuple[int, int]] = 0, steps_per_epoch: Optional[int] = None, validation_steps_per_epoch: Optional[int] = None, callbacks: Optional[Sequence[tf.keras.callbacks.Callback]] = None, **kwargs) Mapping[str, Sequence[float]]

Fit the Keras model to the DataBlob that this model was created for.

Parameters
  • final_epoch – The epoch number to train to. If the model has already been trained for final_epoch or more epochs, then this function will do nothing. This helps make training a machine learning model into an idempotent operation.

  • verbose – The verbosity level to use.

  • models_directory – The directory in which to save this machine learning model after every epoch.

  • log_batches – Emit a Python logging message as an INFO level log at the end of every single training batch.

  • log_epochs – Emit a Python logging message as an INFO level log at the end of every single training epoch.

  • logger – A custom Python logger to log epochs with, to be used if log_batches and/or log_epochs are True.

  • train_store – A TrainStore instance, which is a client that persists metadata about DataBlobs, ModelTemplates, and Models.

  • tensorboard_directory – A directory on the filesystem to write TensorBoard data.

  • profile_batch – A batch number or a tuple of batch numbers to profile. This is only valid when a valid filesystem path is given as tensorboard_directory.

  • steps_per_epoch – The total number of steps (batches of samples) before declaring one training epoch as “finished” and starting the next epoch. When steps_per_epoch is None, the epoch will run until the input DataBlob.training is exhausted. When passing an infinitely repeating dataset, you must specify steps_per_epoch. If steps_per_epoch = -1, the training will run indefinitely with an infinitely repeating dataset.

  • validation_steps_per_epoch – The total number of steps (batches of samples) before declaring one validation epoch as “finished” and starting the next epoch. If validation_steps_per_epoch is specified and only part of DataBlob.validation is consumed, the evaluation of DataBlob.validation will start from the beginning of DataBlob.validation at every epoch. This ensures that the same validation samples are used every time.

  • callbacks – A list of Keras callbacks to use while training.
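The idempotent behavior of final_epoch can be sketched with a toy trainer. This is a minimal illustration under the assumption that fit() trains only the epochs between current_epoch and final_epoch; ToyTrainer and its placeholder loss are hypothetical and do not reflect how KerasModel is implemented.

```python
class ToyTrainer:
    """Hypothetical trainer that tracks how many epochs it has run."""

    def __init__(self):
        self.current_epoch = 0
        self.history = {"loss": []}

    def _train_one_epoch(self) -> float:
        # Placeholder for real training: a "loss" that shrinks every epoch.
        self.current_epoch += 1
        return 1.0 / self.current_epoch

    def fit(self, *, final_epoch: int):
        # Train only the missing epochs; do nothing if already at final_epoch.
        while self.current_epoch < final_epoch:
            self.history["loss"].append(self._train_one_epoch())
        return self.history


trainer = ToyTrainer()
trainer.fit(final_epoch=3)  # trains epochs 1 through 3
trainer.fit(final_epoch=3)  # no-op: already trained for 3 epochs
trainer.fit(final_epoch=2)  # no-op: already past epoch 2
trainer.fit(final_epoch=5)  # trains only epochs 4 and 5
```

Because the target is an absolute epoch number rather than a count of additional epochs, rerunning the same call after an interruption does not train the model further than intended.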

predict(self, dataset: tf.data.Dataset, verbose: Optional[int] = None, callbacks: Optional[Sequence[tf.keras.callbacks.Callback]] = None) numpy.ndarray

Use the model to generate predictions on this dataset.

Parameters
  • dataset – An input dataset to predict on. This accepts any type that tf.keras.Model can generate predictions for.

  • verbose – Verbosity level for predictions.

  • callbacks – A list of Keras callbacks to use while making predictions.

evaluate(self, dataset: Optional[tf.data.Dataset] = None, verbose: Optional[int] = None, callbacks: Optional[Sequence[tf.keras.callbacks.Callback]] = None) Sequence[float]

Evaluate this model on the DataBlob’s test set.

Optionally, you can provide another tf.data.Dataset via the dataset parameter.

Parameters
  • dataset – Another tf.data.Dataset to evaluate instead of the test set of the provided DataBlob.

  • verbose – Specify verbosity for evaluating this model.

  • callbacks – A list of Keras callbacks to use when evaluating the model.

classmethod from_filesystem_or_new(cls, *, datablob: Union[scalarstop.datablob.DataBlob, scalarstop.datablob.DistributedDataBlob], model_template: scalarstop.model_template.ModelTemplate, models_directory: str, epoch_num: Optional[int] = None) Model

Load a saved model from the filesystem. If we can’t find one, create a new one with the supplied ModelTemplate.

Parameters
  • datablob – The DataBlob or DistributedDataBlob that we will use to train the model.

  • model_template – The ModelTemplate that we will use to create the model.

  • models_directory – The directory where you store all of your pretrained models. This is the parent directory of a single pretrained model.

  • epoch_num – The saved epoch number to load. By default, we load the latest epoch.

Returns

A Model instance.

static calculate_name(model_template_name: str, datablob_name: str) str

Create a model name from a ModelTemplate name and a DataBlob name.

property name(self) str

This model’s name.

If you intend on overriding this method, you should make sure that two Models trained on the same DataBlob and ModelTemplate have the same name.

property datablob(self) Union[scalarstop.datablob.DataBlob, scalarstop.datablob.DistributedDataBlob]

Returns the DataBlob or the DistributedDataBlob used to create this model.

property model_template(self) scalarstop.model_template.ModelTemplate

Returns the ModelTemplate used to create this model.

property model(self) Any

The model object from the underlying machine learning framework.

static load(model_path: str, epoch_num: Optional[int] = None) Any

Loads a model.

Parameters
  • model_path – The filesystem directory for this specific model. (e.g. models_directory/model_name)

  • epoch_num – The saved epoch number to load. By default, we load the latest epoch.

Raises

ModelNotFoundError – Raised when a saved copy of the model cannot be found at the given directory. If you are overriding this method, you should make sure to catch any exceptions your code generates, such as FileNotFoundError, and re-raise them as ModelNotFoundError.