scalarstop._tfdata

Internal code for saving and loading tf.data.Dataset pipelines.

Module Contents

Functions

make_num_shards_on_save(num_shards: int) → Callable

Generates a sharding function for tf.data.experimental.save().

tfdata_load(path: str, save_load_version, total_num_shards: int = 1, element_spec=None, shard_offset: Optional[int] = None, shard_quantity: int = 1) → tensorflow.data.Dataset

Load a tf.data.Dataset from a filesystem path.

tfdata_save(dataset: tensorflow.data.Dataset, path: str, num_shards: int, save_load_version: int)

Save a tf.data dataset.

make_num_shards_on_save(num_shards: int) → Callable

Generates a sharding function for tf.data.experimental.save().

Parameters

num_shards – The number of distinct files to save to the filesystem.

Returns

Returns a function that accepts an enumerated tf.data.Dataset and returns the enumerated index modulo num_shards.
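The modulo rule described above can be sketched in plain Python, without TensorFlow. This is an illustration only, not the module's implementation: `make_modulo_shard_func` and `assign_shards` are hypothetical names, and the `(index, element)` call signature is an assumption based on how enumerated datasets yield pairs.

```python
from typing import Any, Callable, Dict, List


def make_modulo_shard_func(num_shards: int) -> Callable[[int, Any], int]:
    """Return a function mapping (index, element) to a shard id.

    Hypothetical stand-in that mirrors the documented behavior:
    the enumerated index modulo ``num_shards``.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")

    def shard_func(index: int, element: Any) -> int:
        # The element is ignored; only its enumerated position matters.
        return index % num_shards

    return shard_func


def assign_shards(elements: List[Any], num_shards: int) -> Dict[int, List[Any]]:
    """Group elements by the shard id that the modulo rule assigns them."""
    shard_func = make_modulo_shard_func(num_shards)
    shards: Dict[int, List[Any]] = {shard_id: [] for shard_id in range(num_shards)}
    for index, element in enumerate(elements):
        shards[shard_func(index, element)].append(element)
    return shards


# Five elements spread round-robin across two shards.
shards = assign_shards(["a", "b", "c", "d", "e"], num_shards=2)
```

Because consecutive indices land in consecutive shards, the elements are distributed round-robin, so shard sizes differ by at most one element.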

tfdata_load(path: str, save_load_version, total_num_shards: int = 1, element_spec=None, shard_offset: Optional[int] = None, shard_quantity: int = 1) → tf.data.Dataset

Load a tf.data.Dataset from a filesystem path.

This differs from tf.data.experimental.load() in that the element_spec is saved in a pickled file above the tf.data.Dataset’s directory.

To read a dataset that doesn’t have the element_spec saved on disk, pass your own value as the element_spec keyword argument.
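The lookup order described above (an explicit element_spec argument first, otherwise the pickled file above the dataset directory) might look roughly like the following sketch. It uses only the standard library. The file name `element_spec.pickle`, the helper names, and the plain dictionary standing in for a real element spec are all assumptions made for illustration, not details taken from this module.

```python
import os
import pickle
import tempfile
from typing import Any, Optional

# Hypothetical file name; the real name used on disk is not documented here.
ELEMENT_SPEC_FILENAME = "element_spec.pickle"


def element_spec_path(dataset_path: str) -> str:
    """Return the assumed pickle location: the dataset directory's parent."""
    parent = os.path.dirname(os.path.abspath(dataset_path))
    return os.path.join(parent, ELEMENT_SPEC_FILENAME)


def resolve_element_spec(dataset_path: str, element_spec: Optional[Any] = None) -> Any:
    """Prefer a caller-supplied spec; otherwise read the pickled one."""
    if element_spec is not None:
        return element_spec
    with open(element_spec_path(dataset_path), "rb") as fh:
        return pickle.load(fh)


with tempfile.TemporaryDirectory() as root:
    dataset_dir = os.path.join(root, "dataset")
    os.makedirs(dataset_dir)

    # A plain dict stands in for a real element spec in this sketch.
    saved_spec = {"features": ("float32", (None, 4))}
    with open(element_spec_path(dataset_dir), "wb") as fh:
        pickle.dump(saved_spec, fh)

    # No argument given: the spec is read from the pickled file.
    from_disk = resolve_element_spec(dataset_dir)
    # Argument given: the pickled file is never opened.
    overridden = resolve_element_spec(
        dataset_dir, element_spec={"features": ("int64", ())}
    )
```

Checking the keyword argument before touching the filesystem is what allows a dataset with no saved spec to be read at all, since the pickle is never opened in that case.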

tfdata_save(dataset: tf.data.Dataset, path: str, num_shards: int, save_load_version: int)

Save a tf.data dataset.