kd.data.IterableDataset

class kauldron.data.IterableDataset

Bases: abc.ABC

General interface for iterable datasets.

abstract property element_spec: etree.Tree[enp.ArraySpec]

NumPy version of the element spec.

map(map_fn: collections.abc.Callable[[...], Any]) → kauldron.data.data_utils.IterableDataset

prefetch(buffer_size: int = 1)

Pre-fetch the iterator (synchronously).
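
The source does not say how the buffering is implemented. As a minimal sketch of one plausible reading of "synchronous" prefetching, the generator below reads ahead in the calling thread, with no background worker. The name prefetch_sync and the deque-based buffer are assumptions made for illustration and are not part of kauldron.

```python
import collections
import itertools
from collections.abc import Iterable, Iterator
from typing import TypeVar

T = TypeVar("T")


def prefetch_sync(source: Iterable[T], buffer_size: int = 1) -> Iterator[T]:
    """Yield items from `source`, reading up to `buffer_size` items ahead.

    Hypothetical helper: everything runs in the caller's thread, so the
    read-ahead happens between successive `next()` calls.
    """
    it = iter(source)
    if buffer_size <= 0:
        # Nothing to buffer: pass items straight through.
        yield from it
        return

    # Fill the buffer eagerly before the first item is handed out.
    buffer = collections.deque(itertools.islice(it, buffer_size))
    while buffer:
        yield buffer.popleft()
        # Top the buffer back up with (at most) one more item.
        buffer.extend(itertools.islice(it, 1))


print(list(prefetch_sync(range(5), buffer_size=2)))  # prints [0, 1, 2, 3, 4]
```

The element order is unchanged; only the moment at which the underlying iterator is advanced moves earlier.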

cache()

take(num_examples: int) → kauldron.data.data_utils.IterableDataset

device_put(sharding: jax.sharding.NamedSharding | None = None) → kauldron.data.data_utils.IterableDataset

Copy elements onto device with a specified sharding.

Parameters:

sharding – How to shard the elements across devices. Typically either REPLICATED or FIRST_DIM. If None (the default), FIRST_DIM is used.

Returns:

A _DevicePutDataset that wraps the original IterableDataset and copies all elements onto the device.
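
To illustrate what copying an element onto device with a given sharding can look like, here is a small sketch that uses plain JAX rather than kauldron. The mesh layout, the axis name "data", and the mapping of FIRST_DIM and REPLICATED onto the two PartitionSpec values are assumptions for this example; kauldron's own definitions may differ.

```python
import jax
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec

# One-dimensional mesh over all local devices. The axis name "data" is an
# arbitrary choice for this sketch.
devices = np.array(jax.devices())
mesh = Mesh(devices, axis_names=("data",))

# Assumed analogues of the two options named above (not kauldron's definitions).
first_dim = NamedSharding(mesh, PartitionSpec("data"))  # split along axis 0
replicated = NamedSharding(mesh, PartitionSpec())       # full copy per device

# A batch whose leading dimension divides evenly across the devices.
batch = {"image": np.zeros((2 * len(devices), 4), dtype=np.float32)}

# A single sharding is applied to every leaf of the element tree.
sharded_batch = jax.device_put(batch, first_dim)
replicated_batch = jax.device_put(batch, replicated)

print(sharded_batch["image"].sharding)
print(replicated_batch["image"].sharding)
```

With a first-dimension sharding each device holds a slice of the batch, while a replicated sharding places the whole batch on every device.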