kd.data.InMemoryPipeline
- class kauldron.data.InMemoryPipeline(*, _fake_refs: type[_FakeRefsUnset] | dict[str, _FakeRootCfg] = <class 'kauldron.utils.config_util._FakeRefsUnset'>, batch_size: int | None = None, seed: int | collections.abc.Sequence[int] | numpy.ndarray | jaxtyping.UInt32[Array, '2'] | jaxtyping.UInt32[ndarray, '2'] | jax.Array | None = _FakeRootCfg('cfg.seed'), loader: Callable[[], _ArrayTree], shuffle: bool = False, num_epochs: Optional[int] = None, drop_remainder: bool = True)
Bases:
kauldron.data.pipelines.Pipeline
Pipeline for datasets that fit in memory.
- loader
Callable which returns all examples in a single Tree[Array['num_examples …']] of `np.array`. Every leaf shares the same leading `num_examples` dimension.
- Type:
collections.abc.Callable[[], Any]
- shuffle
Whether to shuffle the dataset
- Type:
bool
- num_epochs
Number of epochs (None for infinite iteration)
- Type:
int | None
- drop_remainder
Whether to drop the remainder (drop_remainder=False is currently not supported)
- Type:
bool
- loader: Callable[[], _ArrayTree]
- shuffle: bool = False
- num_epochs: int | None = None
- drop_remainder: bool = True
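The shape contract for `loader` can be illustrated with a minimal sketch using plain NumPy. The names `load_examples` and `count_examples` are hypothetical helpers invented for this illustration and are not part of Kauldron. The sketch only shows what "a single tree of arrays with a shared leading `num_examples` dimension" might look like.

```python
import numpy as np


def load_examples():
    """Hypothetical loader: returns ALL examples as one tree of NumPy arrays."""
    rng = np.random.default_rng(0)
    return {
        # Every leaf has the same leading dimension (10 examples here).
        "image": rng.random((10, 4, 4, 1), dtype=np.float32),
        "label": np.arange(10) % 3,
    }


def count_examples(tree):
    """Checks that all leaves of a flat dict share one leading dimension."""
    sizes = {leaf.shape[0] for leaf in tree.values()}
    if len(sizes) != 1:
        raise ValueError(f"Inconsistent leading dimensions: {sorted(sizes)}")
    return sizes.pop()


examples = load_examples()
num_examples = count_examples(examples)
```

Going by the signature above, a callable like `load_examples` would be passed as the `loader` keyword argument, alongside `batch_size`, `shuffle`, `num_epochs` and `seed`.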
- __iter__() → collections.abc.Iterator[Any]
Iterator.
- property examples: Any
Cached in-memory data.
- property num_examples: int
- property sampler: kauldron.data.in_memory.BatchedIndexSampler
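To make the interaction between `shuffle`, `num_epochs` and `drop_remainder=True` concrete, here is a rough NumPy sketch of batched index sampling over in-memory data. It is not the implementation of `BatchedIndexSampler`. The function `batched_indices` is a hypothetical stand-in, and dropping the partial batch at the end of each epoch is an assumption about the behavior.

```python
import numpy as np


def batched_indices(num_examples, batch_size, *, shuffle=False,
                    num_epochs=None, seed=0):
    """Yields index arrays of length `batch_size` (hypothetical helper).

    Assumption: the incomplete batch at the end of each epoch is dropped,
    mirroring `drop_remainder=True`.
    """
    if batch_size > num_examples:
        raise ValueError("batch_size must not exceed num_examples")
    rng = np.random.default_rng(seed)
    epoch = 0
    # `num_epochs=None` means iterate forever.
    while num_epochs is None or epoch < num_epochs:
        ids = rng.permutation(num_examples) if shuffle else np.arange(num_examples)
        for i in range(num_examples // batch_size):
            yield ids[i * batch_size:(i + 1) * batch_size]
        epoch += 1


# All examples live in memory as a tree (here a flat dict) of arrays.
examples = {"x": np.arange(10) * 10, "y": np.arange(10)}

# Each batch is built by indexing every leaf with the same index array.
batches = [
    {name: leaf[ids] for name, leaf in examples.items()}
    for ids in batched_indices(10, batch_size=4, num_epochs=2)
]
```

With 10 examples and a batch size of 4, each epoch in this sketch produces two full batches and discards the last two examples, so two epochs give four batches in total.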