vision_unlearning.datasets.base

Attributes

logger

Exceptions

SplitNotAvailableError

Raised when a requested dataset split is not available.

Classes

UnlearnDatasetSplit

Enumeration of the dataset splits (train, validation, test, and their retain/forget variants).

UnlearnDatasetSplitMode

Enumeration of the supported split modes (class, random, temporal).

UnlearnDataset

Wrapper around Hugging Face datasets that organizes the forget-retain splits.

Module Contents

vision_unlearning.datasets.base.logger
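class vision_unlearning.datasets.base.UnlearnDatasetSplit

Bases: enum.Enum

Enumeration of the dataset splits. Alongside the plain Train, Validation, and Test members, each has _retain and _forget variants. Every member's value is the lowercase form of its name.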

Train = 'train'
Validation = 'validation'
Test = 'test'
Train_retain = 'train_retain'
Train_retain_MIA = 'train_retain_mia'
Train_forget = 'train_forget'
Test_retain = 'test_retain'
Test_forget = 'test_forget'
Validation_retain = 'validation_retain'
Validation_forget = 'validation_forget'
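
Because each member's value is the lowercase form of its name, a split can be looked up from a plain string such as a configuration entry. The following is a minimal sketch that uses a local stand-in enum mirroring a few of the documented members, so it runs without the library installed; with the library available, the class would be imported from vision_unlearning.datasets.base instead. The parse_split helper is hypothetical.

```python
from enum import Enum


class UnlearnDatasetSplit(Enum):
    """Local stand-in mirroring a subset of the documented members."""
    Train = 'train'
    Train_retain = 'train_retain'
    Train_forget = 'train_forget'
    Test_retain = 'test_retain'
    Test_forget = 'test_forget'


def parse_split(name: str) -> UnlearnDatasetSplit:
    """Hypothetical helper: look up a member by its string value."""
    return UnlearnDatasetSplit(name.strip().lower())


# Collect every split that holds forget data by inspecting the values.
forget_splits = [s for s in UnlearnDatasetSplit if s.value.endswith('_forget')]
```

Looking up by value (UnlearnDatasetSplit('train_forget')) rather than by name avoids depending on the mixed capitalization of names such as Train_retain_MIA.
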
class vision_unlearning.datasets.base.UnlearnDatasetSplitMode

Bases: enum.Enum

Enumeration of the supported split modes. The member values mirror the UnlearnDataset methods _split_class, _split_random, and _split_temporal.

Class = 'class'
Random = 'random'
Temporal = 'temporal'
exception vision_unlearning.datasets.base.SplitNotAvailableError

Bases: Exception

Raised by UnlearnDataset.get_loader when the requested split is not available.

class vision_unlearning.datasets.base.UnlearnDataset(/, **data: Any)

Bases: pydantic.BaseModel, abc.ABC

Wrapper around Hugging Face datasets that organizes the forget-retain splits.

model_config

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

split_mode: UnlearnDatasetSplitMode
split_kwargs: dict
_dataset_splits: Dict[UnlearnDatasetSplit, torch.utils.data.Subset | torchvision.datasets.vision.VisionDataset]
_classes: List[str] | None = None
_n_classes: int = 0
mean: Sequence[float] | None = None
std: Sequence[float] | None = None
model_post_init(__context: dict | None) -> None

Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.

abstract _load() -> None

Load the dataset from disk or download it.

Side effects: updates the properties _dataset_splits, _classes, and _n_classes.

_split() -> None

Split the dataset based on the specified mode (split_mode).

Side effects: updates the property _dataset_splits.

Raised exceptions: none.
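
One plausible way for _split to act on split_mode is to map the mode's value to the matching _split_<mode> method and forward split_kwargs as keyword arguments. The sketch below illustrates that pattern with stand-in classes. The getattr-based dispatch and the ToyDataset class are assumptions made for illustration, not the library's confirmed implementation.

```python
from enum import Enum


class UnlearnDatasetSplitMode(Enum):
    """Local stand-in mirroring the documented members."""
    Class = 'class'
    Random = 'random'
    Temporal = 'temporal'


class ToyDataset:
    """Hypothetical class showing mode-based dispatch."""

    def __init__(self, split_mode: UnlearnDatasetSplitMode, split_kwargs: dict):
        self.split_mode = split_mode
        self.split_kwargs = split_kwargs
        self.calls = []  # records which splitter ran, for demonstration

    def _split(self) -> None:
        # Resolve e.g. Random -> self._split_random, then forward the kwargs.
        splitter = getattr(self, f'_split_{self.split_mode.value}')
        splitter(**self.split_kwargs)

    def _split_class(self, forget) -> None:
        self.calls.append(('class', forget))

    def _split_random(self, n_forget: int, seed: int = 42) -> None:
        self.calls.append(('random', n_forget, seed))

    def _split_temporal(self, n_forget: int) -> None:
        self.calls.append(('temporal', n_forget))


ds = ToyDataset(UnlearnDatasetSplitMode.Random, {'n_forget': 100})
ds._split()
```

Under this reading, the keys of split_kwargs must match the parameter names of the selected method, for example n_forget (and optionally seed) for the random mode.
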

_split_class(forget: List[str | int] | str | int) -> None
abstract _split_random(n_forget: int, seed: int = 42) -> None
abstract _split_temporal(n_forget: int) -> None
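
The signature of _split_class accepts a single class or a list of classes, given as names or integer indices. A minimal sketch of how such an argument could be normalized and used to partition sample indices into forget and retain sets follows. The helper name, the label list, and the index-based output are illustrative assumptions rather than the library's implementation.

```python
from typing import List, Sequence, Tuple, Union

ClassRef = Union[str, int]


def split_by_class(
    labels: Sequence[int],
    classes: Sequence[str],
    forget: Union[List[ClassRef], ClassRef],
) -> Tuple[List[int], List[int]]:
    """Return (forget_indices, retain_indices) for the given samples."""
    # Accept a single class as well as a list of classes.
    refs = forget if isinstance(forget, list) else [forget]
    # Map class names to integer indices; integers pass through unchanged.
    forget_ids = {classes.index(r) if isinstance(r, str) else r for r in refs}
    forget_idx = [i for i, y in enumerate(labels) if y in forget_ids]
    retain_idx = [i for i, y in enumerate(labels) if y not in forget_ids]
    return forget_idx, retain_idx


classes = ['cat', 'dog', 'ship']
labels = [0, 1, 2, 1, 0]
forget_idx, retain_idx = split_by_class(labels, classes, 'dog')
```

In a torch-based implementation, the two index lists could be passed to torch.utils.data.Subset, which matches one of the value types declared for _dataset_splits.
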
get_loader(split: UnlearnDatasetSplit, batchsize: int, shuffle: bool = True, num_workers: int = 0, pin_memory: bool = True) -> torch.utils.data.DataLoader | None

Return a DataLoader for the requested split of this dataset.

Side effects: none.

Raised exceptions: SplitNotAvailableError, if the requested split is not available.
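
Callers that cannot assume a split exists may want to catch SplitNotAvailableError. The sketch below uses stand-in definitions of the enum, the exception, and a lookup function so that it runs on its own. The dictionary-backed get_split function is a hypothetical simplification of how availability might be checked, not the library's code.

```python
from enum import Enum
from typing import Dict, List


class UnlearnDatasetSplit(Enum):
    """Local stand-in mirroring a subset of the documented members."""
    Train_retain = 'train_retain'
    Train_forget = 'train_forget'
    Validation_forget = 'validation_forget'


class SplitNotAvailableError(Exception):
    """Local stand-in for the documented exception."""


def get_split(splits: Dict[UnlearnDatasetSplit, List[int]],
              split: UnlearnDatasetSplit) -> List[int]:
    """Hypothetical lookup that fails loudly for a missing split."""
    if split not in splits:
        raise SplitNotAvailableError(f'Split {split.value!r} is not available')
    return splits[split]


splits = {
    UnlearnDatasetSplit.Train_retain: [0, 2],
    UnlearnDatasetSplit.Train_forget: [1],
}

try:
    data = get_split(splits, UnlearnDatasetSplit.Validation_forget)
except SplitNotAvailableError as err:
    data = []  # fall back to an empty split
    message = str(err)
```
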

get_splits() -> Dict[UnlearnDatasetSplit, torch.utils.data.Subset | torchvision.datasets.vision.VisionDataset]

Return the available splits.

Side effects: none.

Raised exceptions: none.

denormalize(normalized: torch.Tensor) -> torch.Tensor
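
The denormalize method carries no description here. Given the mean and std attributes, a natural reading is that it inverts per-channel normalization, that is, x = x_norm * std + mean. The pure-Python sketch below illustrates that formula on nested lists laid out as (channels, pixels). Both the formula's applicability to this method and the helper itself are assumptions; the documented method operates on torch.Tensor.

```python
from typing import List, Sequence


def denormalize(normalized: List[List[float]],
                mean: Sequence[float],
                std: Sequence[float]) -> List[List[float]]:
    """Invert (x - mean) / std channel by channel."""
    return [
        [value * s + m for value in channel]
        for channel, m, s in zip(normalized, mean, std)
    ]


mean, std = [0.5, 0.25], [0.5, 0.25]
normalized = [[-1.0, 1.0], [0.0, 2.0]]
restored = denormalize(normalized, mean, std)
```
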
save(path: str, format: Literal['pkl', 'jpg'] = 'pkl', save_unsplit: bool = False) -> None

Save each split to disk.

Side effects: saves files to disk.

Raised exceptions: OS-related errors.

make_prompt_for_label(label: int) -> str