vision_unlearning.datasets.base
===============================

.. py:module:: vision_unlearning.datasets.base


Attributes
----------

.. autoapisummary::

   vision_unlearning.datasets.base.logger


Exceptions
----------

.. autoapisummary::

   vision_unlearning.datasets.base.SplitNotAvailableError


Classes
-------

.. autoapisummary::

   vision_unlearning.datasets.base.UnlearnDatasetSplit
   vision_unlearning.datasets.base.UnlearnDatasetSplitMode
   vision_unlearning.datasets.base.UnlearnDataset


Module Contents
---------------

.. py:data:: logger

.. py:class:: UnlearnDatasetSplit

   Bases: :py:obj:`enum.Enum`


   Generic enumeration.

   Derive from this class to define new enumerations.


   .. py:attribute:: Train
      :value: 'train'


   .. py:attribute:: Validation
      :value: 'validation'


   .. py:attribute:: Test
      :value: 'test'


   .. py:attribute:: Train_retain
      :value: 'train_retain'


   .. py:attribute:: Train_retain_MIA
      :value: 'train_retain_mia'


   .. py:attribute:: Train_forget
      :value: 'train_forget'


   .. py:attribute:: Test_retain
      :value: 'test_retain'


   .. py:attribute:: Test_forget
      :value: 'test_forget'


   .. py:attribute:: Validation_retain
      :value: 'validation_retain'


   .. py:attribute:: Validation_forget
      :value: 'validation_forget'


.. py:class:: UnlearnDatasetSplitMode

   Bases: :py:obj:`enum.Enum`


   Generic enumeration.

   Derive from this class to define new enumerations.


   .. py:attribute:: Class
      :value: 'class'


   .. py:attribute:: Random
      :value: 'random'


   .. py:attribute:: Temporal
      :value: 'temporal'


.. py:exception:: SplitNotAvailableError

   Bases: :py:obj:`Exception`


   Common base class for all non-exit exceptions.


.. py:class:: UnlearnDataset(/, **data: Any)

   Bases: :py:obj:`pydantic.BaseModel`, :py:obj:`abc.ABC`


   Wrapper around huggingface datasets.

   Organize the forget-retain splits.


   .. py:attribute:: model_config

      Configuration for the model, should be a dictionary conforming to [`ConfigDict`][pydantic.config.ConfigDict].


   .. py:attribute:: split_mode
      :type:  UnlearnDatasetSplitMode


   .. py:attribute:: split_kwargs
      :type:  dict


   .. py:attribute:: _dataset_splits
      :type:  Dict[UnlearnDatasetSplit, Union[torch.utils.data.Subset, torchvision.datasets.vision.VisionDataset]]


   .. py:attribute:: _classes
      :type:  Optional[List[str]]
      :value: None


   .. py:attribute:: _n_classes
      :type:  int
      :value: 0


   .. py:attribute:: mean
      :type:  Optional[Sequence[float]]
      :value: None


   .. py:attribute:: std
      :type:  Optional[Sequence[float]]
      :value: None


   .. py:method:: model_post_init(__context: Optional[dict]) -> None

      Override this method to perform additional initialization after `__init__` and `model_construct`.
      This is useful if you want to do some validation that requires the entire model to be initialized.


   .. py:method:: _load() -> None
      :abstractmethod:


      Load the dataset from disk or download it.

      Side effects: updates the properties _dataset_splits, _classes, _n_classes

      Raised exceptions: none


   .. py:method:: _split() -> None

      Split the dataset based on the specified mode.

      Side effects: updates the property dataset_splits

      Raised exceptions: none


   .. py:method:: _split_class(forget: List[str | int] | str | int) -> None


   .. py:method:: _split_random(n_forget: int, seed: int = 42) -> None
      :abstractmethod:



   .. py:method:: _split_temporal(n_forget: int) -> None
      :abstractmethod:



   .. py:method:: get_loader(split: UnlearnDatasetSplit, batchsize: int, shuffle: bool = True, num_workers: int = 0, pin_memory: bool = True) -> Optional[torch.utils.data.DataLoader]

      Return this split for this dataset.

      Side effects: none

      Raised exceptions: SplitNotAvailableError, if the requested split is not available


   .. py:method:: get_splits() -> Dict[UnlearnDatasetSplit, Union[torch.utils.data.Subset, torchvision.datasets.vision.VisionDataset]]

      Return the available splits.

      Side effects: none

      Raised exceptions: none


   .. py:method:: denormalize(normalized: torch.Tensor) -> torch.Tensor


   .. py:method:: save(path: str, format: Literal['pkl', 'jpg'] = 'pkl', save_unsplit: bool = False) -> None

      Save each split to disk.

      Side effects: saves files to disk

      Raised exceptions: OS-related errors


   .. py:method:: make_prompt_for_label(label: int) -> str
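

The reference lists ``_split_class(forget)`` without describing how it behaves. As a rough illustration only, and not the library's implementation, the idea of a class-based forget/retain partition can be sketched in plain Python. The helper ``split_indices_by_class`` and the toy labels are hypothetical; the result keys simply reuse the ``train_forget`` and ``train_retain`` values from ``UnlearnDatasetSplit``, and only integer labels are handled here.

```python
from typing import Dict, List, Sequence, Union


def split_indices_by_class(
    labels: Sequence[int],
    forget: Union[int, List[int]],
) -> Dict[str, List[int]]:
    """Partition sample indices into forget and retain groups by label.

    Hypothetical helper: it mirrors the idea of a class-based split, not the
    actual behavior of UnlearnDataset._split_class.
    """
    # Accept either a single class or a list of classes.
    forget_set = {forget} if isinstance(forget, int) else set(forget)
    splits: Dict[str, List[int]] = {"train_forget": [], "train_retain": []}
    for index, label in enumerate(labels):
        # Samples whose label is in the forget set go to the forget split.
        key = "train_forget" if label in forget_set else "train_retain"
        splits[key].append(index)
    return splits


# Toy labels for six samples across three classes (made-up data).
toy_labels = [0, 1, 2, 1, 0, 2]
toy_splits = split_indices_by_class(toy_labels, forget=1)
print(toy_splits)
```

In a PyTorch setting, index lists like these would typically be wrapped in ``torch.utils.data.Subset`` objects, which matches the value type documented for ``_dataset_splits``; whether the library does exactly this is an assumption.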