vision_unlearning.datasets.testbed
==================================

.. py:module:: vision_unlearning.datasets.testbed


Attributes
----------

.. autoapisummary::

   vision_unlearning.datasets.testbed.logger
   vision_unlearning.datasets.testbed.task_to_dataset_map


Functions
---------

.. autoapisummary::

   vision_unlearning.datasets.testbed.get_target_preprocessed
   vision_unlearning.datasets.testbed.get_target_overwrite
   vision_unlearning.datasets.testbed.get_metadata_filtered_path
   vision_unlearning.datasets.testbed.get_metadata_filtered
   vision_unlearning.datasets.testbed.save_metadata_filtered
   vision_unlearning.datasets.testbed.exists_metadata_filtered
   vision_unlearning.datasets.testbed.get_attribute_for_entity
   vision_unlearning.datasets.testbed.get_unlearned_model_folder
   vision_unlearning.datasets.testbed.exists_unlearned_model
   vision_unlearning.datasets.testbed.get_generated_dataset_folder
   vision_unlearning.datasets.testbed.get_generated_dataset_file
   vision_unlearning.datasets.testbed.exists_unlearned_dataset
   vision_unlearning.datasets.testbed.get_baseline_dataset_folder
   vision_unlearning.datasets.testbed.get_off_image_path
   vision_unlearning.datasets.testbed.get_similarity_clip_path
   vision_unlearning.datasets.testbed.get_similarity_clip_df
   vision_unlearning.datasets.testbed.calculate_similarity_clip
   vision_unlearning.datasets.testbed.plot_heatmap


Module Contents
---------------

.. py:data:: logger

.. py:function:: get_target_preprocessed(task: Literal['scenes', 'objects', 'breeds', 'people'], target: str) -> str

.. py:function:: get_target_overwrite(task: Literal['scenes', 'objects', 'breeds', 'people'], method: Literal['munba', 'uce', 'distil'], target: str) -> Tuple[str, str]

   :returns: preprocessed target, target_overwrite


.. py:function:: get_metadata_filtered_path(task: Literal['scenes', 'objects', 'breeds', 'people'], base_folder: str = 'assets') -> str

.. py:function:: get_metadata_filtered(task: Literal['scenes', 'objects', 'breeds', 'people'], base_folder: str = 'assets') -> List[Dict[str, Any]]

.. py:function:: save_metadata_filtered(task: Literal['scenes', 'objects', 'breeds', 'people'], metadata_filtered: List[Dict[str, Any]], base_folder: str = 'assets')

.. py:function:: exists_metadata_filtered(task: Literal['scenes', 'objects', 'breeds', 'people'], base_folder: str = 'assets') -> bool

.. py:function:: get_attribute_for_entity(metadata_filtered: List[Dict[str, Any]], entity_name: str, attribute: str) -> Any

.. py:data:: task_to_dataset_map
   :type: Dict[Literal['scenes', 'objects', 'breeds', 'people'], str]

.. py:function:: get_unlearned_model_folder(task: Literal['scenes', 'objects', 'breeds', 'people'], method: Literal['munba', 'uce', 'distil'], num_train_epochs: int, target: str, base_folder: str = 'assets') -> str

.. py:function:: exists_unlearned_model(task: Literal['scenes', 'objects', 'breeds', 'people'], method: Literal['munba', 'uce', 'distil'], num_train_epochs: int, target: str, base_folder: str = 'assets') -> bool

.. py:function:: get_generated_dataset_folder(task: Literal['scenes', 'objects', 'breeds', 'people'], method: Literal['munba', 'uce', 'distil'], num_train_epochs: int, target: str, base_folder: str = 'assets') -> str

.. py:function:: get_generated_dataset_file(lora_state: Literal['on', 'off'], seed: int, prompt: str) -> str

.. py:function:: exists_unlearned_dataset(generated_dataset_output_path: str, generate_dataset_seeds: List[int], prompts: List[str]) -> bool

   Return True if the entity dataset folder contains all expected ``on_*`` images.

   Entity folders now contain only ``lora_state='on'`` (unlearned model) images.
   Baseline ``lora_state='off'`` images live in the separate baseline folder;
   see get_baseline_dataset_folder() and get_off_image_path().

   Expected file count: ``len(seeds) * len(prompts)`` ``on_*.png`` files + 1 ``metadata.jsonl``.


.. py:function:: get_baseline_dataset_folder(task: Literal['scenes', 'objects', 'breeds', 'people'], target: str, base_folder: str = 'assets') -> str

   Return the folder path for method-agnostic baseline (``lora_state='off'``) images.

   Baseline images are generated once per entity by ``0_generate_dataset_original.py``
   and shared across all methods. The folder is distinct from the per-method
   entity folder returned by get_generated_dataset_folder().

   Convention: ``assets/datasets/generated_{task}_baseline_{target}/``


.. py:function:: get_off_image_path(task: Literal['scenes', 'objects', 'breeds', 'people'], target: str, method: Literal['munba', 'uce', 'distil'], num_train_epochs: int, seed: int, prompt: str, base_folder: str = 'assets') -> str

   Return the path to a baseline (``lora_state='off'``) image for a given entity/seed/prompt.

   Encapsulates backward-compatibility fallback logic in a single place:

   1. If the baseline folder (get_baseline_dataset_folder) exists on disk, use it.
   2. Otherwise fall back to the old entity folder (get_generated_dataset_folder),
      which was the pre-refactor location for both ``on_*`` and ``off_*`` images.

   This means existing datasets that contain ``off_*`` files in the entity folder
   continue to work transparently until baseline folders are generated.


.. py:function:: get_similarity_clip_path(task: Literal['scenes', 'objects', 'breeds', 'people'], base_folder: str = 'assets') -> str

.. py:function:: get_similarity_clip_df(task: Literal['scenes', 'objects', 'breeds', 'people'], base_folder: str = 'assets') -> pandas.DataFrame

.. py:function:: calculate_similarity_clip(task: Literal['scenes', 'objects', 'breeds', 'people'], labels: List[str], base_folder: str = 'assets') -> None

.. py:function:: plot_heatmap(df, figsize=None, cmap='viridis', title='Heatmap')

   Plot a heatmap for a square DataFrame with all labels visible.

   :param df: A square DataFrame with same string labels for index and columns.
   :type df: pd.DataFrame
   :param figsize: Figure size (width, height). Increase if labels overlap.
   :type figsize: tuple
   :param cmap: Colormap name for matplotlib.
   :type cmap: str
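
The folder fallback described for get_off_image_path() and the file-count rule described for exists_unlearned_dataset() can be illustrated with a small standalone sketch. It does not import this module and is not its implementation: the helper names ``resolve_off_folder``, ``expected_file_count`` and ``folder_is_complete`` are hypothetical, the file names used in the demo are made up, and the assumption that completeness is checked by counting ``on_*.png`` files is inferred only from the docstring above.

```python
import os
import tempfile
from typing import List


def resolve_off_folder(baseline_folder: str, entity_folder: str) -> str:
    """Hypothetical helper: prefer the baseline folder when it exists on disk,
    otherwise fall back to the old (pre-refactor) entity folder."""
    return baseline_folder if os.path.isdir(baseline_folder) else entity_folder


def expected_file_count(seeds: List[int], prompts: List[str]) -> int:
    """One on_*.png per (seed, prompt) pair, plus one metadata.jsonl."""
    return len(seeds) * len(prompts) + 1


def folder_is_complete(folder: str, seeds: List[int], prompts: List[str]) -> bool:
    """Hypothetical check (an assumption, not the module's code): the folder
    must hold exactly len(seeds) * len(prompts) on_*.png files and a
    metadata.jsonl file."""
    if not os.path.isdir(folder):
        return False
    names = os.listdir(folder)
    n_on = sum(1 for n in names if n.startswith("on_") and n.endswith(".png"))
    return n_on == len(seeds) * len(prompts) and "metadata.jsonl" in names


if __name__ == "__main__":
    seeds, prompts = [0, 1], ["a photo", "a painting"]
    with tempfile.TemporaryDirectory() as root:
        entity = os.path.join(root, "entity")
        baseline = os.path.join(root, "baseline")
        os.makedirs(entity)

        # No baseline folder yet: the old entity folder is used.
        print(resolve_off_folder(baseline, entity) == entity)

        # Once the baseline folder exists, it takes precedence.
        os.makedirs(baseline)
        print(resolve_off_folder(baseline, entity) == baseline)

        # Create placeholder files with made-up names to exercise the check.
        for i in range(len(seeds) * len(prompts)):
            open(os.path.join(entity, f"on_{i}.png"), "w").close()
        open(os.path.join(entity, "metadata.jsonl"), "w").close()
        print(folder_is_complete(entity, seeds, prompts))
```

Keeping the fallback in one function means callers never need to know whether a dataset predates the baseline folders; they ask for a folder and receive whichever location is valid.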