vision_unlearning.benchmarks.I_care
===================================

.. py:module:: vision_unlearning.benchmarks.I_care

.. autoapi-nested-parse::

   I-CARE benchmark package.

   This package contains all code for the I-CARE unlearning benchmark:

   - configuration.py: domain constants, type aliases, GUI_TO_BACKEND mapping
   - metadata.py: interference-per-pair / interference-per-entity helpers, InterferencePerEntity
   - metrics.py: per-entity interference metric functions
   - result_templates.py: ResultTemplate base class + all subclasses + registries
   - utils.py: image encoding/decoding helpers, SHAP serialization, error classes

   All public symbols are re-exported here so callers can do::

       from vision_unlearning.benchmarks.I_care import rt_name_to_class
       from vision_unlearning.benchmarks.I_care import InterferencePerEntity
       # etc.

   Optional dependencies (seaborn, scikit-learn, shap) are used by the result templates
   and the SHAP utilities. They are not required for basic testbed / dataset work.
   Install them via ``pip install vision-unlearning[testbed]``.


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/vision_unlearning/benchmarks/I_care/configuration/index
   /autoapi/vision_unlearning/benchmarks/I_care/embeddings/index
   /autoapi/vision_unlearning/benchmarks/I_care/metadata/index
   /autoapi/vision_unlearning/benchmarks/I_care/metrics/index
   /autoapi/vision_unlearning/benchmarks/I_care/result_templates/index
   /autoapi/vision_unlearning/benchmarks/I_care/utils/index


Attributes
----------

.. autoapisummary::

   vision_unlearning.benchmarks.I_care.domain_unlearning_algorithm
   vision_unlearning.benchmarks.I_care.domain_task
   vision_unlearning.benchmarks.I_care.domain_attribute
   vision_unlearning.benchmarks.I_care.domain_entity
   vision_unlearning.benchmarks.I_care.domain_model
   vision_unlearning.benchmarks.I_care.domain_mp
   vision_unlearning.benchmarks.I_care.domain_me
   vision_unlearning.benchmarks.I_care.domain_s
   vision_unlearning.benchmarks.I_care.domain_l
   vision_unlearning.benchmarks.I_care.type_task
   vision_unlearning.benchmarks.I_care.type_unlearning_algorithm
   vision_unlearning.benchmarks.I_care.type_model
   vision_unlearning.benchmarks.I_care.type_mp
   vision_unlearning.benchmarks.I_care.type_me
   vision_unlearning.benchmarks.I_care.type_s
   vision_unlearning.benchmarks.I_care.type_l
   vision_unlearning.benchmarks.I_care.GUI_TO_BACKEND
   vision_unlearning.benchmarks.I_care.unlearning_algorithm_to_epochs
   vision_unlearning.benchmarks.I_care.s_to_direction
   vision_unlearning.benchmarks.I_care.EMBEDDING_MODEL
   vision_unlearning.benchmarks.I_care.EMBEDDING_DIM
   vision_unlearning.benchmarks.I_care.rt_name_to_class
   vision_unlearning.benchmarks.I_care.rt_name_to_params


Exceptions
----------

.. autoapisummary::

   vision_unlearning.benchmarks.I_care.InvalidAttributeTypeError
   vision_unlearning.benchmarks.I_care.InsufficientSamplesError


Classes
-------

.. autoapisummary::

   vision_unlearning.benchmarks.I_care.InterferencePerEntity
   vision_unlearning.benchmarks.I_care.ResultTemplate
   vision_unlearning.benchmarks.I_care.ResultTemplateMetricMetricAlignment
   vision_unlearning.benchmarks.I_care.ResultTemplateMetricSimilarityAlignment
   vision_unlearning.benchmarks.I_care.ResultTemplateMetricSimilarityAlignmentMulti
   vision_unlearning.benchmarks.I_care.ResultTemplateSignificantRelationshipNumerical
   vision_unlearning.benchmarks.I_care.ResultTemplateSignificantRelationshipCategorical
   vision_unlearning.benchmarks.I_care.ResultTemplateCountSignificantRelationship
   vision_unlearning.benchmarks.I_care.ResultTemplateImplicitAssociationTest
   vision_unlearning.benchmarks.I_care.ResultTemplateMinimumCutInterference
   vision_unlearning.benchmarks.I_care.ResultTemplateUnlearningVisualSummary
   vision_unlearning.benchmarks.I_care.ResultTemplateInterferenceVisualSummary
   vision_unlearning.benchmarks.I_care.ResultTemplateMatrix
   vision_unlearning.benchmarks.I_care.ResultTemplateInterferenceMatrix
   vision_unlearning.benchmarks.I_care.ResultTemplateSimilarityMatrix
   vision_unlearning.benchmarks.I_care.ResultTemplateMethodComparisonByMetricEntity
   vision_unlearning.benchmarks.I_care.ResultTemplateEmbeddingUnlearningProfile
   vision_unlearning.benchmarks.I_care.ResultTemplateEmbeddingForgettingEfficiency


Functions
---------

.. autoapisummary::

   vision_unlearning.benchmarks.I_care.convert_params_from_gui_to_backend
   vision_unlearning.benchmarks.I_care.get_interference_per_pair_path
   vision_unlearning.benchmarks.I_care.get_interference_per_pair
   vision_unlearning.benchmarks.I_care.exists_interference_per_pair
   vision_unlearning.benchmarks.I_care.save_interference_per_pair
   vision_unlearning.benchmarks.I_care.get_interference_per_pair_inverse
   vision_unlearning.benchmarks.I_care.get_interference_per_entity_path
   vision_unlearning.benchmarks.I_care.get_interference_per_entity
   vision_unlearning.benchmarks.I_care.save_interference_per_entity
   vision_unlearning.benchmarks.I_care.choose_metric_column_interference_per_entity
   vision_unlearning.benchmarks.I_care.get_metadata_filtered
   vision_unlearning.benchmarks.I_care.get_metadata_filtered_path
   vision_unlearning.benchmarks.I_care.get_target_overwrite
   vision_unlearning.benchmarks.I_care.get_generated_dataset_file
   vision_unlearning.benchmarks.I_care.find_worst_interfered
   vision_unlearning.benchmarks.I_care.metric_of_worst_interfered
   vision_unlearning.benchmarks.I_care.is_worst_interfered_target
   vision_unlearning.benchmarks.I_care.number_of_interfered_worse_than_target
   vision_unlearning.benchmarks.I_care.number_of_interfered_worse_than_threshold
   vision_unlearning.benchmarks.I_care.average_metric
   vision_unlearning.benchmarks.I_care._encode_image_file
   vision_unlearning.benchmarks.I_care._decode_image
   vision_unlearning.benchmarks.I_care.explanation_to_dict
   vision_unlearning.benchmarks.I_care.dict_to_explanation
   vision_unlearning.benchmarks.I_care.load_dino_model
   vision_unlearning.benchmarks.I_care.embed_image_with_dino
   vision_unlearning.benchmarks.I_care.embed_forgetting_session
   vision_unlearning.benchmarks.I_care.embed_forgetting_session_batched
   vision_unlearning.benchmarks.I_care.huggingface_dataset_file_exists
   vision_unlearning.benchmarks.I_care.huggingface_dataset_file_download
   vision_unlearning.benchmarks.I_care.huggingface_dataset_upload
   vision_unlearning.benchmarks.I_care.huggingface_dataset_file_upload
   vision_unlearning.benchmarks.I_care.huggingface_dataset_download
   vision_unlearning.benchmarks.I_care.jacc_metric_score
   vision_unlearning.benchmarks.I_care.display_interesting_interferences
   vision_unlearning.benchmarks.I_care.analyze_relationship_regression
   vision_unlearning.benchmarks.I_care.analyze_relationship_category
   vision_unlearning.benchmarks.I_care.analyze_relationship_numerical
   vision_unlearning.benchmarks.I_care.analyze_relationship_categorical
   vision_unlearning.benchmarks.I_care.analyze_correlation_between_pairwise_metrics
   vision_unlearning.benchmarks.I_care.check_eval_results


Package Contents
----------------

.. py:data:: domain_unlearning_algorithm
   :value: ['FADE', 'Munba', 'UCE']


.. py:data:: domain_task
   :value: ['Breeds', 'Scenes', 'People']


.. py:data:: domain_attribute

.. py:data:: domain_entity

.. py:data:: domain_model
   :value: ['Stable Diffusion 1.4']


.. py:data:: domain_mp
   :value: ['Delta Clip', 'Delta Brisque', 'RMSE', 'SSIM']


.. py:data:: domain_me
   :value: ['Emitter worst interfered brisque diff', 'Emitter worst interfered clip diff', 'Emitter worst...


.. py:data:: domain_s
   :value: ['Clip Cosine Similarity', 'Jacc Similarity']


.. py:data:: domain_l
   :value: ['Clip Embedding']


.. py:data:: type_task

.. py:data:: type_unlearning_algorithm

.. py:data:: type_model

.. py:data:: type_mp

.. py:data:: type_me

.. py:data:: type_s

.. py:data:: type_l

.. py:data:: GUI_TO_BACKEND

.. py:data:: unlearning_algorithm_to_epochs

.. py:function:: convert_params_from_gui_to_backend(params: Dict[str, Any]) -> Dict[str, Any]

   Convert GUI values to backend literal values.
   Unknown keys are passed through unchanged.
   ``None`` values stay ``None``.

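   The conversion rules above can be sketched as follows. This is a minimal illustration, not
   the package's implementation: ``EXAMPLE_GUI_TO_BACKEND`` and ``convert_example`` are
   hypothetical names, and the nested-dict layout of the mapping is an assumption (the real
   contents of ``GUI_TO_BACKEND`` are not shown on this page).

   .. code-block:: python

      from typing import Any, Dict

      # Hypothetical mapping: parameter name -> {GUI label -> backend literal}.
      # The actual structure of GUI_TO_BACKEND may differ.
      EXAMPLE_GUI_TO_BACKEND: Dict[str, Dict[str, str]] = {
          "task": {"People": "people", "Breeds": "breeds"},
          "model": {"Stable Diffusion 1.4": "sd1.4"},
      }


      def convert_example(params: Dict[str, Any]) -> Dict[str, Any]:
          """Translate GUI labels to backend literals (illustrative only)."""
          converted: Dict[str, Any] = {}
          for key, value in params.items():
              mapping = EXAMPLE_GUI_TO_BACKEND.get(key)
              if mapping is None or value is None:
                  # Unknown keys pass through unchanged; None stays None.
                  converted[key] = value
              else:
                  converted[key] = mapping.get(value, value)
          return converted


      result = convert_example({"task": "People", "model": None, "seed": 3})

   Here ``seed`` is an unknown key and is copied as-is, while ``model`` keeps its ``None`` value.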

.. py:data:: s_to_direction
   :type:  Dict[type_s, type_direction]

.. py:function:: get_interference_per_pair_path(task: Literal['scenes', 'objects', 'breeds', 'people'], index: int, method: Literal['munba', 'uce', 'distil'], num_train_epochs: int, base_folder: str = 'assets') -> str

.. py:function:: get_interference_per_pair(task: Literal['scenes', 'objects', 'breeds', 'people'], index: int, method: Literal['munba', 'uce', 'distil'], num_train_epochs: int, max_identities: int = 100, base_folder: str = 'assets') -> Dict[str, Dict[str, float]]

.. py:function:: exists_interference_per_pair(task: Literal['scenes', 'objects', 'breeds', 'people'], index: int, method: Literal['munba', 'uce', 'distil'], num_train_epochs: int, base_folder: str = 'assets') -> bool

.. py:function:: save_interference_per_pair(interference_per_pair: Dict[str, Dict[str, float]], task: Literal['scenes', 'objects', 'breeds', 'people'], index: int, method: Literal['munba', 'uce', 'distil'], num_train_epochs: int, base_folder: str = 'assets') -> None

.. py:function:: get_interference_per_pair_inverse(task: Literal['scenes', 'objects', 'breeds', 'people'], index: int, method: Literal['munba', 'uce', 'distil'], num_train_epochs: int, index_start: int = 0, max_identities: int = 100, base_folder: str = 'assets') -> Dict[str, Dict[str, float]]

.. py:function:: get_interference_per_entity_path(task: Literal['scenes', 'objects', 'breeds', 'people'], base_folder: str = 'assets') -> str

.. py:function:: get_interference_per_entity(task: Literal['scenes', 'objects', 'breeds', 'people'], max_identities: int = 100, base_folder: str = 'assets') -> List[Dict[str, Any]]

.. py:function:: save_interference_per_entity(task: Literal['scenes', 'objects', 'breeds', 'people'], metadata_filtered: List[Dict[str, Any]], base_folder: str = 'assets') -> None

.. py:class:: InterferencePerEntity(/, **data: Any)

   Bases: :py:obj:`pydantic.BaseModel`


   A base class for creating Pydantic models.

   .. attribute:: __class_vars__

      The names of the class variables defined on the model.

   .. attribute:: __private_attributes__

      Metadata about the private attributes of the model.

   .. attribute:: __signature__

      The synthesized ``__init__`` :class:`inspect.Signature` of the model.

   .. attribute:: __pydantic_complete__

      Whether model building is completed, or if there are still undefined fields.

   .. attribute:: __pydantic_core_schema__

      The core schema of the model.

   .. attribute:: __pydantic_custom_init__

      Whether the model has a custom `__init__` function.

   .. attribute:: __pydantic_decorators__

      Metadata containing the decorators defined on the model.
      This replaces `Model.__validators__` and `Model.__root_validators__` from Pydantic V1.

   .. attribute:: __pydantic_generic_metadata__

      A dictionary containing metadata about generic Pydantic models.
      The `origin` and `args` items map to the [`__origin__`][genericalias.__origin__]
      and [`__args__`][genericalias.__args__] attributes of [generic aliases][types-genericalias],
      and the `parameter` item maps to the `__parameter__` attribute of generic classes.

   .. attribute:: __pydantic_parent_namespace__

      Parent namespace of the model, used for automatic rebuilding of models.

   .. attribute:: __pydantic_post_init__

      The name of the post-init method for the model, if defined.

   .. attribute:: __pydantic_root_model__

      Whether the model is a :class:`pydantic.root_model.RootModel`.

   .. attribute:: __pydantic_serializer__

      The `pydantic-core` `SchemaSerializer` used to dump instances of the model.

   .. attribute:: __pydantic_validator__

      The `pydantic-core` `SchemaValidator` used to validate instances of the model.

   .. attribute:: __pydantic_fields__

      A dictionary of field names and their corresponding :class:`pydantic.fields.FieldInfo` objects.

   .. attribute:: __pydantic_computed_fields__

      A dictionary of computed field names and their corresponding :class:`pydantic.fields.ComputedFieldInfo` objects.

   .. attribute:: __pydantic_extra__

      A dictionary containing extra values, if :attr:`pydantic.config.ConfigDict.extra`
      is set to ``'allow'``.

   .. attribute:: __pydantic_fields_set__

      The names of fields explicitly set during instantiation.

   .. attribute:: __pydantic_private__

      Values of private attributes set on the model instance.


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: base_folder
      :type:  str
      :value: 'assets'


   .. py:attribute:: remote_repository_name
      :type:  str
      :value: 'LeonardoBenitez/VisionUnlearningEvaluationTestbeds'


   .. py:attribute:: save_outputs
      :type:  bool
      :value: True


   .. py:attribute:: recompute_if_exists
      :type:  bool
      :value: False


   .. py:attribute:: upload_if_recomputed
      :type:  bool
      :value: False


   .. py:method:: _get_data_path_remote() -> str


   .. py:method:: _get_data_path_local() -> str


   .. py:method:: _compute_from_scratch() -> List[Dict[str, Any]]
      :abstractmethod:


   .. py:method:: compute() -> List[Dict[str, Any]]


.. py:function:: choose_metric_column_interference_per_entity(unlearning_algorithm: vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm, interference_entity: vision_unlearning.benchmarks.I_care.configuration.type_me, metric_cols: List[str]) -> str

   The columns of the interference-per-entity file are not named in a way that is easy to generate from `unlearning_algorithm` and `interference_entity`, so we search `metric_cols` for the right one.
   Exactly one column is expected to match; if there are no matches or more than one match, an error is raised.

   The names look like this::

      'metric_distil_400_emitter_minus_receiver_worst_interfered_ssim (↓)',
      'metric_distil_400_emitter_minus_receiver_number_of_interfered_worse_than_target_brisque_diff (↓)',
      'metric_distil_400_emitter_minus_receiver_number_of_interfered_worse_than_target_clip_diff (↓)',
      'metric_distil_400_emitter_minus_receiver_number_of_interfered_worse_than_target_rmse (↓)',
      'metric_distil_400_emitter_minus_receiver_number_of_interfered_worse_than_target_ssim (↓)',
      'metric_distil_400_emitter_minus_receiver_number_of_interfered_worse_than_zero_clip_diff (↓)',
      'metric_distil_400_emitter_minus_receiver_average_brisque_diff (↓)',
      'metric_distil_400_emitter_minus_receiver_average_clip_diff (↑)',
      'metric_uce_000_emitter_minus_receiver_average_rmse (↓)',
      'metric_munba_100_emitter_minus_receiver_average_ssim (↑)',

   TODO: these names are defined in `4. Compute interference per entity.ipynb`. There should be a central way of defining them.

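   A search that enforces the "exactly one match" rule could look like the sketch below.
   It is only an illustration: ``pick_unique_column`` and the substring fragments are
   hypothetical, and the real function may derive its search pattern from
   ``unlearning_algorithm`` and ``interference_entity`` differently.

   .. code-block:: python

      from typing import List


      def pick_unique_column(metric_cols: List[str], fragments: List[str]) -> str:
          """Return the single column containing every fragment, else raise."""
          matches = [col for col in metric_cols if all(f in col for f in fragments)]
          if len(matches) != 1:
              raise ValueError(
                  f"Expected exactly one match for {fragments}, "
                  f"found {len(matches)}: {matches}"
              )
          return matches[0]


      columns = [
          "metric_distil_400_emitter_minus_receiver_average_brisque_diff (↓)",
          "metric_distil_400_emitter_minus_receiver_average_clip_diff (↑)",
          "metric_uce_000_emitter_minus_receiver_average_rmse (↓)",
          "metric_munba_100_emitter_minus_receiver_average_ssim (↑)",
      ]

      # Assumed fragments: one identifying the method, one identifying the metric.
      chosen = pick_unique_column(columns, ["_distil_", "average_clip_diff"])

   Failing loudly on zero or multiple matches keeps a silent mismatch between column names
   and parameters from propagating into later results.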

.. py:function:: get_metadata_filtered(task: Literal['scenes', 'objects', 'breeds', 'people'], base_folder: str = 'assets') -> List[Dict[str, Any]]

.. py:function:: get_metadata_filtered_path(task: Literal['scenes', 'objects', 'breeds', 'people'], base_folder: str = 'assets') -> str

.. py:function:: get_target_overwrite(task: Literal['scenes', 'objects', 'breeds', 'people'], method: Literal['munba', 'uce', 'distil'], target: str) -> Tuple[str, str]

   :returns: Tuple of (preprocessed target, target_overwrite).


.. py:function:: get_generated_dataset_file(lora_state: Literal['on', 'off'], seed: int, prompt: str) -> str

.. py:function:: find_worst_interfered(interference_per_pair: dict, metric: str, is_worst_biggest: bool) -> Tuple[str, float]

.. py:function:: metric_of_worst_interfered(interference_per_pair: dict, metric: str, is_worst_biggest: bool) -> float

.. py:function:: is_worst_interfered_target(interference_per_pair: dict, metric: str, is_worst_biggest: bool, target: str) -> bool

.. py:function:: number_of_interfered_worse_than_target(interference_per_pair: dict, metric: str, is_worst_biggest: bool, target: str) -> int

.. py:function:: number_of_interfered_worse_than_threshold(interference_per_pair: dict, metric: str, is_worst_biggest: bool, threshold: float) -> int

.. py:function:: average_metric(interference_per_pair: dict, metric: str) -> float

.. py:function:: _encode_image_file(img_path: str, max_dim: int = 1024) -> str

   Downsample the image (reduce its resolution) to limit its size before encoding.


.. py:function:: _decode_image(image_data: str) -> io.BytesIO

.. py:function:: explanation_to_dict(expl: Any) -> Dict[str, Any]

   Serialize a shap.Explanation to a plain dict (JSON-serializable).


.. py:function:: dict_to_explanation(d: Dict[str, Any]) -> Any

   Deserialize a plain dict back to a shap.Explanation.

   Requires the ``shap`` package, an optional dependency: install it with
   ``pip install shap`` or ``pip install vision-unlearning[testbed]``.


.. py:exception:: InvalidAttributeTypeError

   Bases: :py:obj:`ValueError`


   Inappropriate argument value (of correct type).


.. py:exception:: InsufficientSamplesError

   Bases: :py:obj:`ValueError`


   Inappropriate argument value (of correct type).


.. py:data:: EMBEDDING_MODEL
   :value: 'dinov2_vits14'


.. py:data:: EMBEDDING_DIM
   :value: 384


.. py:function:: load_dino_model(model_name: str = EMBEDDING_MODEL, force_device: Optional[str] = None) -> Tuple[Any, Any, str]

   Load DINOv2 model, transform pipeline, and device.

   Heavy imports (torch, torchvision) happen here, not at module load.

   :param model_name: DINOv2 model variant (default: 'dinov2_vits14' → 384-dim CLS).
   :param force_device: If set, use this device string instead of auto-detecting.

   :returns: (model, transform, device) tuple.
             model: DINOv2 PyTorch model in eval mode, on device.
             transform: torchvision.transforms pipeline (resize → crop → normalize).
             device: device string ('cuda' or 'cpu').


.. py:function:: embed_image_with_dino(image_path: str, model: Any, transform: Any, device: str) -> List[float]

   Embed a single image using a pre-loaded DINOv2 model.

   :param image_path: Path to a PNG/JPEG image on disk.
   :param model: DINOv2 model (from load_dino_model()).
   :param transform: torchvision transform (from load_dino_model()).
   :param device: device string ('cuda' or 'cpu').

   :returns: 384-dim CLS embedding as a plain Python list of floats.

   TODO: refactor into a batched DataLoader for throughput (currently single-image).


.. py:function:: embed_forgetting_session(dataset_folder: str, seeds: List[int], prompts: List[str], metadata_filtered: List[Dict[str, Any]], lora_state: Literal['on', 'off'], task: str, embed_image_fn: Optional[Callable[[str], List[float]]] = None) -> List[Dict[str, Any]]

   Embed all images from one forgetting session (entity or baseline).

   Iterates over all (seed, prompt) combinations and embeds each matching image.
   Images that do not exist on disk are skipped with a warning.

   :param dataset_folder: Local directory containing the generated images.
   :param seeds: List of generation seeds (e.g. [0, 1, 2, 3]).
   :param prompts: Full prompt strings (e.g. "An image of Colin Powell").
   :param metadata_filtered: Metadata list used to map prompt index → entity name.
                             metadata_filtered[i]['name'] corresponds to prompts[i].
   :param lora_state: 'on' for unlearned model images, 'off' for baseline images.
   :param task: Task name, passed to get_target_preprocessed().
   :param embed_image_fn: Injectable embedding function (image_path → [float]).
                          Required: although the signature defaults to ``None``, there is
                          no built-in fallback. Pass embed_image_with_dino
                          (partially applied) or a test stub.

   :returns: List of records,
             each structured as follows::

                 [
                     {
                         'prompted_entity': str,   # entity name (preprocessed)
                         'seed': int,
                         'prompt': str,
                         'embedding': List[float], # 384-dim CLS embedding
                     },
                     ...
                 ]

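   The ``embed_image_fn`` argument expects a callable that takes only an image path. One way
   to obtain that shape is ``functools.partial``, sketched here with a stand-in function so
   the example runs without torch. ``fake_embed_image`` is hypothetical; it only mirrors the
   parameter list documented for ``embed_image_with_dino``.

   .. code-block:: python

      from functools import partial
      from typing import Any, List

      EMBEDDING_DIM = 384  # same value as the documented EMBEDDING_DIM constant


      def fake_embed_image(image_path: str, model: Any, transform: Any, device: str) -> List[float]:
          """Test stub with the same parameters as embed_image_with_dino; returns zeros."""
          return [0.0] * EMBEDDING_DIM


      # Bind everything except image_path, leaving an (image_path -> [float]) callable.
      embed_image_fn = partial(fake_embed_image, model=None, transform=None, device="cpu")

      embedding = embed_image_fn("some/image.png")

   With the real model, the same pattern would presumably bind the ``model``, ``transform``
   and ``device`` values returned by ``load_dino_model()`` to ``embed_image_with_dino``.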

.. py:function:: embed_forgetting_session_batched(dataset_folder: str, seeds: List[int], prompts: List[str], metadata_filtered: List[Dict[str, Any]], lora_state: Literal['on', 'off'], task: str, model: Any, transform: Any, device: str, batch_size: int = 32) -> List[Dict[str, Any]]

   Embed all images for one forgetting session using batched GPU inference.

   More efficient than embed_forgetting_session() for large image sets.
   Collects all (path, metadata) pairs first, then processes in batches via
   a simple loop, amortising Python overhead and maximising GPU utilisation.

   :param dataset_folder: Local directory containing the generated images.
   :param seeds: List of generation seeds used.
   :param prompts: Full prompt strings.
   :param metadata_filtered: Metadata list: metadata_filtered[i]['name'] → prompts[i].
   :param lora_state: 'on' for unlearned model, 'off' for baseline.
   :param task: Task name, passed to get_target_preprocessed().
   :param model: DINOv2 model (from load_dino_model()), on device, in eval mode.
   :param transform: torchvision transform pipeline (from load_dino_model()).
   :param device: Torch device string ('cuda' or 'cpu').
   :param batch_size: Number of images per GPU forward pass (default 32).
                      TODO: tune based on VRAM; 32 images × 224×224 ≈ 220MB VRAM.

   :returns: Same structure as embed_forgetting_session().

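   The "collect first, then process in batches" pattern described above can be sketched in
   plain Python. ``iter_batches`` and the example pairs are hypothetical and omit the model
   forward pass; they only show how a list of (path, metadata) pairs is split into chunks of
   at most ``batch_size``.

   .. code-block:: python

      from typing import Dict, Iterator, List, Sequence, Tuple

      Pair = Tuple[str, Dict[str, int]]


      def iter_batches(items: Sequence[Pair], batch_size: int = 32) -> Iterator[List[Pair]]:
          """Yield consecutive slices of at most batch_size items."""
          for start in range(0, len(items), batch_size):
              yield list(items[start:start + batch_size])


      # Hypothetical (path, metadata) pairs, collected before any inference runs.
      pairs: List[Pair] = [(f"img_{i}.png", {"seed": i % 4}) for i in range(70)]

      batch_sizes = [len(batch) for batch in iter_batches(pairs, batch_size=32)]

   The last batch is simply smaller when the number of images is not a multiple of
   ``batch_size``.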

.. py:function:: huggingface_dataset_file_exists(dataset_repository: str, dataset_path: str, token: Optional[str]) -> bool

   Check whether a specific file exists in a Hugging Face dataset repository,
   without listing the entire repository.

   This could be done more efficiently with a newer version of the library; see
   https://chatgpt.com/share/69edd525-d008-832d-8a0c-ec4560a4fe3b

   :param dataset_repository: e.g. "username/dataset_name"
   :param dataset_path: full path in repo (e.g. "config/file.jsonl")
   :param token: HF token (can be None for public repos)
   :returns: True if the file exists, False otherwise.


.. py:function:: huggingface_dataset_file_download(folder_datasets: str, dataset_repository: str, file_path: str, token: Optional[str], folder_cache: str = '/tmp/huggingface_cache') -> None

   Download a single file from a dataset in Hugging Face Hub.

   :param folder_datasets: Local directory where datasets are stored.
   :param dataset_repository: Hugging Face dataset repository ID
   :param file_path: Full path of the file within the repository (e.g., "config/data.jsonl")
   :param token: Hugging Face authentication token
   :param folder_cache: Cache directory for downloads

   The file will be saved at os.path.join(folder_datasets, file_path)


.. py:function:: huggingface_dataset_upload(folder_datasets: str, dataset_repository: str, dataset_config: str, token: str, path_in_repo: Optional[str] = None) -> None

   Upload a dataset folder to a HuggingFace repository.

   Supposes that a folder ``dataset_config`` exists in ``folder_datasets``,
   and that it contains the dataset files.

   :param path_in_repo: Destination path inside the HF repo.  When None (default)
                        the files land at ``dataset_config`` relative to the repo root.
                        Pass an explicit value to decouple the local folder name from its
                        location in the HF repository (e.g.,
                        ``path_in_repo="datasets/generated_breeds_baseline"``).

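   The relationship between the local folder and the repository path can be sketched as
   below. ``resolve_paths`` is a hypothetical helper written for illustration; it encodes
   only what the docstring states (the local folder is ``dataset_config`` inside
   ``folder_datasets``, and ``path_in_repo`` falls back to ``dataset_config``).

   .. code-block:: python

      import os
      from typing import Optional, Tuple


      def resolve_paths(
          folder_datasets: str,
          dataset_config: str,
          path_in_repo: Optional[str] = None,
      ) -> Tuple[str, str]:
          """Return (local folder, path inside the repository)."""
          local_folder = os.path.join(folder_datasets, dataset_config)
          repo_path = dataset_config if path_in_repo is None else path_in_repo
          return local_folder, repo_path


      default_paths = resolve_paths("assets", "generated_breeds_baseline")
      custom_paths = resolve_paths(
          "assets",
          "generated_breeds_baseline",
          path_in_repo="datasets/generated_breeds_baseline",
      )

   In the second call the local folder is unchanged; only the destination inside the
   repository differs.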

.. py:function:: huggingface_dataset_file_upload(file_path: str, dataset_repository: str, dataset_path: str, token: str)

   Upload a single file to a specific dataset config in Hugging Face Hub.

   :param dataset_path: Full name of the file in the repository, including the config folder (e.g., "my_config/my_file.jsonl").


.. py:function:: huggingface_dataset_download(folder_datasets: str, dataset_repository: str, dataset_config: str, token: str, clean: bool = False, folder_cache: str = '/tmp/huggingface_cache', clean_cache: bool = False, path_in_repo: Optional[str] = None) -> None

   Download a dataset folder from HuggingFace.

   :param folder_datasets: Local parent directory.  The dataset is placed at
                           ``os.path.join(folder_datasets, dataset_config)``.
   :param dataset_config: Name of the local subfolder to create under
                          ``folder_datasets``.
   :param path_in_repo: Path inside the HF repository that contains the dataset
                        files.  When None (default) it is the same as ``dataset_config``.
                        Pass an explicit value when the HF-side path differs from the local
                        folder name (e.g., ``path_in_repo="datasets/generated_breeds_baseline"``).
   :param clean: If True, the local folder is deleted before downloading.


.. py:class:: ResultTemplate(/, **data: Any)

   Bases: :py:obj:`pydantic.BaseModel`



   .. py:attribute:: recompute_if_exists
      :type:  bool
      :value: False


   .. py:attribute:: save_outputs
      :type:  bool
      :value: True


   .. py:attribute:: upload_if_recomputed
      :type:  bool
      :value: False


   .. py:attribute:: base_folder
      :type:  str
      :value: 'assets'


   .. py:attribute:: remote_repository_name
      :type:  str
      :value: 'LeonardoBenitez/VisionUnlearningEvaluationTestbeds'


   .. py:method:: _serialize_parameters() -> str
      :abstractmethod:


   .. py:method:: _get_data_path_remote() -> str


   .. py:method:: _get_data_path_local() -> str


   .. py:method:: _fig_to_bytes(fig: matplotlib.figure.Figure) -> bytes
      :classmethod:


   .. py:method:: _compute_from_scratch() -> dict | list
      :abstractmethod:


   .. py:method:: compute() -> dict


.. py:class:: ResultTemplateMetricMetricAlignment(/, **data: Any)

   Bases: :py:obj:`ResultTemplate`


   Measures how strongly two *MetricInterferencePerEntity* metrics are correlated.

   **Arguments:** `m`, `t`, `u`, `m_e1`, `m_e2`.

   **Result:** Pearson p-value, Spearman p-value, Pearson correlation, scatter plot.

   **Interpretation:** quantitative; the higher the correlation, the lower the need to
   calculate both metrics for this specific choice of `m`, `t`, and `u`.

   **Extended use:**
   Passing ``interference_entity_1="Forget clip diff"`` and
   ``interference_entity_2="Retain average clip diff"`` produces a forget/retain
   tradeoff scatter. The class method :meth:`plot_multi_method` overlays results
   for several methods on a single set of axes, enabling visual comparison of method
   operating regions (e.g. equalization verification and Pareto-style analysis).

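   As a rough illustration of the correlations reported here, the sketch below computes
   Pearson's r and Spearman's rho for two made-up per-entity metric vectors using only the
   standard library (``statistics.correlation`` requires Python 3.10 or later). The data and
   the ``ranks`` helper are hypothetical, ties are not handled, and p-values are omitted;
   this is not the class's implementation.

   .. code-block:: python

      from statistics import correlation  # Pearson's r
      from typing import List, Sequence


      def ranks(values: Sequence[float]) -> List[int]:
          """Return 1-based ranks; assumes no tied values (illustrative simplification)."""
          order = sorted(range(len(values)), key=lambda i: values[i])
          result = [0] * len(values)
          for rank, idx in enumerate(order, start=1):
              result[idx] = rank
          return result


      # Hypothetical values of two interference metrics, one entry per entity.
      metric_a = [0.10, 0.25, 0.40, 0.55, 0.90]
      metric_b = [1.0, 2.1, 2.9, 4.2, 9.0]

      pearson_r = correlation(metric_a, metric_b)
      # Spearman's rho is Pearson's r computed on the ranks.
      spearman_rho = correlation(ranks(metric_a), ranks(metric_b))

   Because ``metric_b`` increases monotonically with ``metric_a`` but not linearly, the rank
   correlation is perfect while Pearson's r is slightly lower. For p-values, a library such
   as SciPy (``scipy.stats.pearsonr`` and ``scipy.stats.spearmanr``) is a common choice.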

   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: unlearning_algorithm
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm


   .. py:attribute:: interference_entity_1
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_me


   .. py:attribute:: interference_entity_2
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_me


   .. py:attribute:: significance_threshold
      :type:  float
      :value: 0.05


   .. py:method:: _serialize_parameters() -> str


   .. py:method:: plot(data: dict, figsize: Tuple[int, int] = (6, 5), return_fig: bool = False, annotate_top_n: int = 5) -> Optional[Tuple[matplotlib.figure.Figure, matplotlib.pyplot.Axes]]
      :classmethod:


      Single-method scatter with regression line.

      Top-N outliers (by absolute residual from the regression) are labelled
      with the entity name.


   .. py:method:: plot_multi_method(method_data: Dict[str, dict], figsize: Tuple[int, int] = (7, 6), return_fig: bool = False, show_means: bool = True, annotate_top_n: int = 3) -> Optional[Tuple[matplotlib.figure.Figure, matplotlib.pyplot.Axes]]
      :classmethod:


      Overlay scatter for multiple methods on one plot.

      Useful for visualising method operating regions (e.g. equalization
      verification, Pareto-style analysis).

      :param method_data: Mapping from method name to the dict returned by :meth:`compute`.
      :param show_means: If *True*, draw a diamond marker at the per-method centroid.
      :param annotate_top_n: Number of per-method outliers (farthest from centroid) to annotate.


   .. py:method:: _compute_from_scratch() -> dict


.. py:class:: ResultTemplateMetricSimilarityAlignment(/, **data: Any)

   Bases: :py:obj:`ResultTemplate`


   To what degree similar *entities* interfere more with each other.

   Formalized in `ap:prediction`, which also proposes its natural expansion to a
   multivariable and non-linear predictive regression.

   **Arguments:** `m`, `t`, `u`, `m_p`, `s`.
   **Result:** Pearson p-value, Spearman p-value, Pearson correlation, scatter plot.
   **Interpretation:** quantitative; if the correlation is high, interference between two
   *entities* can be approximated by their *similarity* (which is cheaper to compute for any
   new *entity*). Equivalently, the amount of "transmission wires" can be summarized
   by this single *similarity* function.


   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: unlearning_algorithm
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm


   .. py:attribute:: interference_pair
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_mp


   .. py:attribute:: similarity_metric
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_s


   .. py:attribute:: significance_threshold
      :type:  float
      :value: 0.05


   .. py:method:: _serialize_parameters() -> str


   .. py:method:: plot(data: dict, figsize: Tuple[int, int] = (6, 5), return_fig: bool = False) -> Optional[Tuple[matplotlib.figure.Figure, matplotlib.pyplot.Axes]]
      :classmethod:


   .. py:method:: _compute_from_scratch(exclude_diagonal: bool = True) -> dict
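
As a rough sketch of the alignment measured by ``ResultTemplateMetricSimilarityAlignment``, the snippet below pairs off-diagonal entries of a similarity matrix with the matching entries of an interference matrix and computes their Pearson correlation. The helper names and the toy matrices are assumptions made for illustration; skipping ``i == j`` mirrors ``exclude_diagonal=True``, and p-values are not reproduced here.

```python
import math
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def off_diagonal_pairs(similarity: Matrix, interference: Matrix) -> Tuple[List[float], List[float]]:
    """Flatten two square matrices into paired lists, skipping self-pairs."""
    xs, ys = [], []
    for i, row in enumerate(similarity):
        for j, s_ij in enumerate(row):
            if i != j:  # analogous to exclude_diagonal=True
                xs.append(s_ij)
                ys.append(interference[i][j])
    return xs, ys


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy 3-entity example (made-up numbers).
similarity = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.2], [0.1, 0.2, 1.0]]
interference = [[5.0, 0.8, 0.1], [0.7, 5.0, 0.3], [0.2, 0.2, 5.0]]
xs, ys = off_diagonal_pairs(similarity, interference)
r = pearson(xs, ys)
```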


.. py:class:: ResultTemplateMetricSimilarityAlignmentMulti(/, **data: Any)

   Bases: :py:obj:`ResultTemplate`


       Multi-input, single-output regression generalization of ResultTemplateMetricSimilarityAlignment (see also Appendix E, adapted from the multi-output setting).
       It also improves the interpretability and feature engineering aspects.

       ---

       We consider a fixed *model* \(m\), *task* \(t\), and *unlearning method* \(u\), which are omitted for brevity.

       The objective is to quantify whether interference between *entities* is aligned with their *similarity*, i.e., to what degree similar *entities* interfere more with each other.

       For every ordered pair of distinct *entities* \(e_i, e_j \in t\) with \(i \neq j\), we observe several *SimilarityBetweenEntities* measures, indexed by superscripts \(\ell = 1, 2, \dots, |S|\), and a single *MetricInterferencePerEntityPair* target \(m_p(e_i,e_j)\).

       Each ordered pair \((e_i, e_j)\) is therefore treated as one data point with feature vector

       $$
       \mathbf{X}_{ij}
       =
       \big(
       s^{(1)}(e_i, e_j),
       \dots,
       s^{(|S|)}(e_i, e_j)
       \big)
       $$

       and scalar target

       $$
       Y_{ij}
       =
       m_p(e_i, e_j).
       $$

       The resulting dataset is

       $$
       \mathcal{D}
       =
       \{
       (\mathbf{X}_{ij}, Y_{ij})
       \mid
       e_i, e_j \in t,\ i \neq j
       \}.
       $$

       From this dataset, a regression model can be estimated using standard regression procedures with appropriate validation.

       In the linear case,

       $$
       Y_{ij}
       =
       \beta_0
       +
       \sum_{\ell=1}^{|S|}
       \beta_{\ell}
       X^{(\ell)}_{ij}
       +
       \varepsilon_{ij}.
       $$

       Given a specific *entity* \(e_i\) whose removal is considered, similarities

       $$
       X^{(\ell)}_{ij}
       =
       s^{(\ell)}(e_i, e_j)
       $$

       can be computed for all remaining *entities* \(e_j \in t\). The fitted model then yields predictions

       $$
       \hat{Y}_{ij}
       =
       f(\mathbf{X}_{ij}),
       $$

       which approximate the expected interference on each receiver *entity*.


       Furthermore, the concept of *similarity* may also encode several forms of practical data engineering. For example, one may define:

       - a distinct *similarity* function for each *attribute*, or
       - a *similarity* function based only on the attributes of the emitter *entity*.


   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: unlearning_algorithm
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm


   .. py:attribute:: interference_pair
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_mp


   .. py:attribute:: similarity_metric_list
      :type:  List[vision_unlearning.benchmarks.I_care.configuration.type_s]


   .. py:attribute:: significance_threshold
      :type:  float
      :value: 0.05


   .. py:attribute:: include_attribute_diff_similarity
      :type:  bool
      :value: True


   .. py:attribute:: include_attribute_value_similarity
      :type:  bool
      :value: True


   .. py:attribute:: regression_algorithm
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_regression_algorithm
      :value: 'linear_regression'


   .. py:attribute:: random_state
      :type:  int
      :value: 42


   .. py:attribute:: test_size
      :type:  float
      :value: 0.3


   .. py:method:: _serialize_parameters() -> str


   .. py:method:: _get_partial_path_local()


   .. py:method:: plot(data: dict, figsize: Tuple[int, int] = (6, 15), return_fig: bool = False) -> Optional[Tuple[matplotlib.figure.Figure, matplotlib.pyplot.Axes]]
      :classmethod:


   .. py:method:: _compute_from_scratch(exclude_diagonal: bool = True, entity_col: str = 'name') -> dict
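
The linear case of the regression described for ``ResultTemplateMetricSimilarityAlignmentMulti`` can be sketched without any third-party dependency. The snippet builds one row per ordered pair of distinct entities, fits the coefficients by solving the normal equations, and predicts interference for a new pair. All names (``build_dataset``, ``fit_linear``, ``predict``, the toy similarity functions) are hypothetical, and the class itself may use a different estimator, validation split, and feature set.

```python
from itertools import permutations
from typing import Callable, List, Sequence, Tuple

PairFn = Callable[[str, str], float]


def build_dataset(entities: Sequence[str], similarities: Sequence[PairFn],
                  interference: PairFn) -> Tuple[List[List[float]], List[float]]:
    """One data point (X_ij, Y_ij) per ordered pair (e_i, e_j) with i != j."""
    features, targets = [], []
    for e_i, e_j in permutations(entities, 2):
        features.append([s(e_i, e_j) for s in similarities])
        targets.append(interference(e_i, e_j))
    return features, targets


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_linear(features: List[List[float]], targets: List[float]) -> List[float]:
    """Least-squares coefficients [beta_0, beta_1, ..., beta_|S|]."""
    design = [[1.0] + list(row) for row in features]
    k = len(design[0])
    xtx = [[sum(r[p] * r[q] for r in design) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * y for r, y in zip(design, targets)) for p in range(k)]
    return solve(xtx, xty)


def predict(beta: Sequence[float], row: Sequence[float]) -> float:
    """Predicted interference for one feature vector."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


# Synthetic check: interference is generated from known coefficients.
coords = {"a": (0.0, 0.0), "b": (1.0, 2.0), "c": (3.0, 1.0), "d": (6.0, 5.0)}


def s1(p: str, q: str) -> float:
    return -abs(coords[p][0] - coords[q][0])


def s2(p: str, q: str) -> float:
    return -abs(coords[p][1] - coords[q][1])


def m_p(p: str, q: str) -> float:
    return 0.5 + 2.0 * s1(p, q) - 1.0 * s2(p, q)


X, Y = build_dataset(list(coords), [s1, s2], m_p)
beta = fit_linear(X, Y)
```

On this noise-free toy data the fit recovers the generating coefficients; on real interference data a held-out split (compare ``test_size``) would be needed to judge predictive quality.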


.. py:class:: ResultTemplateSignificantRelationshipNumerical(/, **data: Any)

   Bases: :py:obj:`ResultTemplate`


   Measures whether two numerical attributes are significantly correlated.

   Formalized in `ap:rt_relationship`.

   **Arguments:** `m`, `t`, `u`, `m_e`, `a`.
   **Result:** Pearson p-value, Spearman p-value, Pearson correlation, scatter plot.
   **Interpretation:** qualitative; the researcher should decide if it is ethical or
   desirable that this *attribute* propagates interferences.

   **Pearson test**
       Use when you want to measure a **linear** relationship.
       **Assumptions:**
         * Both variables are **continuous**
         * Relationship is **linear**
         * **Bivariate normality** (both jointly Gaussian)
         * **Homoscedasticity** (constant variance)
         * **No strong outliers** (very sensitive)
       **Detects:** linear correlation only
       **Fails when:** relationship is monotonic but non-linear, or heavy outliers exist

   **Spearman test**
       Use when you want to measure a **monotonic** relationship (not necessarily linear) or data is non-Gaussian.
       **Assumptions:**
         * Variables are at least **ordinal**
         * Relationship is **monotonic** (increasing or decreasing)
         * **No distributional assumptions**
         * **Robust to outliers**
       **Detects:** any monotonic trend (linear or curved)
       **Fails when:** relationship is non-monotonic (e.g., U-shaped)


   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: unlearning_algorithm
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm


   .. py:attribute:: interference_entity
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_me


   .. py:attribute:: attribute
      :type:  str


   .. py:attribute:: significance_threshold
      :type:  float
      :value: 0.05


   .. py:method:: _get_data_path_remote() -> str


   .. py:method:: plot(data: dict, figsize: Tuple[int, int] = (6, 5), return_fig: bool = False) -> Optional[Tuple[matplotlib.figure.Figure, matplotlib.pyplot.Axes]]
      :classmethod:


   .. py:method:: _compute_from_scratch() -> dict
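
The difference between the two tests listed for ``ResultTemplateSignificantRelationshipNumerical`` can be illustrated with a monotonic but non-linear relationship. In this sketch Spearman is computed as Pearson on average ranks; the function names are illustrative and p-values are omitted, so it only shows the coefficients, not the class's full output.

```python
import math
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Exponential growth: perfectly monotonic, clearly non-linear.
attribute = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
metric = [math.exp(v) for v in attribute]
r_pearson = pearson(attribute, metric)
r_spearman = spearman(attribute, metric)
```

Here the Spearman coefficient is 1 while the Pearson coefficient is noticeably lower, which is the "monotonic but non-linear" failure mode named above.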


.. py:class:: ResultTemplateSignificantRelationshipCategorical(/, **data: Any)

   Bases: :py:obj:`ResultTemplate`


   Statistical significance of the differences in average `MetricInterferencePerEntity`
   across all *entities*, when grouped by each value of a categorical *attribute*.

   Formalized in `ap:rt_relationship`.

   **Arguments:** `m`, `t`, `u`, `m_e`, `a`, optional `filterAttributeValue`.
   **Result:** ANOVA p-value, Kruskal-Wallis p-value, average value of `m_e` grouped
   by each value of `a`, grouped boxplot.
   **Interpretation:** qualitative; similar to
   *SignificantRelationshipNumerical*. The optional argument
   *filterAttributeValue* restricts which emitter *entities* are included, allowing
   the analysis of interference flow distribution, such as whether politicians cause
   more interference to other politicians than artists cause to other artists.

   **ANOVA**
       Use when you want to test if **group means differ** across **3+ independent groups** under parametric assumptions.
       **Assumptions:**
         * Dependent variable is **continuous**
         * Groups are **independent**
         * **Normality** within each group
         * **Homoscedasticity** (equal variances)
         * No strong **outliers**
       **Hypothesis:**
         * H₀: all group means are equal
         * H₁: at least one mean differs
       **Detects:** differences in **means**
       **Fails when:** heavy skew, unequal variances, small n with non-Gaussian data

   **Kruskal-Wallis**
       Use when you want to test if **group distributions differ** without parametric assumptions.
       **Assumptions:**
         * Dependent variable is **ordinal or continuous**
         * Groups are **independent**
         * **Same shaped distributions** (only medians should differ for clean interpretation)
         * No normality or equal-variance requirement
       **Hypothesis:**
         * H₀: all group distributions are equal
         * H₁: at least one group differs
       **Detects:** differences in **medians / distributions**
       **Fails when:** distributions differ in shape (then result is ambiguous)


   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: unlearning_algorithm
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm


   .. py:attribute:: interference_entity
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_me


   .. py:attribute:: attribute
      :type:  str


   .. py:attribute:: attribute_value
      :type:  Optional[str | int]
      :value: None


   .. py:attribute:: min_samples_per_category
      :type:  int
      :value: 5


   .. py:attribute:: significance_threshold
      :type:  float
      :value: 0.05


   .. py:method:: _get_data_path_remote() -> str


   .. py:method:: plot(data: dict, extra_title: str = '', figsize: Tuple[int, int] = (6, 5), return_fig: bool = False) -> Optional[Tuple[matplotlib.figure.Figure, matplotlib.pyplot.Axes]]
      :classmethod:


   .. py:method:: _compute_from_scratch() -> dict
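
The two test statistics named for ``ResultTemplateSignificantRelationshipCategorical`` can be sketched as follows. The grouping step drops categories with too few samples, in the spirit of ``min_samples_per_category``. Function names are hypothetical, the Kruskal-Wallis statistic is shown without tie correction, and p-values (which need the F and chi-squared distributions) are left out.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def group_by_category(values: Sequence[float], categories: Sequence[Hashable],
                      min_samples: int = 5) -> Dict[Hashable, List[float]]:
    """Group metric values by attribute value, dropping small groups."""
    groups: Dict[Hashable, List[float]] = defaultdict(list)
    for value, category in zip(values, categories):
        groups[category].append(value)
    return {c: vs for c, vs in groups.items() if len(vs) >= min_samples}


def anova_f(groups: Dict[Hashable, List[float]]) -> float:
    """One-way ANOVA F statistic (between-group / within-group variance)."""
    all_values = [v for vs in groups.values() for v in vs]
    grand_mean = sum(all_values) / len(all_values)
    means = {c: sum(vs) / len(vs) for c, vs in groups.items()}
    ss_between = sum(len(vs) * (means[c] - grand_mean) ** 2 for c, vs in groups.items())
    ss_within = sum((v - means[c]) ** 2 for c, vs in groups.items() for v in vs)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def kruskal_h(groups: Dict[Hashable, List[float]]) -> float:
    """Kruskal-Wallis H statistic, assuming all values are distinct (no ties)."""
    ordered = sorted(v for vs in groups.values() for v in vs)
    rank = {v: i + 1 for i, v in enumerate(ordered)}
    n = len(ordered)
    total = sum(sum(rank[v] for v in vs) ** 2 / len(vs) for vs in groups.values())
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Made-up per-entity metric values with a hypothetical "profession" attribute.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0]
professions = ["artist"] * 3 + ["athlete"] * 3 + ["politician"] * 3 + ["other"]
groups = group_by_category(values, professions, min_samples=3)
f_stat = anova_f(groups)
h_stat = kruskal_h(groups)
```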


.. py:class:: ResultTemplateCountSignificantRelationship(/, **data: Any)

   Bases: :py:obj:`ResultTemplate`


   Number of significant relationships across all combinations of *attributes* and
   *MetricInterferencePerEntity*.

   **Arguments:** `m`, `t`, `u`, list of `m_e`, list of `a`.
   **Result:** integer, list of significances.
   **Interpretation:** quantitative; the lower the better. Since the attributes for
   which it is ethical to propagate interference are constant across all *models* and
   *methods*, a higher value directly implies a higher number of ethical violations,
   that is, a larger number of "transmission wires" in a given task effectively used
   by this *method* and *model*.


   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: unlearning_algorithm_list
      :type:  List[vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm]


   .. py:attribute:: interference_entity_list
      :type:  List[vision_unlearning.benchmarks.I_care.configuration.type_me]


   .. py:attribute:: attribute_list
      :type:  List[str]


   .. py:attribute:: top_n
      :type:  int
      :value: 10


   .. py:method:: _serialize_parameters() -> str


   .. py:method:: plot(data: dict, figsize: Tuple[int, int] = (6, 5), return_fig: bool = False) -> Optional[Tuple[matplotlib.figure.Figure, matplotlib.pyplot.Axes]]
      :classmethod:


   .. py:method:: _compute_from_scratch() -> dict
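
Counting significant relationships, as ``ResultTemplateCountSignificantRelationship`` does, reduces to thresholding a table of p-values. The sketch below assumes the p-values are already available, keyed by ``(method, metric, attribute)``, and that 0.05 is the cut-off; both the key layout and the threshold are assumptions, and the method, metric, and attribute names are placeholders.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str, str]  # (unlearning method, per-entity metric, attribute)


def count_significant(p_values: Dict[Key, float],
                      threshold: float = 0.05) -> Tuple[int, List[Key]]:
    """Return the number of significant combinations and the combinations themselves.

    NaN p-values (for example from a test that could not be run) compare
    as False and are therefore never counted.
    """
    significant = sorted(key for key, p in p_values.items() if p < threshold)
    return len(significant), significant


# Placeholder names and made-up p-values.
p_values = {
    ("method_a", "metric_x", "profession"): 0.001,
    ("method_a", "metric_x", "age"): 0.40,
    ("method_b", "metric_x", "profession"): 0.03,
    ("method_b", "metric_x", "age"): float("nan"),
}
count, which = count_significant(p_values)
```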


.. py:class:: ResultTemplateImplicitAssociationTest(/, **data: Any)

   Bases: :py:obj:`ResultTemplate`


   Measures how the strength of automatic associations `B` between two pairs of
   *entities* changes after unlearning.

   **Arguments:** `m`, `t`, `u`, `a_1`, `a_2`, `l`.
   **Result:** `|a| x |a|` real-valued tensor `ΔB`.
   **Interpretation:** qualitative; a human should decide whether it is ethical or
   desirable for the unlearning process to cause this change in implicit association
   between the chosen *attributes*.


   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: unlearning_algorithm
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm


   .. py:attribute:: attribute_1
      :type:  str


   .. py:attribute:: attribute_2
      :type:  str


   .. py:attribute:: latent_embedding
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_l


.. py:class:: ResultTemplateMinimumCutInterference(/, **data: Any)

   Bases: :py:obj:`ResultTemplate`


   Interprets a *task* as a directed weighted graph and computes the minimum cut separating two *entities*.
   By the max-flow min-cut theorem, the minimum cut is the set of influences with the smallest total weight whose removal eliminates every directed influence path from $e_1$ to $e_2$.
   Based on this, we conjecture that if we need to unlearn $e_1$ while minimizing harm to $e_2$, then the ideal intervention in the unlearning process is to increase the preservation of the emitter-side nodes. More intuitively, this intervention amounts to "blocking the interference path", as is done in electrical circuits to protect sensitive components (for example, ground partitioning and shielding traces).
   **Arguments:** $m$, $t$, $u$, $e_1$, $e_2$, $m_p$.
   **Result:** list of *entities* (corresponding to the emitter-side nodes).
   **Interpretation:** qualitative; small set of nodes through which most of the interference from $e_1$ propagates to $e_2$.


   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: unlearning_algorithm
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm


   .. py:attribute:: interference_pair
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_mp


   .. py:attribute:: entity_1
      :type:  str


   .. py:attribute:: entity_2
      :type:  str
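
The minimum cut used by ``ResultTemplateMinimumCutInterference`` can be illustrated with a small Edmonds-Karp max-flow routine that returns the emitter-side node set. This is a generic textbook sketch, not the class's implementation: the function name, the adjacency-dict layout, and the toy graph (edge weights standing in for pairwise interference) are all assumptions.

```python
import math
from collections import deque
from typing import Dict, Optional, Set, Tuple

Graph = Dict[str, Dict[str, float]]


def _reachable(residual: Graph, source: str) -> Dict[str, Optional[str]]:
    """Breadth-first search over edges with positive residual capacity."""
    parent: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, cap in residual[u].items():
            if cap > 1e-12 and v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def min_cut_source_side(capacity: Graph, source: str, sink: str) -> Tuple[float, Set[str]]:
    """Max-flow value and the set of nodes on the source side of a minimum cut."""
    residual: Graph = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0.0)  # reverse edges
    residual.setdefault(source, {})
    residual.setdefault(sink, {})

    max_flow = 0.0
    while True:
        parent = _reachable(residual, source)
        if sink not in parent:
            # No augmenting path left: reachable nodes form the source side.
            return max_flow, set(parent)
        bottleneck, v = math.inf, sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        max_flow += bottleneck


# Toy graph with hypothetical entities; weights play the role of m_p.
graph = {
    "e1": {"a": 3.0, "b": 2.0},
    "a": {"e2": 1.0, "b": 1.0},
    "b": {"e2": 4.0},
}
flow, emitter_side = min_cut_source_side(graph, "e1", "e2")
```

In this toy graph the cut separates ``{"e1", "a"}`` from the rest, so ``a`` would be the intermediate node to protect under the conjecture stated in the class description.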


.. py:class:: ResultTemplateUnlearningVisualSummary(/, **data: Any)

   Bases: :py:obj:`ResultTemplate`


   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: unlearning_algorithm
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm


.. py:class:: ResultTemplateInterferenceVisualSummary(/, **data: Any)

   Bases: :py:obj:`ResultTemplate`


   Compares generated images for 9 identities: the target, the 4 worst (excluding the target), and the 4 best.


   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: unlearning_algorithm
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm


   .. py:attribute:: interference_pair
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_mp


   .. py:attribute:: entity
      :type:  Optional[str]
      :value: None


   .. py:attribute:: entity_index
      :type:  Optional[int]
      :value: None


   .. py:attribute:: seed
      :type:  int
      :value: 42


   .. py:attribute:: images_max_dim
      :type:  int
      :value: 124


   .. py:method:: _resolve_entity()

      Ensures that both ``entity`` and ``entity_index`` are filled.
      Modifies the instance in place.
      On return, both are set and consistent with each other.


   .. py:method:: _serialize_parameters() -> str


   .. py:method:: plot(data: dict, figsize: Optional[Tuple[int, int]] = (18, 4), return_fig: bool = False) -> Optional[Tuple[matplotlib.figure.Figure, matplotlib.pyplot.Axes]]
      :classmethod:


   .. py:method:: _compute_from_scratch()
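
The selection logic behind ``ResultTemplateInterferenceVisualSummary`` (target, 4 worst, 4 best) and the name/index resolution of ``_resolve_entity`` can be sketched as below. This is an assumption-laden illustration: the function names are hypothetical, "worst" is taken to mean the highest interference value in the target's row, and the real class may rank differently.

```python
from typing import List, Optional, Sequence, Tuple


def resolve_entity(entities: Sequence[str], entity: Optional[str] = None,
                   entity_index: Optional[int] = None) -> Tuple[str, int]:
    """Fill in whichever of (entity, entity_index) is missing and check consistency."""
    if entity is None and entity_index is None:
        raise ValueError("provide entity or entity_index")
    if entity is None:
        entity = entities[entity_index]
    if entity_index is None:
        entity_index = list(entities).index(entity)
    if entities[entity_index] != entity:
        raise ValueError("entity and entity_index disagree")
    return entity, entity_index


def select_panel(entities: Sequence[str], interference_row: Sequence[float],
                 target_index: int, k: int = 4) -> List[str]:
    """Target first, then the k most affected and the k least affected entities."""
    others = [i for i in range(len(entities)) if i != target_index]
    ranked = sorted(others, key=lambda i: interference_row[i], reverse=True)
    worst = ranked[:k]
    best = ranked[-k:][::-1]  # least affected first
    return [entities[i] for i in [target_index] + worst + best]


# Ten placeholder entities; interference grows with the index.
entities = [f"e{i}" for i in range(10)]
row = [float(i) for i in range(10)]
name, index = resolve_entity(entities, entity="e3")
panel = select_panel(entities, row, index)
```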


.. py:class:: ResultTemplateMatrix(/, **data: Any)

   Bases: :py:obj:`ResultTemplate`


   .. py:attribute:: metric_key_name
      :type:  str


   .. py:method:: plot_make_title(data: dict) -> str
      :classmethod:

      :abstractmethod:


   .. py:method:: plot(data: dict, figsize: Optional[Tuple[float, float]] = None, cmap: str = 'viridis', title: str = '', xlabel: str = 'Receiver entity', ylabel: str = 'Emitter entity', return_fig: bool = False) -> Optional[Tuple[matplotlib.figure.Figure, matplotlib.pyplot.Axes]]
      :classmethod:


.. py:class:: ResultTemplateInterferenceMatrix(/, **data: Any)

   Bases: :py:obj:`ResultTemplateMatrix`


   *MetricInterferencePerEntityPair* between each possible combination of two *entities*
   within a *task*.

   **Arguments:** `m`, `t`, `u`, `m_p`.
   **Result:** `|t| x |t|` real-valued tensor.
   **Interpretation:** qualitative; visual patterns may be spotted, especially when
   rearranging indices in a meaningful manner (for example, grouping professions
   together). Further quantitative values may be derived, such as the average value or
   the ratio between the diagonal-average value and the non-diagonal-average value.


   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: unlearning_algorithm
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm


   .. py:attribute:: interference_pair
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_mp


   .. py:attribute:: metric_key_name
      :type:  str
      :value: 'interference_pair'


   .. py:method:: _serialize_parameters() -> str


   .. py:method:: plot_make_title(data: dict) -> str
      :classmethod:


   .. py:method:: _compute_from_scratch()
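
One of the derived quantities mentioned for ``ResultTemplateInterferenceMatrix``, the ratio between the diagonal average and the non-diagonal average, can be computed as in this small sketch. The function name and the toy matrix are illustrative only.

```python
from typing import Sequence


def diagonal_ratio(matrix: Sequence[Sequence[float]]) -> float:
    """Mean of the diagonal entries divided by the mean of the off-diagonal entries."""
    n = len(matrix)
    diagonal = [matrix[i][i] for i in range(n)]
    off_diagonal = [matrix[i][j] for i in range(n) for j in range(n) if i != j]
    return (sum(diagonal) / len(diagonal)) / (sum(off_diagonal) / len(off_diagonal))


# Rows are emitter entities, columns are receiver entities (made-up values).
interference = [
    [4.0, 1.0, 1.0],
    [1.0, 4.0, 1.0],
    [1.0, 1.0, 4.0],
]
ratio = diagonal_ratio(interference)  # 4.0
```

A ratio well above 1 indicates that, on average, an entity is affected far more by its own unlearning than by the unlearning of other entities.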


.. py:class:: ResultTemplateSimilarityMatrix(/, **data: Any)

   Bases: :py:obj:`ResultTemplateMatrix`


   *Similarities* between each possible combination of two *entities* within a *task*.

   * **Arguments**: $m, t, s$
   * **Result**: $|t| \times |t|$ real-valued tensor
   * **Interpretation**: qualitative; visual patterns may be spotted, similarly to *InterferenceMatrix*.


   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'scenes'


   .. py:attribute:: similarity_metric
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_s
      :value: 'clip'


   .. py:attribute:: metric_key_name
      :type:  str
      :value: 'similarity_metric'


   .. py:method:: _serialize_parameters() -> str


   .. py:method:: _get_partial_path_local()


   .. py:method:: plot_make_title(data: dict) -> str
      :classmethod:


   .. py:method:: _compute_from_scratch() -> dict


.. py:class:: ResultTemplateMethodComparisonByMetricEntity(/, **data: Any)

   Bases: :py:obj:`ResultTemplate`


   Compares the distribution of one *MetricInterferencePerEntity* across multiple
   *unlearning methods*.

   * **Arguments**: `m`, `t`, `m_e`, list of `u`
   * **Result**: per-method mean, median, std, n, values; box plot
   * **Interpretation**: whether lower or higher is better depends on the direction of `m_e`.
     Use to rank methods by a single interference-per-entity metric.


   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: interference_entity
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_me


   .. py:attribute:: unlearning_algorithm_list
      :type:  List[vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm]


   .. py:method:: _serialize_parameters() -> str


   .. py:method:: plot(data: dict, figsize: Tuple[int, int] = (6, 5), return_fig: bool = False) -> Optional[Tuple[matplotlib.figure.Figure, matplotlib.pyplot.Axes]]
      :classmethod:


   .. py:method:: _compute_from_scratch() -> dict
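
The per-method summary reported by ``ResultTemplateMethodComparisonByMetricEntity`` (mean, median, std, n) can be approximated with the standard library. In this sketch the function name and method names are placeholders, NaN values are dropped before aggregation, and the sample standard deviation is used; whether the class uses the sample or population variant is not stated in the source.

```python
import math
import statistics
from typing import Dict, Sequence


def summarize_methods(values_by_method: Dict[str, Sequence[float]]) -> Dict[str, Dict[str, float]]:
    """Per-method descriptive statistics of one per-entity metric."""
    summary = {}
    for method, values in values_by_method.items():
        clean = [v for v in values if not math.isnan(v)]
        summary[method] = {
            "mean": statistics.fmean(clean),
            "median": statistics.median(clean),
            # Sample standard deviation (assumption); undefined for n < 2.
            "std": statistics.stdev(clean) if len(clean) > 1 else float("nan"),
            "n": len(clean),
        }
    return summary


# Placeholder method names and made-up metric values.
summary = summarize_methods({
    "method_a": [1.0, 2.0, 3.0, float("nan")],
    "method_b": [2.0, 4.0, 6.0, 8.0],
})
```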


.. py:class:: ResultTemplateEmbeddingUnlearningProfile(/, **data: Any)

   Bases: :py:obj:`ResultTemplate`


   Embedding-space profile of one unlearning event (task, method, entity).

   For the specified *forgotten entity*, shows how all 100 entity embeddings
   shift between the baseline model (LoRA-OFF) and the model that forgot this
   entity (LoRA-ON).  Quantifies whether the forgetting was *targeted* or
   *diffuse* in embedding space.

   **Arguments**: model, task, unlearning_algorithm, entity.

   **Result**:
   - PCA scatter (2-D) of all 100 entity mean embeddings.  Baseline positions
     shown as open circles; unlearned positions as filled dots.  The forgotten
     entity is highlighted with a star; an arrow marks its displacement.
     Points are coloured by the entity's self-interference (clip_diff) so that
     collateral damage is immediately visible.
   - Numeric summary: self-displacement magnitude (L2 norm), mean retained
     displacement, ``embedding_specificity_ratio`` (*directional* specificity,
     cosine-distance of self-displacement vs mean retained-entity displacement;
     same metric stored in the InterferencePerEntity (Me) for this task).

   **Metric note (directional vs. magnitude)**:
   The ``embedding_specificity_ratio`` uses cosine distance and therefore captures
   the *direction* of embedding change, not its magnitude.  A ratio > 1 means the
   forgotten entity's embedding shifts in a more novel direction than the average
   retained entity — this is *directional specificity*.  This is distinct from an
   L2-based magnitude specificity (which would ask whether the shift is larger in
   absolute terms).  The displacement bars on the right plot use L2 norm; the
   specificity ratio shown in the title uses cosine distance.

   **Provenance field**: each result includes ``ratio_source`` ("ipe" when the ratio
   was read from the InterferencePerEntity (Me) for this task, "inline" when it was
   computed from the embedding files directly because the IPE column was absent).
   "ipe" is the canonical value; "inline" is a transitional fallback.

   **Interpretation**:
   - Specificity ratio >> 1 and large self-displacement → targeted forgetting.
   - Specificity ratio ~ 1 or low self-displacement → the method caused
     broad embedding drift without isolating the forgotten entity.
   - Compare with the image-level ``clip_diff`` in the scatter colours to
     detect the concealment pattern (embedding moves, image stays similar).

   **Relationship to other RTs**:
   - ``embedding_specificity_ratio`` belongs to ``type_me`` / ``domain_me``,
     so it can be passed to ``MetricMetricAlignment`` and
     ``MethodComparisonByMetricEntity`` like any other per-entity metric.
   - For cross-entity summaries, see ``ResultTemplateEmbeddingForgettingEfficiency``.
   - The "pinpoint-ness" concept aligns with the Holistic Unlearning Benchmark
     (ICCV 2025) definition of targeted forgetting.


   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: unlearning_algorithm
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm


   .. py:attribute:: entity
      :type:  str


   .. py:attribute:: n_pca_components
      :type:  int
      :value: 2


   .. py:method:: _serialize_parameters() -> str


   .. py:method:: _resolve_hf_entity() -> str

      Return the HF-compatible entity name used in embedding file names.


   .. py:method:: _get_baseline_embedding_path() -> str


   .. py:method:: _get_entity_embedding_path() -> str


   .. py:method:: _mean_embeddings(raw: dict) -> Dict[str, np.ndarray]
      :staticmethod:


      Group embedding records by prompted_entity and compute mean per entity.


   .. py:method:: _cosine_distance(a: numpy.ndarray, b: numpy.ndarray) -> float
      :staticmethod:


   .. py:method:: plot(data: dict, figsize: Tuple[int, int] = (12, 5), return_fig: bool = False) -> Optional[Tuple[matplotlib.figure.Figure, matplotlib.pyplot.Axes]]
      :classmethod:


   .. py:method:: _compute_from_scratch() -> dict
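
The ``embedding_specificity_ratio`` described for ``ResultTemplateEmbeddingUnlearningProfile`` (cosine distance of the forgotten entity's displacement divided by the mean cosine distance of the retained entities) can be sketched as follows. The record layout ``(prompted_entity, vector)`` and all function names are assumptions for illustration; the stored metric may be computed with additional safeguards.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple

Vector = List[float]


def mean_embeddings(records: Iterable[Tuple[str, Sequence[float]]]) -> Dict[str, Vector]:
    """Group embedding records by prompted entity and average them per entity."""
    grouped: Dict[str, List[Sequence[float]]] = defaultdict(list)
    for entity, vector in records:
        grouped[entity].append(vector)
    return {e: [sum(col) / len(vs) for col in zip(*vs)] for e, vs in grouped.items()}


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; depends on direction only, not on magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def specificity_ratio(baseline: Dict[str, Vector], unlearned: Dict[str, Vector],
                      forgotten: str) -> float:
    """Self cosine displacement divided by mean retained cosine displacement."""
    self_shift = cosine_distance(baseline[forgotten], unlearned[forgotten])
    retained = [cosine_distance(baseline[e], unlearned[e])
                for e in baseline if e != forgotten]
    return self_shift / (sum(retained) / len(retained))


# Toy 2-D embeddings: the target rotates by 90 degrees, the others barely move.
baseline = {"target": [1.0, 0.0], "r1": [0.0, 1.0], "r2": [1.0, 1.0]}
unlearned = {"target": [0.0, 1.0], "r1": [0.6, 0.8], "r2": [1.0, 1.0]}
ratio = specificity_ratio(baseline, unlearned, "target")
```

A ratio far above 1, as in this toy case, corresponds to the "targeted forgetting" reading in the interpretation notes; note that the ratio is undefined when no retained entity moves at all.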


.. py:class:: ResultTemplateEmbeddingForgettingEfficiency(/, **data: Any)

   Bases: :py:obj:`ResultTemplate`


   Embedding-space forgetting efficiency distribution for one (task, method).

   Reads ``embedding_specificity_ratio`` (cosine-distance self-displacement vs.
   mean retained-entity displacement) from the InterferencePerEntity (Me) for
   this task.  This RT aggregates that pre-computed metric across all entities
   in the task and correlates it with the image-level forgetting signal
   (``clip_diff``).

   **Arguments**: model, task, unlearning_algorithm.

   **Prerequisites**: The InterferencePerEntity (Me) must exist and must contain
   the ``embedding_specificity_ratio`` column for the requested method.
   Run "4. Compute interference per entity.py" first if it is missing.

   **Result**:
   - Bar chart of ``embedding_specificity_ratio`` per entity, sorted
     descending; dashed line at ratio = 1 (no specificity).
   - Scatter of ``embedding_specificity_ratio`` vs. self-``clip_diff`` per
     entity, with Spearman correlation and a permutation test (n_permutations
     resamples; parametric t-tests are invalid here because embedding vectors
     from the same model are correlated by architecture and data).
   - Numeric summary: ``n_total`` (all entities in task), ``n_valid`` (entities
     with non-NaN ratio — typically those for which interference_per_pair files
     were available), mean/std of ratio, fraction of entities with ratio > 1
     *among valid entities*, Spearman r between ratio and self-clip_diff,
     permutation p-value.
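
   The Spearman correlation and permutation test described above could be sketched
   as follows. This is an illustration under assumptions, not the code this RT runs:
   ``spearman_permutation_test`` is a hypothetical helper, the two-sided test and the
   add-one correction are choices made for the sketch, and the input values are made up.

   .. code-block:: python

      import numpy as np
      from scipy.stats import spearmanr


      def spearman_permutation_test(x, y, n_permutations=10000, seed=0):
          """Spearman r plus a two-sided permutation p-value (illustrative helper)."""
          rng = np.random.default_rng(seed)
          x = np.asarray(x, dtype=float)
          y = np.asarray(y, dtype=float)
          observed = spearmanr(x, y)[0]
          # Shuffling y breaks its pairing with x, which samples the null distribution of r.
          null = np.array([spearmanr(x, rng.permutation(y))[0]
                           for _ in range(n_permutations)])
          # Add-one correction keeps the estimated p-value strictly positive.
          p_value = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_permutations + 1)
          return float(observed), float(p_value)


      # Made-up per-entity values, for illustration only.
      ratio = [2.1, 1.4, 0.8, 3.0, 1.1, 0.6, 1.9, 2.5]
      clip_diff = [0.30, 0.22, 0.10, 0.41, 0.18, 0.05, 0.27, 0.35]
      r, p = spearman_permutation_test(ratio, clip_diff, n_permutations=2000)

   The permutation approach makes no distributional assumption about the two
   variables, which is the stated reason for preferring it to a parametric t-test here.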

   **Metric note (directional vs. magnitude)**:
   ``embedding_specificity_ratio`` uses cosine distance (*directional* specificity).
   A ratio > 1 means the forgotten entity shifts in a more novel direction than the
   average retained entity.  This is distinct from an L2-based magnitude ratio.
   Both numerator (self cosine distance) and denominator (mean retained cosine
   distance) are stored separately so a reader can distinguish "ratio is low because
   target barely moves" from "ratio is low because retained entities move MORE".
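
   A small sketch of how such a ratio can be formed from its numerator and
   denominator. It assumes that "displacement" means the cosine distance between an
   entity's mean embedding before and after unlearning; the helper names and the toy
   vectors are hypothetical.

   .. code-block:: python

      import numpy as np


      def cosine_distance(a, b):
          a = np.asarray(a, dtype=float)
          b = np.asarray(b, dtype=float)
          return 1.0 - float(a @ b) / float(np.linalg.norm(a) * np.linalg.norm(b))


      def specificity_ratio(before, after, target):
          """before/after map entity -> mean embedding (assumed layout)."""
          displacement = {e: cosine_distance(before[e], after[e]) for e in before}
          numerator = displacement[target]  # self cosine distance
          # Mean cosine distance over every retained (non-target) entity.
          denominator = float(np.mean([d for e, d in displacement.items() if e != target]))
          return numerator, denominator, numerator / denominator


      before = {"target": [1.0, 0.0], "kept_a": [0.0, 1.0], "kept_b": [1.0, 1.0]}
      after = {"target": [0.0, 1.0], "kept_a": [0.0, 2.0], "kept_b": [1.0, 0.0]}
      num, den, ratio = specificity_ratio(before, after, "target")

   Note that ``kept_a`` doubles in length but keeps its direction, so its cosine
   distance is 0: a purely magnitude change is invisible to this metric.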

   **Important caveat on n_valid**:
   ``n_valid`` is typically far smaller than ``n_total`` because
   ``embedding_specificity_ratio`` requires *interference_per_pair* files for each
   entity.  Results from a small ``n_valid`` (e.g. 19/100) are underpowered and
   should be treated as *preliminary*.  The permutation test p-values are reported
   with ``n_valid`` in the title for transparency.
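
   To make the ``n_total`` / ``n_valid`` distinction concrete, the summary fields can
   be derived from a column of ratios as in the sketch below. The values are made up,
   and the use of a pandas ``Series`` is an assumption about how the column is held.

   .. code-block:: python

      import numpy as np
      import pandas as pd

      # Made-up ratios; NaN marks entities without interference_per_pair files.
      ratios = pd.Series([2.0, np.nan, 0.5, 3.0, np.nan])

      n_total = len(ratios)
      valid = ratios.dropna()
      n_valid = len(valid)
      # The fraction is taken among valid entities only, not among all entities.
      fraction_above_one = float((valid > 1).mean())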

   **Interpretation**:
   - A method with most ratios >> 1 surgically targets each forgotten entity
     in embedding space without disturbing retained embeddings.
   - A high Spearman r (ratio vs. clip_diff) means embedding-space specificity
     and image-level forgetting agree: the method is consistently targeted at
     both levels.  For UCE our data show r ≈ -0.14 (not significant) whereas
     for distil r ≈ -0.12 (not significant at n_valid=19): the two signals
     decouple for UCE, consistent with the concealment hypothesis
     (Sharma et al., arXiv 2409.05668).

   **Relationship to other RTs**:
   - For per-entity detail, see ``ResultTemplateEmbeddingUnlearningProfile``.
   - ``embedding_specificity_ratio`` belongs to ``type_me`` and ``domain_me``,
     so it can be passed to ``MetricMetricAlignment`` and
     ``MethodComparisonByMetricEntity`` like any other per-entity metric.

   **References**:
   - concealment: Sharma et al., arXiv 2409.05668
   - pinpoint: Holistic Unlearning Benchmark (ICCV 2025)


   .. py:attribute:: model
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_model
      :value: 'sd1.4'


   .. py:attribute:: task
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_task
      :value: 'people'


   .. py:attribute:: unlearning_algorithm
      :type:  vision_unlearning.benchmarks.I_care.configuration.type_unlearning_algorithm


   .. py:attribute:: n_permutations
      :type:  int
      :value: 10000


   .. py:attribute:: significance_threshold
      :type:  float
      :value: 0.05


   .. py:method:: _serialize_parameters() -> str


   .. py:method:: plot(data: dict, figsize: Tuple[int, int] = (14, 5), return_fig: bool = False) -> Optional[Tuple[matplotlib.figure.Figure, matplotlib.pyplot.Axes]]
      :classmethod:


   .. py:method:: _compute_from_scratch() -> dict


.. py:data:: rt_name_to_class

.. py:data:: rt_name_to_params

.. py:function:: jacc_metric_score(entity_1: str, entity_2: str, metadata_filtered: List[Dict[str, Any]], entity_col: str = 'name') -> float

   Jaccard similarity between two entities, based on their attributes.
   Each attribute (column) contributes between 0 and 1 to the similarity.
   The types and ranges of the attributes are not known beforehand.
   For each attribute, both entities' values must be non-NaN and of the same type; otherwise the attribute is ignored (contribution 0).
   The contribution of each attribute is computed as follows:
   * If the attribute is categorical (str or bool), the contribution is 1 if the two entities have the same value for that attribute, and 0 otherwise.
   * If the attribute is numerical and both values are between 0 and 1, the contribution is 1 - abs(value_1 - value_2).
   * If the attribute is numerical and both values are between 1 and 100, the contribution is 1 - abs(value_1 - value_2) / 100.
   * Otherwise, the contribution is 0 (the attribute cannot be handled, so it is ignored).
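
   The per-attribute rules above can be sketched as follows. ``attribute_contribution``
   is a hypothetical helper, not part of this module. Treating the range bounds as
   inclusive and representing missing values as ``None`` or NaN are assumptions, and
   the way contributions are combined into the final score is deliberately not shown.

   .. code-block:: python

      import math
      from typing import Any


      def attribute_contribution(value_1: Any, value_2: Any) -> float:
          # Missing values contribute nothing.
          for value in (value_1, value_2):
              if value is None or (isinstance(value, float) and math.isnan(value)):
                  return 0.0
          # Values of different types contribute nothing.
          if type(value_1) is not type(value_2):
              return 0.0
          # Categorical: bool is tested here, before the numeric branch,
          # because bool is a subclass of int in Python.
          if isinstance(value_1, (str, bool)):
              return 1.0 if value_1 == value_2 else 0.0
          if isinstance(value_1, (int, float)):
              if 0 <= value_1 <= 1 and 0 <= value_2 <= 1:
                  return 1.0 - abs(value_1 - value_2)
              if 1 <= value_1 <= 100 and 1 <= value_2 <= 100:
                  return 1.0 - abs(value_1 - value_2) / 100
          # Unknown type or range: ignored.
          return 0.0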


.. py:function:: display_interesting_interferences(metadata_filtered: List[Dict[str, Any]], interference_per_pair: Dict[str, Dict[str, float]], index: int, task: Literal['scenes', 'objects', 'breeds', 'people'], method: Literal['munba', 'uce', 'distil'], num_train_epochs: int, metric: str, is_worst_biggest: bool, seed: int = 42, save_path: Optional[str] = None) -> None

   Compares generated images for 9 identities: the target, the 4 worst (excluding the target), and the 4 best.
   @param metadata_filtered: should be appropriate for this task (this is not verified inside the function)
   @param interference_per_pair: should be appropriate for this task+index+method+num_train_epochs (this is not verified inside the function)
   @param index: identifies the target

   The combination of task+index+method+num_train_epochs identifies a unique unlearned model.


.. py:function:: analyze_relationship_regression(df: pandas.DataFrame, x: str, y: str, expected_positive: bool = True, plot: bool = True) -> bool

   Test linear relationship between two numerical variables with significance test
   and direction check.

   Returns True only if:
     (1) the slope is statistically significant (p < 0.05)
     (2) the slope sign matches expectation.
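
   The two conditions can be illustrated with ``scipy.stats.linregress``. This is a
   sketch under assumptions: whether this function uses ``linregress`` internally is
   not stated here, ``slope_is_significant`` is a hypothetical name, and the data are
   synthetic.

   .. code-block:: python

      import numpy as np
      import pandas as pd
      from scipy.stats import linregress


      def slope_is_significant(df, x, y, expected_positive=True, alpha=0.05):
          result = linregress(df[x].to_numpy(), df[y].to_numpy())
          # Condition (2): the slope sign must match the expectation.
          sign_ok = result.slope > 0 if expected_positive else result.slope < 0
          # Condition (1): the slope must be statistically significant.
          return bool(result.pvalue < alpha and sign_ok)


      rng = np.random.default_rng(0)
      xs = np.linspace(0.0, 1.0, 50)
      df = pd.DataFrame({"x": xs, "y": 2.0 * xs + rng.normal(scale=0.1, size=50)})
      as_expected = slope_is_significant(df, "x", "y", expected_positive=True)
      wrong_direction = slope_is_significant(df, "x", "y", expected_positive=False)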


.. py:function:: analyze_relationship_category(df, metric: str, category: str, plot: bool = True) -> bool

.. py:function:: analyze_relationship_numerical(df: pandas.DataFrame, attribute: str, metric: str, plot: bool = False, plot_only_significant: bool = False) -> bool

   Analyzes the relationship between a numerical attribute and a numerical metric
   @param df: interference_per_entity; assumes df[attribute] and df[metric] are numerical
   @param plot: whether to plot the results
   @param plot_only_significant: whether to plot only significant relationships; Only applies if plot=True
   @return: whether any significant relationship was found

   ---

   **Pearson test**
       Use when you want to measure a **linear** relationship.

       **Assumptions:**
       * Both variables are **continuous**
       * Relationship is **linear**
       * **Bivariate normality** (both jointly Gaussian)
       * **Homoscedasticity** (constant variance)
       * **No strong outliers** (very sensitive)

       **Detects:** linear correlation only
       **Fails when:** relationship is monotonic but non-linear, or heavy outliers exist

   ---------

   **Spearman test**
       Use when you want to measure a **monotonic** relationship (not necessarily linear) or data is non-Gaussian.

       **Assumptions:**
       * Variables are at least **ordinal**
       * Relationship is **monotonic** (increasing or decreasing)
       * **No distributional assumptions**
       * **Robust to outliers**

       **Detects:** any monotonic trend (linear or curved)
       **Fails when:** relationship is non-monotonic (e.g., U-shaped)
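
   The difference between the two tests shows up on synthetic data. The sketch below
   uses ``scipy.stats.pearsonr`` and ``scipy.stats.spearmanr`` directly; the data are
   made up and the variable names are illustrative.

   .. code-block:: python

      import numpy as np
      from scipy.stats import pearsonr, spearmanr

      x = np.linspace(0.0, 5.0, 40)
      y_monotonic = np.exp(x)       # monotonic but strongly non-linear
      y_u_shaped = (x - 2.5) ** 2   # non-monotonic (U-shaped)

      # Pearson is pulled below 1 by the curvature; Spearman only sees the ordering.
      pearson_r = pearsonr(x, y_monotonic)[0]
      spearman_r = spearmanr(x, y_monotonic)[0]
      # On the U-shape the increasing and decreasing halves cancel out.
      spearman_u = spearmanr(x, y_u_shaped)[0]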


.. py:function:: analyze_relationship_categorical(df: pandas.DataFrame, attribute: str, metric: str, plot: bool = False, plot_only_significant: bool = False, show_axhline: Optional[float] = None, min_samples_per_category: int = 5, extra_title: str = '') -> bool

   Analyzes the relationship between a categorical attribute and a numerical metric
   @param df: interference_per_entity; assumes df[attribute] is categorical and df[metric] is numerical
   @param plot: whether to plot the results
   @param plot_only_significant: whether to plot only significant relationships; Only applies if plot=True
   @param show_axhline: if provided, shows a horizontal line at this y-value; Only applies if plot=True
   @return: whether any significant relationship was found

   ------

   **ANOVA (f_oneway)**
       Use when you want to test if **group means differ** across **3+ independent groups** under parametric assumptions.

       **Assumptions:**
       * Dependent variable is **continuous**
       * Groups are **independent**
       * **Normality** within each group
       * **Homoscedasticity** (equal variances)
       * No strong **outliers**

       **Hypothesis:**
       * H₀: all group means are equal
       * H₁: at least one mean differs

       **Detects:** differences in **means**
       **Fails when:** heavy skew, unequal variances, small n with non-Gaussian data

   ------

   **Kruskal-Wallis (kruskal)**
       Use when you want to test if **group distributions differ** without parametric assumptions.

       **Assumptions:**
       * Dependent variable is **ordinal or continuous**
       * Groups are **independent**
       * **Same shaped distributions** (only medians should differ for clean interpretation)
       * No normality or equal-variance requirement

       **Hypothesis:**
       * H₀: all group distributions are equal
       * H₁: at least one group differs

       **Detects:** differences in **medians / distributions**
       **Fails when:** distributions differ in shape (then result is ambiguous)
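
   Both tests are available in ``scipy.stats`` and take one array per group. A minimal
   sketch on synthetic groups (the group sizes and the shift are made up):

   .. code-block:: python

      import numpy as np
      from scipy.stats import f_oneway, kruskal

      rng = np.random.default_rng(0)
      groups = [
          rng.normal(loc=0.0, scale=1.0, size=30),
          rng.normal(loc=0.0, scale=1.0, size=30),
          rng.normal(loc=3.0, scale=1.0, size=30),  # shifted group
      ]
      # Each call tests H0 (all groups equal) across the three groups at once.
      anova_p = f_oneway(*groups).pvalue
      kruskal_p = kruskal(*groups).pvalue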


.. py:function:: analyze_correlation_between_pairwise_metrics(df1: pandas.DataFrame, df2: pandas.DataFrame, metric1_name: str, metric2_name: str, exclude_diagonal: bool = True, plot=True, plot_only_significant=True) -> bool

   ``df1`` and ``df2`` are square DataFrames that share the same index and column labels, both within each DataFrame and across the two.
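
   A sketch of how two such matrices can be compared cell by cell while skipping the
   diagonal. The choice of Spearman correlation is an assumption made for the
   illustration (the statistic this function uses is not stated here), and the data
   are random.

   .. code-block:: python

      import numpy as np
      import pandas as pd
      from scipy.stats import spearmanr

      entities = ["a", "b", "c", "d"]
      rng = np.random.default_rng(0)
      df1 = pd.DataFrame(rng.random((4, 4)), index=entities, columns=entities)
      df2 = 2.0 * df1 + 1.0  # strictly increasing transform of df1

      # Align df2 to df1's labels, then keep only the off-diagonal cells.
      df2 = df2.loc[df1.index, df1.columns]
      off_diagonal = ~np.eye(len(df1), dtype=bool)
      v1 = df1.to_numpy()[off_diagonal]
      v2 = df2.to_numpy()[off_diagonal]
      r, p = spearmanr(v1, v2)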


.. py:function:: check_eval_results(eval_results, name, threshold: float, operator: Literal['gt', 'lt']) -> float

   Check whether the metric satisfies the expected threshold.