vision_unlearning.benchmarks.I_care.embeddings
==============================================

.. py:module:: vision_unlearning.benchmarks.I_care.embeddings

.. autoapi-nested-parse::

   DINOv2 embedding utilities for the I-CARE benchmark.

   This module provides:

   - embed_forgetting_session(): embed all images from one forgetting session (entity or baseline)
   - embed_forgetting_session_batched(): batched variant of embed_forgetting_session() for large image sets
   - load_dino_model(): load a DINOv2 model (dinov2_vits14 by default) and return a (model, transform, device) triple
   - embed_image_with_dino(): embed a single image using a pre-loaded DINOv2 model

   Design notes:

   - Heavy GPU imports (torch, torchvision, PIL) are deferred to function call time, so this
     module is safe to import in CPU-only environments.
   - embed_image_fn is injectable in embed_forgetting_session() for unit testing without a GPU.
   - TODO: refactor embed_image_with_dino() into a batched DataLoader for throughput.
   - TODO: add torch.compile() support.
   - TODO: add fp16 (half-precision) support for throughput.
   - TODO: add DataLoader parallelism (num_workers > 0).


Attributes
----------

.. autoapisummary::

   vision_unlearning.benchmarks.I_care.embeddings.logger
   vision_unlearning.benchmarks.I_care.embeddings.EMBEDDING_MODEL
   vision_unlearning.benchmarks.I_care.embeddings.EMBEDDING_DIM


Functions
---------

.. autoapisummary::

   vision_unlearning.benchmarks.I_care.embeddings.load_dino_model
   vision_unlearning.benchmarks.I_care.embeddings.embed_image_with_dino
   vision_unlearning.benchmarks.I_care.embeddings.embed_forgetting_session
   vision_unlearning.benchmarks.I_care.embeddings.embed_forgetting_session_batched


Module Contents
---------------

.. py:data:: logger

.. py:data:: EMBEDDING_MODEL
   :value: 'dinov2_vits14'


.. py:data:: EMBEDDING_DIM
   :value: 384


.. py:function:: load_dino_model(model_name: str = EMBEDDING_MODEL, force_device: Optional[str] = None) -> Tuple[Any, Any, str]

   Load DINOv2 model, transform pipeline, and device.

   Heavy imports (torch, torchvision) happen here, not at module load.

   :param model_name: DINOv2 model variant (default: 'dinov2_vits14' → 384-dim CLS embedding).
   :param force_device: If set, use this device string instead of auto-detecting.
   :returns: (model, transform, device) tuple:

      - model: DINOv2 PyTorch model in eval mode, on device.
      - transform: torchvision.transforms pipeline (resize → crop → normalize).
      - device: device string ('cuda' or 'cpu').


.. py:function:: embed_image_with_dino(image_path: str, model: Any, transform: Any, device: str) -> List[float]

   Embed a single image using a pre-loaded DINOv2 model.

   :param image_path: Path to a PNG/JPEG image on disk.
   :param model: DINOv2 model (from load_dino_model()).
   :param transform: torchvision transform (from load_dino_model()).
   :param device: device string ('cuda' or 'cpu').
   :returns: 384-dim CLS embedding as a plain Python list of floats.

   TODO: refactor into a batched DataLoader for throughput (currently single-image).


.. py:function:: embed_forgetting_session(dataset_folder: str, seeds: List[int], prompts: List[str], metadata_filtered: List[Dict[str, Any]], lora_state: Literal['on', 'off'], task: str, embed_image_fn: Optional[Callable[[str], List[float]]] = None) -> List[Dict[str, Any]]

   Embed all images from one forgetting session (entity or baseline).

   Iterates over all (seed, prompt) combinations and embeds each matching image.
   Images that do not exist on disk are skipped with a warning.

   :param dataset_folder: Local directory containing the generated images.
   :param seeds: List of generation seeds (e.g. [0, 1, 2, 3]).
   :param prompts: Full prompt strings (e.g. "An image of Colin Powell").
   :param metadata_filtered: Metadata list used to map prompt index → entity name.
      metadata_filtered[i]['name'] corresponds to prompts[i].
   :param lora_state: 'on' for unlearned model images, 'off' for baseline images.
   :param task: Task name, passed to get_target_preprocessed().
   :param embed_image_fn: Injectable embedding function (image_path → [float]).
      Required in practice: the signature defaults to None, but no fallback embedder is provided.
      Pass embed_image_with_dino (partially applied) or a test stub.
   :returns: A list of records of the form::

         [
             {
                 'prompted_entity': str,       # entity name (preprocessed)
                 'seed': int,
                 'prompt': str,
                 'embedding': List[float],     # 384-dim CLS embedding
             },
             ...
         ]

   :rtype: List of records


.. py:function:: embed_forgetting_session_batched(dataset_folder: str, seeds: List[int], prompts: List[str], metadata_filtered: List[Dict[str, Any]], lora_state: Literal['on', 'off'], task: str, model: Any, transform: Any, device: str, batch_size: int = 32) -> List[Dict[str, Any]]

   Embed all images for one forgetting session using batched GPU inference.

   More efficient than embed_forgetting_session() for large image sets.
   Collects all (path, metadata) pairs first, then processes in batches via a
   simple loop, amortising Python overhead and maximising GPU utilisation.

   :param dataset_folder: Local directory containing the generated images.
   :param seeds: List of generation seeds used.
   :param prompts: Full prompt strings.
   :param metadata_filtered: Metadata list: metadata_filtered[i]['name'] → prompts[i].
   :param lora_state: 'on' for unlearned model, 'off' for baseline.
   :param task: Task name, passed to get_target_preprocessed().
   :param model: DINOv2 model (from load_dino_model()), on device, in eval mode.
   :param transform: torchvision transform pipeline (from load_dino_model()).
   :param device: Torch device string ('cuda' or 'cpu').
   :param batch_size: Number of images per GPU forward pass (default 32).
      TODO: tune based on VRAM; 32 images × 224×224 ≈ 220MB VRAM.
   :returns: Same structure as embed_forgetting_session().
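
The embed_image_fn parameter of embed_forgetting_session() expects a one-argument callable (image_path → list of floats), whereas embed_image_with_dino() takes four arguments. A minimal sketch of bridging the two with functools.partial follows. Note that stub_embed_image is a hypothetical stand-in, not part of this module: it only mimics the documented signature of embed_image_with_dino() so the example runs without torch or a GPU.

```python
from functools import partial
from typing import Any, Callable, List

EMBEDDING_DIM = 384  # mirrors the documented EMBEDDING_DIM value


def stub_embed_image(image_path: str, model: Any, transform: Any, device: str) -> List[float]:
    """Hypothetical test stub with the same signature as embed_image_with_dino()."""
    # Deterministic fake vector derived from the path, so no model is needed.
    seed = sum(ord(ch) for ch in image_path) % 997
    return [((seed + i) % 997) / 997.0 for i in range(EMBEDDING_DIM)]


# Bind every argument except image_path, leaving a Callable[[str], List[float]].
embed_image_fn: Callable[[str], List[float]] = partial(
    stub_embed_image, model=None, transform=None, device="cpu"
)

vector = embed_image_fn("some/image.png")

# With the real module (requires torch), the same pattern would presumably be:
#   model, transform, device = load_dino_model()
#   embed_image_fn = partial(embed_image_with_dino,
#                            model=model, transform=transform, device=device)
```

Either callable can then be passed as embed_image_fn. The stub keeps unit tests fast and independent of model weights.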
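
The "collect first, then process in batches via a simple loop" approach described for embed_forgetting_session_batched() can be illustrated with plain list slicing. This is only a sketch of the general technique: iter_batches is a hypothetical helper name, the paths are placeholders, and the actual function's internals may differ.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int = 32) -> Iterator[List[T]]:
    """Yield consecutive slices of at most batch_size items (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Placeholder paths standing in for the collected (path, metadata) pairs.
paths = [f"img_{i}.png" for i in range(70)]
batch_sizes = [len(batch) for batch in iter_batches(paths, batch_size=32)]
print(batch_sizes)  # [32, 32, 6]
```

The last batch is simply shorter than the rest, so no padding is needed. In a torch setting, each batch of transformed images would typically be stacked into one tensor before the forward pass.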