vision_unlearning.metrics.image_and_text

Classes

MetricImageTextSimilarity

Module Contents

class vision_unlearning.metrics.image_and_text.MetricImageTextSimilarity(/, **data: Any)

Bases: vision_unlearning.metrics.base.Metric

!!! abstract "Usage Documentation"

[Models](../concepts/models.md)

A base class for creating Pydantic models.

__class_vars__

The names of the class variables defined on the model.

__private_attributes__

Metadata about the private attributes of the model.

__signature__

The synthesized __init__ [Signature][inspect.Signature] of the model.

__pydantic_complete__

Whether model building is completed, or if there are still undefined fields.

__pydantic_core_schema__

The core schema of the model.

__pydantic_custom_init__

Whether the model has a custom __init__ function.

__pydantic_decorators__

Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.

__pydantic_generic_metadata__

A dictionary containing metadata about generic Pydantic models. The origin and args items map to the [__origin__][genericalias.__origin__] and [__args__][genericalias.__args__] attributes of [generic aliases][types-genericalias], and the parameter item maps to the __parameter__ attribute of generic classes.

__pydantic_parent_namespace__

Parent namespace of the model, used for automatic rebuilding of models.

__pydantic_post_init__

The name of the post-init method for the model, if defined.

__pydantic_root_model__

Whether the model is a [RootModel][pydantic.root_model.RootModel].

__pydantic_serializer__

The pydantic-core SchemaSerializer used to dump instances of the model.

__pydantic_validator__

The pydantic-core SchemaValidator used to validate instances of the model.

__pydantic_fields__

A dictionary of field names and their corresponding [FieldInfo][pydantic.fields.FieldInfo] objects.

__pydantic_computed_fields__

A dictionary of computed field names and their corresponding [ComputedFieldInfo][pydantic.fields.ComputedFieldInfo] objects.

__pydantic_extra__

A dictionary containing extra values, if [extra][pydantic.config.ConfigDict.extra] is set to 'allow'.

__pydantic_fields_set__

The names of fields explicitly set during instantiation.

__pydantic_private__

Values of private attributes set on the model instance.

metrics: List[Literal['clip']]
_clip_metric: torchmetrics.multimodal.clip_score.CLIPScore | None = None
model_post_init(__context: dict | None = None) -> None

Override this method to perform additional initialization after __init__ and model_construct. This is useful if you want to do some validation that requires the entire model to be initialized.

_load_image(image: PIL.Image.Image | numpy.ndarray | str) -> torch.Tensor
score(image: PIL.Image.Image | numpy.ndarray | str, text: str) -> Dict[str, float]
score_batch(images: List[PIL.Image.Image | numpy.ndarray | str], texts: List[str]) -> List[Dict[str, float]]

Warning: this function does not improve performance, because the underlying libraries still process the pairs serially. Results are returned per pair, in input order.
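The per-pair, order-preserving behavior described above can be sketched as a plain loop. This is a minimal illustration, not the library's implementation: `score_pairs_serially` and `toy_score` are hypothetical names, and raising `ValueError` on mismatched lengths is an assumption, since the behavior of score_batch in that case is not documented here.

```python
from typing import Callable, Dict, List, Sequence


def score_pairs_serially(
    images: Sequence[object],
    texts: Sequence[str],
    score_fn: Callable[[object, str], Dict[str, float]],
) -> List[Dict[str, float]]:
    """Score images[i] against texts[i], one pair at a time, preserving order."""
    # Assumption: mismatched lengths are treated as an error in this sketch.
    if len(images) != len(texts):
        raise ValueError("images and texts must have the same length")
    return [score_fn(image, text) for image, text in zip(images, texts)]


# Stand-in scorer so the sketch runs without CLIP: counts shared characters.
def toy_score(image: object, text: str) -> Dict[str, float]:
    return {"clip": float(len(set(str(image)) & set(text)))}


results = score_pairs_serially(["cat.png", "dog.png"], ["a cat", "a dog"], toy_score)
```

Because each pair is scored independently, the cost grows linearly with the number of pairs, which matches the warning that no speedup should be expected.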

score_batch_same_text(images: List[PIL.Image.Image | numpy.ndarray | str], text: str) -> List[Dict[str, float]]

Batch CLIP scoring when all images share the same text prompt.

This is meaningfully faster than calling score() N times because the CLIP text encoder runs once for the shared text. Images are processed individually through the CLIP image processor (as in the serial path) but the text encoder forward pass is done only once.
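The saving from encoding the shared text once can be illustrated with toy encoders. Everything in this sketch is a stand-in: `CountingTextEncoder`, `encode_image`, and the three-element vectors are hypothetical and unrelated to the real CLIP model. Only the score form, max(100 * cosine similarity, 0), follows the definition given in the torchmetrics documentation for CLIPScore.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


class CountingTextEncoder:
    """Hypothetical stand-in for the CLIP text encoder that counts its calls."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, text: str) -> Vector:
        self.calls += 1
        return [float(len(text)), float(text.count(" ") + 1), 1.0]


def encode_image(pixels: Sequence[float]) -> Vector:
    """Hypothetical stand-in for the CLIP image encoder."""
    return [sum(pixels), max(pixels), 1.0]


def clip_like_score(image_emb: Vector, text_emb: Vector) -> float:
    """Compute max(100 * cosine similarity, 0) between two embeddings."""
    dot = sum(a * b for a, b in zip(image_emb, text_emb))
    norm = math.sqrt(sum(a * a for a in image_emb)) * math.sqrt(sum(b * b for b in text_emb))
    return max(100.0 * dot / norm, 0.0)


def score_serial(images, text, text_encoder) -> List[Dict[str, float]]:
    # The text is re-encoded for every image.
    return [{"clip": clip_like_score(encode_image(img), text_encoder(text))} for img in images]


def score_shared_text(images, text, text_encoder) -> List[Dict[str, float]]:
    # The text is encoded once and the embedding is reused for every image.
    text_emb = text_encoder(text)
    return [{"clip": clip_like_score(encode_image(img), text_emb)} for img in images]


images = [[0.1, 0.5, 0.9], [0.3, 0.3, 0.3], [1.0, 0.0, 0.2]]
serial_encoder, shared_encoder = CountingTextEncoder(), CountingTextEncoder()
serial = score_serial(images, "a photo of a cat", serial_encoder)
shared = score_shared_text(images, "a photo of a cat", shared_encoder)
```

In this toy setup both paths return the same scores, while the text encoder runs once per image in the serial path and once in total in the shared path.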

Uses _clip_score_update from torchmetrics (private API, tested against torchmetrics 1.x) which returns per-pair scores as a 1-D tensor. The result is numerically equivalent to calling score() N times (max diff < 2e-5 on 512x512 SD1.4 images).

NOTE: _clip_score_update is a private torchmetrics symbol. If a future torchmetrics version removes it, fall back to the serial score() loop.
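One way to guard against a private symbol disappearing is to probe for it and otherwise loop over the single-pair scorer. The sketch below shows that pattern under stated assumptions: `has_symbol`, `score_same_text_with_fallback`, `toy_score`, and `toy_fast` are hypothetical names, and the probe is exercised against the standard `math` module so the example runs without torchmetrics installed.

```python
import importlib
from typing import Callable, Dict, List, Optional, Sequence

Scores = List[Dict[str, float]]


def has_symbol(module_name: str, symbol: str) -> bool:
    """Return True if `module_name` imports and exposes `symbol`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, symbol)


def score_same_text_with_fallback(
    images: Sequence[object],
    text: str,
    score_fn: Callable[[object, str], Dict[str, float]],
    fast_fn: Optional[Callable[[Sequence[object], str], Scores]],
) -> Scores:
    """Use the fast path when one is available, else loop over score_fn."""
    if fast_fn is not None:
        return fast_fn(images, text)
    return [score_fn(image, text) for image in images]


# Stand-ins so the sketch runs without torchmetrics installed.
def toy_score(image: object, text: str) -> Dict[str, float]:
    return {"clip": float(len(str(image)) + len(text))}


def toy_fast(images: Sequence[object], text: str) -> Scores:
    return [toy_score(image, text) for image in images]


# Probe a symbol that does not exist to exercise the fallback branch.
fast = toy_fast if has_symbol("math", "_no_such_private_helper") else None
fallback_result = score_same_text_with_fallback(["a.png", "bb.png"], "a cat", toy_score, fast)
```

For the case described above, the probe would target `_clip_score_update`. Its module path in torchmetrics 1.x is believed to be `torchmetrics.functional.multimodal.clip_score`, but treat that path as an assumption to verify against the installed version.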

Parameters:
  • images – List of N images (PIL Image, np.ndarray, or file path).

  • text – Single text caption applied to all images.

Returns:
  List of N dicts {'clip': float}, one per image in input order.

Return type:
  List[Dict[str, float]]