vision_unlearning.integrations.huggingface

Attributes

logger

Functions

huggingface_model_upload(→ None)

Upload an entire folder or a specific model config in a single commit.

huggingface_model_download(→ None)

Download a model or specific model config from Hugging Face Hub.

huggingface_dataset_exists(→ bool)

Checks whether a folder exists in a Hugging Face dataset repository.

huggingface_dataset_file_exists(→ bool)

Checks if a specific file exists in a Hugging Face dataset repository.

huggingface_dataset_file_upload(file_path, ...)

Upload a single file to a specific dataset config in Hugging Face Hub.

huggingface_dataset_upload(→ None)

Upload a dataset folder to a Hugging Face repository.

huggingface_dataset_download(→ None)

Download a dataset folder from Hugging Face.

huggingface_dataset_file_download(→ None)

Download a single file from a dataset in Hugging Face Hub.

huggingface_get_model_metrics(→ Dict[str, ...)

Assumes that the Hugging Face credentials are properly configured.

huggingface_get_model_images(...)

Searches all entries whose path starts with prefix.

_huggingface_download_one_file(→ bool)

Download a single file from HF via HTTP. Returns True on success.

huggingface_dataset_download_parallel(→ None)

Download a dataset config folder from HF using parallel HTTP requests.

Module Contents

vision_unlearning.integrations.huggingface.logger
vision_unlearning.integrations.huggingface.huggingface_model_upload(folder_models: str, model_repository: str, model_config: str | None = None, token: str | None = None) None

Upload an entire folder or a specific model config in a single commit. When model_config is None, the entire contents of folder_models are uploaded. Assumes that the folder exists in folder_models and that it contains the model files.
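The folder-selection rule described above can be sketched as follows. This is only an illustration, not the module's implementation: resolve_upload_folder is a hypothetical helper, and the assumption that a config lives in a subfolder named after model_config is inferred from the description.

```python
import os
from typing import Optional


def resolve_upload_folder(folder_models: str, model_config: Optional[str] = None) -> str:
    """Return the local folder that would be uploaded (hypothetical helper)."""
    # Assumption: a specific config is stored in a subfolder named after it.
    if model_config is None:
        folder = folder_models
    else:
        folder = os.path.join(folder_models, model_config)
    # Documented precondition: the folder must already exist locally.
    if not os.path.isdir(folder):
        raise FileNotFoundError(f"Expected model folder not found: {folder}")
    return folder
```

Checking the precondition before the upload starts gives a clear local error instead of a failed or empty commit.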

vision_unlearning.integrations.huggingface.huggingface_model_download(folder_models: str, model_repository: str, model_config: str | None = None, token: str | None = None, clean: bool = False) None

Download a model or specific model config from Hugging Face Hub.

Parameters:
  • folder_models – Local directory to save the model

  • model_repository – Hugging Face repository ID

  • model_config – Specific model config to download (None for entire repository)

  • token – Hugging Face authentication token

  • clean – If True, the folder will be deleted before downloading
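As a rough sketch of what clean=True implies for the local folder, the helper below wipes the target before recreating it. prepare_download_folder is a hypothetical name, and recreating the folder when it is missing is an assumption rather than documented behavior.

```python
import os
import shutil


def prepare_download_folder(folder: str, clean: bool = False) -> None:
    """Optionally wipe, then (re)create, the local target folder (hypothetical helper)."""
    if clean and os.path.isdir(folder):
        # clean=True: discard any previously downloaded files.
        shutil.rmtree(folder)
    # Assumption: the target folder is created if it is missing.
    os.makedirs(folder, exist_ok=True)
```

With clean=False, files already present in the folder are left untouched, so stale files from an earlier download may remain alongside the new ones.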

vision_unlearning.integrations.huggingface.huggingface_dataset_exists(dataset_repository: str, dataset_config: str, token: str | None, path_in_repo: str | None = None) bool

Checks whether a folder exists in a Hugging Face dataset repository.

Example

dataset_repository="username/my_dataset"
dataset_config="configs/en"

Parameters:

path_in_repo – HF-side path to check. When None (default) the check uses dataset_config as-is. Pass an explicit value to decouple the local folder name from its location in the HF repository (e.g., path_in_repo="datasets/generated_breeds_baseline").

Works without listing the whole repository.

vision_unlearning.integrations.huggingface.huggingface_dataset_file_exists(dataset_repository: str, dataset_path: str, token: str | None) bool

Checks if a specific file exists in a Hugging Face dataset repository.

Parameters:
  • dataset_repository – e.g. “username/dataset_name”

  • dataset_path – full path in repo (e.g. “config/file.jsonl”)

  • token – HF token (can be None for public repos)

Returns:

True if file exists, False otherwise

Checks whether the file exists without listing the entire repository. A newer version of the library may allow an even more efficient check; see https://chatgpt.com/share/69edd525-d008-832d-8a0c-ec4560a4fe3b

vision_unlearning.integrations.huggingface.huggingface_dataset_file_upload(file_path: str, dataset_repository: str, dataset_path: str, token: str)

Upload a single file to a specific dataset config in Hugging Face Hub.

Parameters:

dataset_path – Full name of the file in the repository, including the config folder (e.g., “my_config/my_file.jsonl”).

vision_unlearning.integrations.huggingface.huggingface_dataset_upload(folder_datasets: str, dataset_repository: str, dataset_config: str, token: str, path_in_repo: str | None = None) None

Upload a dataset folder to a Hugging Face repository.

Assumes that a folder named dataset_config exists in folder_datasets and that it contains the dataset files.

Parameters:

path_in_repo – Destination path inside the HF repo. When None (default) the files land at dataset_config relative to the repo root. Pass an explicit value to decouple the local folder name from its location in the HF repository (e.g., path_in_repo="datasets/generated_breeds_baseline").

vision_unlearning.integrations.huggingface.huggingface_dataset_download(folder_datasets: str, dataset_repository: str, dataset_config: str, token: str, clean: bool = False, folder_cache: str = '/tmp/huggingface_cache', clean_cache: bool = False, path_in_repo: str | None = None) None

Download a dataset folder from Hugging Face.

Parameters:
  • folder_datasets – Local parent directory. The dataset is placed at os.path.join(folder_datasets, dataset_config).

  • dataset_config – Name of the local subfolder to create under folder_datasets.

  • path_in_repo – Path inside the HF repository that contains the dataset files. When None (default) it is the same as dataset_config. Pass an explicit value when the HF-side path differs from the local folder name (e.g., path_in_repo="datasets/generated_breeds_baseline").

  • clean – If True, the local folder is deleted before downloading.
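The mapping between the local folder and the repository path described by these parameters can be sketched as below. resolve_dataset_paths is a hypothetical helper written only to illustrate the documented defaults; it is not part of the module.

```python
import os
from typing import Optional, Tuple


def resolve_dataset_paths(
    folder_datasets: str, dataset_config: str, path_in_repo: Optional[str] = None
) -> Tuple[str, str]:
    """Return (local_folder, repo_path) per the documented defaults (hypothetical helper)."""
    # The dataset is always placed under folder_datasets/dataset_config locally.
    local_folder = os.path.join(folder_datasets, dataset_config)
    # path_in_repo falls back to dataset_config when it is not given.
    repo_path = dataset_config if path_in_repo is None else path_in_repo
    return local_folder, repo_path
```

For example, resolve_dataset_paths("assets/datasets", "baseline", path_in_repo="datasets/generated_breeds_baseline") keeps the local folder name short while reading from a nested path on the Hub.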

vision_unlearning.integrations.huggingface.huggingface_dataset_file_download(folder_datasets: str, dataset_repository: str, file_path: str, token: str | None, folder_cache: str = '/tmp/huggingface_cache') None

Download a single file from a dataset in Hugging Face Hub.

Parameters:
  • folder_datasets – Local directory where datasets are stored.

  • dataset_repository – Hugging Face dataset repository ID

  • file_path – Full path of the file within the repository (e.g., “config/data.jsonl”)

  • token – Hugging Face authentication token

  • folder_cache – Cache directory for downloads

The file will be saved at os.path.join(folder_datasets, file_path)

vision_unlearning.integrations.huggingface.huggingface_get_model_metrics(model_id: str) Dict[str, float | int | bool]

Assumes that the Hugging Face credentials are properly configured.

vision_unlearning.integrations.huggingface.huggingface_get_model_images(model_id, prefix: str = '') List[PIL.ImageFile.ImageFile]

Searches all entries whose path starts with prefix.

vision_unlearning.integrations.huggingface._huggingface_download_one_file(entry: dict, folder_dataset: str, dataset_repository: str, headers: dict) bool

Download a single file from HF via HTTP. Returns True on success.
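The structure of entry and the exact URL requested are not documented here. As an illustration only, the sketch below builds a request for the Hub's public resolve endpoint with a bearer-token header. build_resolve_url, build_auth_headers, and the revision default of "main" are assumptions, not the module's API.

```python
from typing import Dict, Optional
from urllib.parse import quote


def build_resolve_url(dataset_repository: str, path_in_repo: str, revision: str = "main") -> str:
    """Build the Hub download URL for one dataset file (hypothetical helper)."""
    # quote() keeps "/" in the path but escapes spaces and other unsafe characters.
    return (
        f"https://huggingface.co/datasets/{dataset_repository}"
        f"/resolve/{quote(revision, safe='')}/{quote(path_in_repo)}"
    )


def build_auth_headers(token: Optional[str]) -> Dict[str, str]:
    """Return HTTP headers; public repositories need no Authorization header."""
    return {"Authorization": f"Bearer {token}"} if token else {}
```

Escaping matters in practice because folder names in this project can contain spaces, as in the dataset_config example "generated_people_George W Bush_uce_000".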

vision_unlearning.integrations.huggingface.huggingface_dataset_download_parallel(folder_datasets: str, dataset_repository: str, dataset_config: str, token: str, clean: bool = False, folder_cache: str = '/tmp/huggingface_cache', hf_prefix: str = 'datasets', max_workers: int = 12) None

Download a dataset config folder from HF using parallel HTTP requests.

Faster alternative to huggingface_dataset_download() for large folders. Uses ThreadPoolExecutor(max_workers) for concurrent file downloads; reduces per-entity download time from ~6 minutes (sequential snapshot_download) to ~35 s at max_workers=12 (measured on HF for 801 PNG files, 2026-05-20).
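The concurrency pattern named above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: download_all is a hypothetical wrapper, and download_one stands in for a per-file function that returns True on success, as _huggingface_download_one_file is documented to do.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def download_all(
    entries: Iterable[dict],
    download_one: Callable[[dict], bool],
    max_workers: int = 12,
) -> List[dict]:
    """Run download_one over entries concurrently; return the entries that failed."""
    entries = list(entries)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with entries.
        results = list(pool.map(download_one, entries))
    return [entry for entry, ok in zip(entries, results) if not ok]
```

Threads suit this workload because each download spends most of its time waiting on the network. Returning the failed entries lets a caller retry or report them.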

Parameters:
  • folder_datasets – Local parent directory (e.g. “assets/datasets”).

  • dataset_repository – HF dataset repo ID.

  • dataset_config – Folder name within folder_datasets AND within hf_prefix on HF (e.g. “generated_people_George W Bush_uce_000”).

  • token – HF auth token.

  • clean – If True, delete local folder before downloading.

  • folder_cache – Unused; kept for signature compatibility with huggingface_dataset_download().

  • hf_prefix – Prefix path within the HF repo (default “datasets”).

  • max_workers – Thread pool size for concurrent HTTP downloads. Benchmark (2026-05-20, 801 files): 1=349s, 4=91s, 8=48s, 12=35s. 12 is the recommended default; do not exceed 16 (HF rate limits).
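To put the benchmark figures in perspective, the short calculation below derives the speedup and parallel efficiency for each worker count. The timings are copied from the benchmark above; the scaling helper is introduced here only for illustration.

```python
from typing import Dict, Tuple


def scaling(benchmark_seconds: Dict[int, float]) -> Dict[int, Tuple[float, float]]:
    """Map max_workers -> (speedup vs. 1 worker, parallel efficiency)."""
    baseline = benchmark_seconds[1]
    return {
        workers: (baseline / seconds, baseline / seconds / workers)
        for workers, seconds in benchmark_seconds.items()
    }


# Wall-clock seconds per max_workers value, copied from the benchmark above.
table = scaling({1: 349, 4: 91, 8: 48, 12: 35})
for workers, (speedup, efficiency) in table.items():
    print(f"max_workers={workers:>2}: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

With these figures, 12 workers give roughly a 10x speedup at about 83% efficiency, compared with about 96% efficiency at 4 workers, so returns are already diminishing near the recommended default.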