vision_unlearning.utils.data_generation
Functions
generate_dataset: Generate images for the given prompts and save them to output_path.
Module Contents
- vision_unlearning.utils.data_generation.generate_dataset(model_base_name: str | None, lora_name: str | None, prompts: List[str], output_path: str, filenames: List[str] | None = None, batch_size: int = 4, device: int | str | torch.device = 'cuda', lora_requires_inversion: bool = False, model_pipeline: diffusers.AutoPipelineForText2Image | None = None, seeds: List[int] | None = None) -> List[Dict[str, str]]
Generate images for the given prompts and save them to output_path.
- When seeds is provided (recommended for reproducibility):
For each seed, the function sets torch/numpy/random global state and passes a seeded torch.Generator to the pipeline call. This guarantees that running with the same model weights and the same seed produces pixel-identical images.
filenames may optionally be provided. When provided, the caller must supply exactly len(seeds) * len(prompts) filenames in seed-major order: [seed0_prompt0, seed0_prompt1, ..., seed1_prompt0, seed1_prompt1, ...].
When seeds is provided but filenames is None: filenames are auto-generated as {seed}_{prompt}.png (no prefix) for each (seed, prompt) pair.
metadata.jsonl is written once after all seeds are processed.
- When seeds is None (legacy mode):
filenames may be provided explicitly (one per prompt).
The pipeline is called once per batch without seeding, so the output is non-deterministic.
This path is kept for backward compatibility only.
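The seed-major filename layout described for seeded mode can be sketched as follows. This is only an illustration of the documented ordering and length rule, not the library's actual code: `seeded_filenames` is a hypothetical helper, and whether `generate_dataset` sanitizes prompt text before building the auto-generated name is not stated here, so the prompt is used verbatim.

```python
from typing import List, Optional


def seeded_filenames(
    seeds: List[int],
    prompts: List[str],
    filenames: Optional[List[str]] = None,
) -> List[str]:
    """Hypothetical helper mirroring the documented seeded-mode filename rules."""
    expected = len(seeds) * len(prompts)
    if filenames is None:
        # Auto-generated names: {seed}_{prompt}.png, outer loop over seeds
        # (seed-major), inner loop over prompts.
        return [f"{seed}_{prompt}.png" for seed in seeds for prompt in prompts]
    if len(filenames) != expected:
        raise ValueError(
            f"expected {expected} filenames (len(seeds) * len(prompts)), "
            f"got {len(filenames)}"
        )
    # Explicit names are assumed to already be in seed-major order.
    return list(filenames)


names = seeded_filenames([0, 1], ["cat", "dog"])
print(names)
```

A list built this way could be passed as filenames together with the same seeds and prompts, so that each file can be traced back to its (seed, prompt) pair.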
@param model_base_name: HF model name or local path. Ignored if model_pipeline given.
@param lora_name: LoRA adapter path. If set, model_base_name is also required.
@param prompts: Text prompts to generate images for.
@param output_path: Directory where images and metadata.jsonl are saved.
@param filenames: Explicit filenames (optional). Legacy mode (seeds=None): one filename per prompt. Seeded mode (seeds provided): len(seeds) * len(prompts) filenames in seed-major order. If None, filenames are auto-generated as {seed}_{prompt}.png.
@param batch_size: Number of prompts per pipeline call.
@param device: Torch device.
@param lora_requires_inversion: Passed to unlearn_lora if lora_name is set.
@param model_pipeline: Pre-loaded pipeline (skips loading if provided).
@param seeds: List of integer seeds. When provided the generation loop is seeded.
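The per-seed setup described above (setting torch/numpy/random global state and creating a seeded torch.Generator) might look roughly like the sketch below. `seed_everything` is a hypothetical name, not a function of this module, and the optional-import guards are an assumption made so the snippet runs even where numpy or torch is not installed.

```python
import random


def seed_everything(seed: int, device: str = "cpu"):
    """Hypothetical sketch: seed global RNG state and return a seeded generator.

    Returns a torch.Generator when torch is installed, otherwise None.
    """
    # Python's built-in RNG.
    random.seed(seed)

    # NumPy's global RNG, if available.
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass

    # Torch's global RNG plus a dedicated generator, if available.
    try:
        import torch
    except ImportError:
        return None
    torch.manual_seed(seed)
    return torch.Generator(device=device).manual_seed(seed)


generator = seed_everything(1234)
first = random.random()
seed_everything(1234)
second = random.random()
print(first == second)
```

In a diffusers text-to-image call, a generator like this is typically passed through the pipeline's generator argument; how generate_dataset wires it in internally is not shown in this reference.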