DuplicatePhoto

flowtask.components.DuplicatePhoto

DuplicatePhoto

DuplicatePhoto(loop=None, job=None, stat=None, **kwargs)

Bases: FlowComponent

DuplicatePhoto.

Checks whether a photo is duplicated in the dataset and adds a column with the result. Duplicates are detected by comparing image hashes, which are fast and efficient to compute. The component also saves detailed information about each match, based on perceptual hash and vector similarity.
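To illustrate the idea of comparing photos by perceptual hash, here is a minimal sketch in pure Python. It is not the component's implementation: the source does not say which hash algorithm or threshold DuplicatePhoto uses, so the average-hash technique, the function names (`average_hash`, `hamming_distance`, `is_duplicate`) and the `max_distance` parameter are all assumptions chosen for illustration. Images are represented as small grids of grayscale values.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[int]]  # hypothetical stand-in for a downscaled grayscale image


def average_hash(pixels: Grid) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the mean, else 0."""
    flat: List[int] = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_duplicate(a: Grid, b: Grid, max_distance: int = 1) -> bool:
    """Treat two images as duplicates when their hashes differ in few bits.

    max_distance is an assumed threshold, not a value taken from the component.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Sample 2x2 "images": the second is a uniformly brighter copy of the first,
# the third has the opposite light/dark layout.
original = [[10, 200], [220, 30]]
brighter = [[20, 210], [230, 40]]
different = [[200, 10], [30, 220]]
```

With these samples, `is_duplicate(original, brighter)` is true because a uniform brightness shift leaves the above-mean pattern unchanged, while `is_duplicate(original, different)` is false. Near-identical hashes rather than exact equality are what make this kind of hash tolerant to small edits.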

pgvector_init async

pgvector_init(conn)

Initialize pgvector extension in PostgreSQL.
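As a rough sketch of what initializing the extension can look like, the statement below is the standard SQL for enabling pgvector. Everything else is assumed: the function name `pgvector_init_sketch` is hypothetical, `conn` is assumed to expose an awaitable `execute(sql)` method in the style of asyncpg, and `RecordingConnection` is a stand-in used only so the example runs without a database.

```python
import asyncio
from typing import List

ENABLE_PGVECTOR_SQL = "CREATE EXTENSION IF NOT EXISTS vector;"


async def pgvector_init_sketch(conn) -> None:
    """Enable the pgvector extension on the connected database.

    Assumes conn.execute(sql) is a coroutine, as in asyncpg.
    IF NOT EXISTS makes the call safe to repeat.
    """
    await conn.execute(ENABLE_PGVECTOR_SQL)


class RecordingConnection:
    """Hypothetical fake connection that records statements instead of running them."""

    def __init__(self) -> None:
        self.statements: List[str] = []

    async def execute(self, sql: str) -> None:
        self.statements.append(sql)


def demo() -> List[str]:
    """Run the sketch against the fake connection and return what was sent."""
    conn = RecordingConnection()
    asyncio.run(pgvector_init_sketch(conn))
    return conn.statements
```

Against a real server, creating an extension typically requires sufficient database privileges, so this step may need to be performed by an administrator.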

run async

run()

Run the duplicate detection with enhanced information.

qid

qid(name)

Very small helper to quote SQL identifiers safely. Raises if name contains anything but letters, digits or '_'.
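A helper with the behavior described above could be sketched as follows. This is an illustration, not the library's code: the name `qid_sketch`, the choice of `ValueError`, the restriction to ASCII letters, and the use of double quotes (PostgreSQL's identifier quoting) are assumptions.

```python
import re

# Assumed whitelist: ASCII letters, digits and underscore only.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def qid_sketch(name: str) -> str:
    """Return name wrapped in double quotes, or raise if it has unsafe characters."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe SQL identifier: {name!r}")
    return f'"{name}"'
```

Usage might look like `f"SELECT * FROM {qid_sketch(table)}"`, where `table` is a hypothetical variable. Validating against a whitelist before quoting means a value such as `photos; DROP TABLE x` is rejected outright instead of being interpolated into the statement. Values, as opposed to identifiers, should still be passed as query parameters.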