Dance Collection

Visual Similarity

Photographs grouped by visual resemblance using CLIP image embeddings and cosine similarity.

Each photograph was encoded as a 512-dimensional vector using CLIP. Cosine similarity between two of these vectors serves as a measure of how visually alike the images are: a high score suggests similar composition, subject matter, and visual texture. Each row below shows one photograph alongside its three nearest neighbors, drawn from all 42 items in the collection. The cherry border marks the reference image.
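
The nearest-neighbor lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build this page: the function names `cosine_similarity` and `nearest_neighbors` are hypothetical, the vectors are toy 2-dimensional stand-ins for the 512-dimensional CLIP embeddings, and every vector is assumed to be non-zero.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(embeddings, k=3):
    """Map each image index to the indices of its k most similar images.

    The reference image itself is excluded from its own neighbor list.
    """
    result = {}
    for i, ref in enumerate(embeddings):
        scored = [
            (cosine_similarity(ref, other), j)
            for j, other in enumerate(embeddings)
            if j != i
        ]
        # Highest similarity first.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        result[i] = [j for _, j in scored[:k]]
    return result


# Toy stand-ins for image embeddings (assumption: real ones would have 512 dims).
toy_embeddings = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.7, 0.7],
    [0.0, 1.0],
    [-1.0, 0.0],
]

neighbors = nearest_neighbors(toy_embeddings, k=3)
for index, nearest in neighbors.items():
    print(index, nearest)
```

With 42 photographs, comparing every pair this way is cheap, so no approximate search index is needed at this scale.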

42 photographs · top 3 nearest neighbors each