Trains at Temple

Visual Similarity

Photographs grouped by visual resemblance using CLIP image embeddings and cosine similarity.

4862 items with similarity data

How it works

Each photograph was encoded as a 512-dimensional vector using CLIP (clip-vit-base-patch32), run entirely on local hardware. Cosine similarity between two vectors measures visual resemblance: photographs with similar composition, subject matter, and texture score higher. Below are 10 curated photographs from different railroads, each shown alongside its three nearest visual neighbors from the full collection of 4868 items.
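The neighbor lookup can be illustrated with a minimal sketch. It assumes the embeddings are already available as plain lists of floats keyed by photo ID; the function names, the photo IDs, and the 3-dimensional toy vectors are hypothetical stand-ins for the real 512-dimensional CLIP vectors, and the page's actual implementation may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query_id, embeddings, k=3):
    """Return the k most similar items to query_id as (id, score) pairs."""
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in embeddings.items()
        if other_id != query_id  # a photo is not its own neighbor
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy vectors; real CLIP embeddings would have 512 dimensions.
toy_embeddings = {
    "photo_a": [1.0, 0.0, 0.0],
    "photo_b": [0.9, 0.1, 0.0],
    "photo_c": [0.1, 1.0, 0.0],
    "photo_d": [0.7, 0.7, 0.0],
    "photo_e": [0.0, 0.0, 1.0],
}

neighbors = nearest_neighbors("photo_a", toy_embeddings, k=3)
for photo_id, score in neighbors:
    print(f"{photo_id}: {score:.3f}")
```

A brute-force scan like this compares the query against every other item, which is workable for a collection of a few thousand photographs. Larger collections would typically normalize the vectors once and use a matrix product or an approximate nearest-neighbor index instead.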

Curated samples across railroads