Indexatron
Active
"What's in my photos?"
Started 22 February 2026
GitHub
llm · python · ollama · privacy
I have thousands of family photos. Finding specific ones is a nightmare. “That photo from the wedding with Uncle Dave” - good luck.
Cloud services can do this, but uploading family photos to third parties feels wrong. This experiment tests whether locally-run LLMs can analyse photos and extract useful metadata - no cloud required.
The hypothesis
Local LLMs can analyse family photos and extract useful metadata.
Status: Confirmed. Now integrated with the-mcculloughs.org for automated photo analysis.
Current state
The service fetches pending uploads from the Rails app, analyses them with LLaVA, generates embeddings, and posts results back. Key learnings:
- Context matters: Injecting photo metadata (title, caption, date, gallery) into prompts dramatically improves results
- LLaVA > Llama 3.2 Vision: For structured JSON extraction, the smaller model is more reliable (no repetition loops)
- Override when you know better: Use the actual date_taken instead of letting the AI guess from visual cues
- Defensive parsing: Vision models produce unexpected outputs; robust JSON repair is essential
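The defensive-parsing point is worth a sketch. This is not the project's actual code, just a minimal illustration of the kind of repair that helps: strip markdown fences the model sometimes wraps around its answer, isolate the outermost object, and drop trailing commas before parsing.

```python
import json
import re

def repair_json(raw: str) -> dict:
    """Best-effort extraction of a JSON object from noisy model output."""
    # Models sometimes wrap JSON in markdown code fences; strip them
    raw = re.sub(r"```(?:json)?", "", raw)
    # Keep only the outermost {...} span, discarding chatty preamble
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model output")
    candidate = raw[start : end + 1]
    # Remove trailing commas before } or ] (a common model mistake)
    candidate = re.sub(r",\s*([}\]])", r"\1", candidate)
    return json.loads(candidate)
```

For example, `repair_json('Sure!\n```json\n{"scene": "wedding",}\n```')` recovers `{"scene": "wedding"}` despite the preamble, fence, and trailing comma.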
The stack
- Runtime: Ollama
- Vision Model: LLaVA:7b (~4.7GB)
- Embeddings: nomic-embed-text (~274MB)
- Language: Python 3.11+ with pydantic, httpx, Pillow, Rich
What it does
Feed it a photo, get back:
- Subject identification (people, objects, brands)
- Scene categorisation
- Era estimation (or override with actual date)
- Family nickname resolution (“Mamie” -> “Isobel McCullough”)
- 768-dimensional semantic embeddings
- Context-aware analysis using photo metadata
Next
- Make this a proper background service (systemd/launchd)
- Search API - query by person, category, decade
- Semantic search using embeddings - find similar photos
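Semantic search over the stored embeddings reduces to ranking photos by cosine similarity to a query embedding. A minimal in-memory sketch (a real service would likely use a vector index or database extension):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query_vec: list[float],
                 photos: list[tuple[str, list[float]]],
                 top_k: int = 3) -> list[str]:
    """Return the ids of the top_k photos closest to the query embedding."""
    ranked = sorted(photos, key=lambda p: cosine(query_vec, p[1]), reverse=True)
    return [photo_id for photo_id, _ in ranked[:top_k]]
```

Embedding the query text with the same nomic-embed-text model used at index time keeps query and photo vectors in the same 768-dimensional space.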