The MediaFind blog

Building private, on-device media search

How we turn a folder of audio and video into a searchable library — using best-in-class open models that run entirely on your Mac. No cloud, no API keys, no telemetry.

🎙️

Transcription

How MediaFind transcribes your media entirely on-device with Whisper

From ffmpeg decode to word-level timestamps — the speech-to-text pipeline that never sends a byte to the cloud.

Read the deep dive →

🔍

Search by meaning: embeddings, CLIP and a local vector index

Why “a rocket blasting off” finds the right clip even when nobody said those words — semantic text, visual, and OCR search combined.

Read the deep dive →

🗣️

People & privacy

Who said it, who's in it — diarization & face recognition, privately

Speaker diarization and an opt-in face library that label your media without anything ever leaving the machine.

Read the deep dive →