How we turn a folder of audio and video into a searchable library, using best-in-class open models that run entirely on your Mac. No cloud, no API keys, no telemetry.
From ffmpeg decode to word-level timestamps: the speech-to-text pipeline that never sends a byte to the cloud.
Why "a rocket blasting off" finds the right clip even when nobody said those words: semantic text, visual, and OCR search combined.
Speaker diarization and an opt-in face library that label your media without anything ever leaving the machine.