Open Source Speech to Text
Audio is among the most revealing data you can hand a vendor: support calls, doctor's notes, interviews. With a cloud transcription API, every recorded minute is uploaded, billed, and stored on someone else's servers before you get a word back. The open source engines listed here run recognition locally on your own hardware, so confidential audio never leaves your own machines and you can transcribe an archive of thousands of hours without paying per minute.

Whisper
General-purpose speech recognition model for multilingual transcription, translation, and language identification

whisper.cpp
C/C++ speech to text inference for OpenAI's Whisper model with CPU, GPU, and on-device support

faster-whisper
Whisper transcription with CTranslate2 for faster inference and lower memory use

Vosk
Offline speech recognition toolkit with streaming transcription, small models, and speaker identification

SpeechBrain
Open-source PyTorch toolkit for speech and text processing across conversational AI tasks
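
For a rough idea of what local transcription looks like in practice, here is a minimal sketch using faster-whisper from the list above. It assumes the faster-whisper package is installed and that the model weights are already on disk (they are fetched on first use, after which recognition runs offline). The file name audio.wav, the "small" model size, and the helper names format_segments and transcribe_locally are illustrative choices, not anything prescribed by the project.

```python
def format_segments(segments):
    """Turn recognized segments into timestamped text lines.

    Each segment is expected to expose .start, .end (seconds) and .text,
    which is what faster-whisper yields from transcribe().
    """
    lines = []
    for seg in segments:
        lines.append(f"[{seg.start:.2f}s - {seg.end:.2f}s] {seg.text.strip()}")
    return lines


def transcribe_locally(audio_path, model_size="small"):
    """Transcribe one file on the local CPU (hypothetical helper)."""
    # Imported here so format_segments can be used without the package installed.
    from faster_whisper import WhisperModel

    # int8 on CPU trades a little accuracy for lower memory use.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")

    # transcribe() returns a lazy generator of segments plus detection info;
    # the audio is only decoded as the generator is consumed.
    segments, info = model.transcribe(audio_path)
    return info.language, format_segments(segments)


if __name__ == "__main__":
    language, lines = transcribe_locally("audio.wav")  # assumed local file
    print(f"Detected language: {language}")
    for line in lines:
        print(line)
```

The audio file is read from local disk and processed in-process, so nothing is sent to a remote service at transcription time. The other engines listed expose different APIs, so treat this as one example rather than a common interface.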