Whisper transcription with CTranslate2 for faster inference and lower memory use
- Stars: 23.6k
- Forks: 1.9k
- Open issues: 311
- License: MIT
- Languages: Python, Dockerfile

About faster-whisper
faster-whisper is a reimplementation of OpenAI's Whisper model using CTranslate2. It transcribes audio with the same model family while aiming for faster inference and lower memory use than openai-whisper.
It supports 8-bit quantization on CPU and GPU, and the transcribe API can return segments as a generator. Word timestamps and VAD filtering are available, and Distil-Whisper checkpoints are compatible with the package.
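The generator-based API described above can be sketched as follows. This is a minimal illustration, not the project's reference usage: `format_segments` and `transcribe` are hypothetical helper names, the `"small"` model size is an arbitrary choice, and the audio path is whatever file you supply. The `WhisperModel` constructor and the `transcribe` options (`vad_filter`, `word_timestamps`) are used as the package documents them.

```python
from typing import Iterable, Iterator


def format_segments(segments: Iterable) -> Iterator[str]:
    """Lazily render segments as '[start -> end] text' lines.

    Each segment is assumed to expose `start`, `end` (seconds) and `text`.
    """
    for seg in segments:
        yield f"[{seg.start:.2f}s -> {seg.end:.2f}s] {seg.text.strip()}"


def transcribe(path: str, model_size: str = "small") -> Iterator[str]:
    """Transcribe one audio file on CPU with 8-bit quantization (hypothetical helper)."""
    # Imported lazily so format_segments stays usable without the package installed.
    from faster_whisper import WhisperModel

    # compute_type="int8" selects 8-bit quantization; device="cpu" avoids any GPU setup.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")

    # `segments` is a generator: decoding happens only as it is iterated.
    segments, info = model.transcribe(path, vad_filter=True, word_timestamps=True)
    print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
    return format_segments(segments)
```

Because the segments arrive lazily, a caller can print or stream lines as they are decoded, for example `for line in transcribe("audio.mp3"): print(line)`, instead of waiting for the whole file.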
It requires Python 3.9 or later. GPU execution requires the NVIDIA cuBLAS and cuDNN 9 libraries for CUDA 12; one way to install them is to use an official NVIDIA CUDA Docker image. FFmpeg does not need to be installed separately because PyAV bundles the FFmpeg libraries.
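As a rough sketch of how the CPU and GPU paths differ in code, the helper below maps a hardware choice to `WhisperModel` constructor arguments. The function name `whisper_model_kwargs` is hypothetical, and the particular pairing of device and compute type is one reasonable choice rather than a recommendation from the project. The compute type strings themselves (`int8`, `int8_float16`, `float16`, `float32`) are values CTranslate2 accepts.

```python
def whisper_model_kwargs(use_gpu: bool, quantize: bool = True) -> dict:
    """Return device/compute_type arguments for WhisperModel (hypothetical helper)."""
    if use_gpu:
        # Needs the CUDA 12 cuBLAS and cuDNN 9 libraries at runtime.
        compute_type = "int8_float16" if quantize else "float16"
        return {"device": "cuda", "compute_type": compute_type}
    # CPU path: int8 enables 8-bit quantization, float32 is unquantized.
    compute_type = "int8" if quantize else "float32"
    return {"device": "cpu", "compute_type": compute_type}
```

The result can be passed straight to the constructor, for example `WhisperModel("small", **whisper_model_kwargs(use_gpu=False))`.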
Key features
- Transcribes audio with OpenAI Whisper models through CTranslate2
- 8-bit quantization on CPU and GPU
- Word timestamps and VAD filtering
- Generator-based segment output
- Works with Distil-Whisper checkpoints
Details
- First released: 2023
- Platforms: CLI
- Deployment: Docker · offline-first
- Python: 3.9+
- GPU: CUDA 12 · cuDNN 9
- FFmpeg: Bundled via PyAV
