whisper.cpp

C/C++ speech-to-text inference for OpenAI's Whisper model, with CPU, GPU, and on-device support

Repository activity

Stars: 50.7k
Forks: 5.7k
Open Issues: 1.2k

License

MIT

Languages

C++
C
Cuda

Get it: GitHub · Docker

About whisper.cpp

whisper.cpp is a C/C++ implementation of OpenAI's Whisper automatic speech recognition model. It runs speech-to-text inference without heavy dependencies and can run fully offline on device. The project is built around a small high-level implementation and a plain C-style API.

It supports mixed F16 and F32 precision, integer quantization, and voice activity detection, and it performs no memory allocations at runtime. Acceleration is available for Apple Silicon, x86 AVX, POWER VSX, Vulkan, NVIDIA CUDA, AMD ROCm, OpenVINO, Ascend NPU, and Moore Threads GPUs. Supported targets include macOS, iOS, Android, Linux, Windows, WebAssembly, Docker, and Raspberry Pi.

The project ships a CLI, examples, Java bindings, and a Docker image. It is built on top of the ggml machine learning library, with the model logic contained in whisper.h and whisper.cpp. It is released under the MIT License.

Key features

Plain C/C++ implementation without dependencies
Support for CPU-only inference, with zero memory allocations at runtime
Mixed F16/F32 precision and integer quantization
Voice Activity Detection (VAD)
C-style API and command line tools

Details

First released: 2022
Platforms: Windows · macOS · Linux · Android · iOS
Runtime: CPU · GPU · WebAssembly · Docker
Inference: Offline, on-device speech recognition
Acceleration: Metal · CUDA · ROCm · Vulkan · OpenVINO
Quantization: Integer quantization support
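
To give a rough idea of what integer quantization means in practice, here is a minimal C++ sketch of block-wise 8-bit quantization: each group of 32 float weights is stored as 32 signed bytes plus one float scale. This is an illustration under stated assumptions, not the code used by whisper.cpp or ggml. The names QuantBlock, quantize_q8, and dequantize_q8 and the block size of 32 are hypothetical and chosen only for this example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical block size for illustration: one float scale per 32 weights.
constexpr std::size_t kBlockSize = 32;

// Hypothetical storage layout, not the on-disk format of any real library.
struct QuantBlock {
    float scale;                // dequantized value = scale * q[i]
    std::int8_t q[kBlockSize];  // quantized weights in [-127, 127]
};

// Quantize a float array block by block. A trailing partial block is zero-padded.
std::vector<QuantBlock> quantize_q8(const std::vector<float>& x) {
    std::vector<QuantBlock> out((x.size() + kBlockSize - 1) / kBlockSize);
    for (std::size_t b = 0; b < out.size(); ++b) {
        const std::size_t begin = b * kBlockSize;
        const std::size_t end = std::min(begin + kBlockSize, x.size());

        // The largest magnitude in the block determines the scale.
        float amax = 0.0f;
        for (std::size_t i = begin; i < end; ++i) {
            amax = std::max(amax, std::fabs(x[i]));
        }

        const float scale = amax / 127.0f;
        const float inv = scale > 0.0f ? 1.0f / scale : 0.0f;
        out[b].scale = scale;

        // Round each weight to the nearest representable integer step.
        for (std::size_t j = 0; j < kBlockSize; ++j) {
            const std::size_t i = begin + j;
            const float v = i < end ? x[i] * inv : 0.0f;
            out[b].q[j] = static_cast<std::int8_t>(std::lround(v));
        }
    }
    return out;
}

// Recover approximate float values for the first n weights.
std::vector<float> dequantize_q8(const std::vector<QuantBlock>& blocks, std::size_t n) {
    std::vector<float> y(n);
    for (std::size_t i = 0; i < n; ++i) {
        const QuantBlock& blk = blocks[i / kBlockSize];
        y[i] = blk.scale * static_cast<float>(blk.q[i % kBlockSize]);
    }
    return y;
}
```

In this sketch each block occupies 36 bytes instead of the 128 bytes needed for 32 F32 values, and the round-trip error per weight is bounded by about half the block's scale. Real quantization formats differ in block size, bit width, and how the scale is stored.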