RVC WebUI

VITS-based voice conversion web UI for training and running voice models from short audio samples

Repository activity
  • Stars: 36k
  • Forks: 5.1k
  • Open Issues: 739
License

MIT

Languages
  • Python
  • Jupyter Notebook
  • Batchfile
Get it: Website

About RVC WebUI

RVC WebUI is a simple voice conversion framework built on VITS. It is designed to train a voice conversion model from as little as 10 minutes of clean speech and to run conversion through a web interface. The same interface also includes a real-time voice changer.

It uses top1 retrieval to replace input features with the closest training-set features, which reduces timbre leakage from the source speaker. It supports model fusion through ckpt-merge, can call UVR5 to separate vocals from accompaniment, and uses the RMVPE pitch extraction algorithm presented at InterSpeech 2023. The project reports an end-to-end latency of 170 ms, or 90 ms with ASIO-capable audio hardware.
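
As a rough illustration of the top1 retrieval idea, the sketch below replaces each input feature frame with its single nearest neighbour from a set of training features. It is a minimal sketch under stated assumptions, not the project's code: the function name `top1_retrieve`, the use of Euclidean distance, and the toy two-dimensional features are all hypothetical, and a real system would typically use a dedicated vector index rather than a linear scan.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def top1_retrieve(query_frames, index_frames):
    """Replace each query frame with its nearest index frame.

    Hypothetical helper for illustration only: a brute-force linear scan
    with Euclidean distance stands in for a real nearest-neighbour index.
    """
    if not index_frames:
        raise ValueError("index_frames must not be empty")
    replaced = []
    for frame in query_frames:
        # Top-1: keep only the single closest training-set frame.
        nearest = min(index_frames, key=lambda candidate: dist(frame, candidate))
        replaced.append(list(nearest))
    return replaced


# Toy data (assumed): three training-set frames and two input frames.
training_features = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
input_features = [[0.9, 1.2], [3.5, 4.1]]

converted = top1_retrieve(input_features, training_features)
print(converted)
```

Because every output frame comes from the training set, the features passed on to the model carry the target voice's characteristics rather than the input speaker's, which is the intuition behind reduced timbre leakage.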

RVC WebUI runs on Python 3.8 or newer and ships launch scripts for Windows (batch) and Linux (shell), plus IPEX support notes for users of Intel graphics. Training and inference run locally, and a hosted demo is available for trying the tool without any setup.

Key features

  • Train voice conversion models from about 10 minutes of speech
  • Top1 retrieval to reduce timbre leakage
  • Real-time voice changer interface
  • Model fusion with ckpt-merge
  • UVR5 vocal and accompaniment separation
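
The listing does not say how ckpt-merge combines models. One common way to fuse two checkpoints with identical architectures is to interpolate their weights linearly, and the sketch below assumes that approach. The function name `merge_checkpoints`, the `alpha` parameter, and the plain-dictionary checkpoints are hypothetical stand-ins for real model weights.

```python
def merge_checkpoints(weights_a, weights_b, alpha=0.5):
    """Linearly interpolate two checkpoints with matching parameter names.

    Hypothetical helper: alpha is the share taken from weights_a, and
    each parameter is represented as a flat list of floats.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if weights_a.keys() != weights_b.keys():
        raise ValueError("checkpoints must share the same parameter names")
    merged = {}
    for name, values_a in weights_a.items():
        values_b = weights_b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Weighted average of corresponding values.
        merged[name] = [alpha * a + (1.0 - alpha) * b
                        for a, b in zip(values_a, values_b)]
    return merged


# Toy checkpoints (assumed) for two voices with the same layout.
voice_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
voice_b = {"layer.weight": [3.0, 6.0], "layer.bias": [1.0]}

fused = merge_checkpoints(voice_a, voice_b, alpha=0.5)
print(fused)
```

With `alpha=0.5` both voices contribute equally. Moving `alpha` toward 1 pulls the fused model toward the first checkpoint.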

Details

First released
2023
Platforms
Windows · macOS · Linux · Web
Deployment
self-hostable
Input
Voice data (about 10 minutes recommended)
Latency
170 ms end-to-end; 90 ms with ASIO
Framework
VITS