Installation
Macaw OpenVoice requires Python 3.11+ and uses pip extras to install only the engines you need.
Prerequisites
| Requirement | Minimum | Recommended |
|---|---|---|
| Python | 3.11 | 3.12 |
| pip | 21.0+ | latest |
| OS | Linux, macOS | Linux (for GPU support) |
| CUDA | Optional | 12.x (for GPU inference) |
info
Macaw runs on CPU by default. GPU support depends on the engine -- Faster-Whisper uses CTranslate2 which supports CUDA out of the box.
Install with pip
The simplest way to get started:
Minimal install (STT only)
pip install macaw-openvoice[server,grpc,faster-whisper]
Full install (STT + TTS + ITN)
pip install macaw-openvoice[server,grpc,faster-whisper,kokoro,itn]
Available Extras
| Extra | What it adds | Size |
|---|---|---|
server | FastAPI + Uvicorn (required for serving) | ~20 MB |
grpc | gRPC runtime for worker communication | ~15 MB |
faster-whisper | Faster-Whisper STT engine | ~100 MB |
wenet | WeNet CTC STT engine | ~80 MB |
kokoro | Kokoro TTS engine | ~50 MB |
itn | NeMo Inverse Text Normalization | ~200 MB |
stream | Microphone streaming via sounddevice | ~5 MB |
dev | Development tools (ruff, mypy, pytest) | ~50 MB |
Install with uv (recommended for development)
uv is significantly faster than pip and handles virtual environments automatically:
Create a virtual environment and install
uv venv --python 3.12
uv sync --all-extras
Activate the environment
source .venv/bin/activate
GPU Setup
For GPU-accelerated inference with Faster-Whisper:
- Install CUDA drivers for your GPU
- Install the CUDA-enabled version of CTranslate2:
pip install ctranslate2
warning
Ensure your CUDA version matches the CTranslate2 build. Check compatibility at the CTranslate2 releases page.
Verify Installation
Check that Macaw is installed correctly
macaw --help
You should see:
Usage: macaw [OPTIONS] COMMAND [ARGS]...
Macaw OpenVoice CLI
Commands:
serve Start the API server
transcribe Transcribe an audio file
translate Translate audio to English
list List installed models
pull Download a model
inspect Show model details
Next Steps
- Quickstart -- Run your first transcription
- Configuration -- Customize runtime settings