Macaw OpenVoice
Supported Models

Audio Codec Models

Audio codecs compress audio signals into compact token representations and decompress those tokens back into audio. In voice AI, neural audio codecs serve as intermediate representations for TTS systems, voice conversion, and efficient audio streaming. They allow high-quality audio to be reconstructed from a representation far smaller than the raw waveform.
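To give a rough sense of what "compact" can mean, the sketch below compares the bitrate of uncompressed PCM audio with the bitrate of a token stream. The frame rate, number of codebooks, and codebook size are hypothetical values chosen only for illustration; they are not the published settings of the models listed on this page.

```python
import math


def pcm_bitrate(sample_rate_hz: int, bit_depth: int = 16, channels: int = 1) -> int:
    """Bits per second of uncompressed PCM audio."""
    return sample_rate_hz * bit_depth * channels


def token_bitrate(frames_per_second: float, num_codebooks: int, codebook_size: int) -> float:
    """Bits per second of a token stream.

    Each token indexes one entry of a codebook, so it carries
    log2(codebook_size) bits.
    """
    bits_per_token = math.log2(codebook_size)
    return frames_per_second * num_codebooks * bits_per_token


# 16-bit mono PCM at 44.1 kHz.
raw_bps = pcm_bitrate(44_100)

# HYPOTHETICAL codec configuration, for illustration only.
coded_bps = token_bitrate(frames_per_second=75, num_codebooks=8, codebook_size=1024)

compression_ratio = raw_bps / coded_bps
print(f"raw: {raw_bps} bit/s, tokens: {coded_bps:.0f} bit/s, ratio: {compression_ratio:.1f}x")
```

Under these assumed settings the token stream is roughly two orders of magnitude smaller than the raw waveform; actual figures depend on the specific codec configuration.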

Models

| Status  | Model                    | Sample Rate | Parameters | License | HuggingFace |
| Planned | NVIDIA Audio Codec 44kHz | 44.1 kHz    |            |         | Link        |
| Planned | NVIDIA Audio Codec 22kHz | 22.05 kHz   |            |         | Link        |

Choosing a model

  • High-fidelity applications: The 44kHz codec preserves more audio detail and is suitable for music or high-quality voice synthesis.
  • Voice-focused applications: The 22kHz codec is sufficient for speech and offers a more compact representation.
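One way to reason about this choice is the Nyquist limit: a signal sampled at a given rate can represent frequencies only up to half that rate. The sketch below is a minimal illustration, not part of any Macaw OpenVoice API; the function names and the selection rule (pick the lowest sample rate that still covers the content) are assumptions made for this example.

```python
# Sample rates from the table on this page, keyed by model name.
CODEC_SAMPLE_RATES_HZ = {
    "NVIDIA Audio Codec 44kHz": 44_100,
    "NVIDIA Audio Codec 22kHz": 22_050,
}


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at the given sample rate."""
    return sample_rate_hz / 2


def pick_codec(max_content_frequency_hz: float) -> str:
    """Hypothetical helper: return the lowest-rate codec whose Nyquist
    limit still covers the highest frequency the content needs."""
    candidates = [
        (rate, name)
        for name, rate in CODEC_SAMPLE_RATES_HZ.items()
        if nyquist_hz(rate) >= max_content_frequency_hz
    ]
    if not candidates:
        raise ValueError("No listed codec covers the requested bandwidth")
    # min() compares by sample rate first, so the most compact option wins.
    return min(candidates)[1]


# Speech content that needs frequencies up to about 8 kHz.
print(pick_codec(8_000))
# Music content that needs frequencies up to about 16 kHz.
print(pick_codec(16_000))
```

The 22.05 kHz codec covers content up to 11.025 kHz, which is why it is adequate for speech, while the 44.1 kHz codec covers up to 22.05 kHz and suits music or full-bandwidth voice synthesis.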
