Macaw OpenVoice

Environment Variables

Complete reference for all MACAW_* environment variables. Every variable has a default, so you only need to set the ones relevant to your deployment.

All variables are validated at startup via pydantic-settings. Invalid values cause an immediate, descriptive error.


Quick Start

.env

```bash
# Minimal production setup
MACAW_HOST=0.0.0.0
MACAW_PORT=8000
MACAW_MODELS_DIR=/opt/macaw/models
MACAW_LOG_FORMAT=json
MACAW_LOG_LEVEL=INFO
```

Copy .env.example from the repository root to .env in your working directory. Macaw loads it automatically at startup.


Server

HTTP server and WebSocket settings.

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_HOST | 127.0.0.1 | string | -- | API server bind address |
| MACAW_PORT | 8000 | int | 1--65535 | API server HTTP port |
| MACAW_MAX_FILE_SIZE_MB | 25 | int | 1--500 | Maximum audio upload size in megabytes |
| MACAW_RETRY_AFTER_S | 5 | string | -- | Retry-After header value for 503/502 responses |
| MACAW_CORS_ORIGINS | (empty) | string | -- | Comma-separated allowed CORS origins. Empty = no CORS |

CORS examples

```bash
# Development: allow everything
MACAW_CORS_ORIGINS=*

# Production: specific origins
MACAW_CORS_ORIGINS=https://app.example.com,https://admin.example.com
```

The CLI flag --cors-origins overrides MACAW_CORS_ORIGINS. If neither is set, CORS is disabled (no Access-Control-Allow-Origin header).
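
The precedence rule above can be sketched in a few lines. This is an illustration only, not Macaw's actual code: the helper name `resolve_cors_origins` and its arguments are hypothetical, and it assumes the CLI value uses the same comma-separated format as the environment variable.

```python
from __future__ import annotations

from typing import Mapping, Optional


def resolve_cors_origins(cli_value: Optional[str], env: Mapping[str, str]) -> list[str]:
    """Return the allowed origins; an empty list means CORS stays disabled.

    Hypothetical helper, not part of Macaw's API.
    """
    # The CLI flag, when given, takes precedence over the environment variable.
    raw = cli_value if cli_value is not None else env.get("MACAW_CORS_ORIGINS", "")
    # Split the comma-separated list and ignore blank entries.
    return [origin.strip() for origin in raw.split(",") if origin.strip()]


if __name__ == "__main__":
    env = {"MACAW_CORS_ORIGINS": "https://app.example.com,https://admin.example.com"}
    print(resolve_cors_origins(None, env))  # environment value is used
    print(resolve_cors_origins("*", env))   # CLI flag overrides it
```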

WebSocket

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_MAX_WS_FRAME_SIZE_BYTES | 1048576 | int | >= 1024 | Maximum binary frame size (1 MB default) |
| MACAW_WS_INACTIVITY_TIMEOUT_S | 60.0 | float | > 0 | Close connection after this many idle seconds |
| MACAW_WS_HEARTBEAT_INTERVAL_S | 10.0 | float | > 0 | Ping interval (must be < inactivity timeout) |
| MACAW_WS_CHECK_INTERVAL_S | 5.0 | float | > 0 | Inactivity check interval |

TTS (Text-to-Speech)

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_TTS_GRPC_TIMEOUT_S | 60.0 | float | > 0 | Timeout for the gRPC Synthesize RPC |
| MACAW_TTS_LIST_VOICES_TIMEOUT_S | 10.0 | float | > 0 | Timeout for the gRPC ListVoices RPC |
| MACAW_TTS_CHUNK_SIZE_BYTES | 4096 | int | 512--65536 | Streaming audio chunk size. Smaller = lower TTFB, more overhead |
| MACAW_TTS_MAX_TEXT_LENGTH | 4096 | int | 1--1000000 | Maximum input text length. Increase for audiobooks/long-form |
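
To make the chunk-size trade-off concrete, the sketch below converts a chunk size into the amount of audio it carries. The audio format is an assumption (16-bit mono PCM at 24 kHz); the real format depends on the TTS engine in use, and `chunk_duration_ms` is a hypothetical helper.

```python
def chunk_duration_ms(
    chunk_size_bytes: int,
    sample_rate_hz: int = 24000,  # assumed output rate, engine-dependent
    bytes_per_sample: int = 2,    # assumed 16-bit PCM
    channels: int = 1,            # assumed mono
) -> float:
    """Return how many milliseconds of audio fit into one streamed chunk."""
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    return 1000.0 * chunk_size_bytes / bytes_per_second


if __name__ == "__main__":
    for size in (512, 4096, 65536):
        print(f"{size:>6} bytes -> {chunk_duration_ms(size):.1f} ms of audio")
```

Under these assumptions the default of 4096 bytes holds roughly 85 ms of audio, so the first chunk can be sent once that much audio has been synthesized.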

Workers

Worker subprocess settings for STT and TTS engines.

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_MODELS_DIR | ~/.macaw/models | string | -- | Model installation directory (supports ~) |
| MACAW_WORKER_BASE_PORT | 50051 | int | 1024--65535 | First gRPC port (STT). TTS uses base+1 |
| MACAW_WORKER_HOST | localhost | string | -- | gRPC bind address. Change for Docker/K8s |
| MACAW_STT_WORKER_MAX_CONCURRENT | 1 | int | 1--16 | Concurrent inference requests per STT worker |
| MACAW_STT_ACCUMULATION_THRESHOLD_S | 5.0 | float | > 0, <= 30 | Streaming audio accumulation before inference |
| MACAW_STT_MAX_CANCELLED_REQUESTS | 10000 | int | 100--1000000 | Bounded cache for cancelled request tracking |

MACAW_STT_ACCUMULATION_THRESHOLD_S can be overridden per-model via engine_config.accumulation_threshold_s in the model's macaw.yaml manifest.
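
A minimal sketch of that override, assuming the manifest has already been parsed into a dictionary. The function name is hypothetical and the lookup only mirrors the `engine_config.accumulation_threshold_s` key named above; it is not Macaw's implementation.

```python
from __future__ import annotations

from typing import Any, Mapping


def effective_accumulation_threshold_s(env_value: float, manifest: Mapping[str, Any]) -> float:
    """Pick the per-model threshold if the manifest sets one, else the global value."""
    # `manifest` stands for the parsed macaw.yaml content (assumed to be a dict).
    engine_config = manifest.get("engine_config") or {}
    override = engine_config.get("accumulation_threshold_s")
    return env_value if override is None else float(override)


if __name__ == "__main__":
    print(effective_accumulation_threshold_s(5.0, {}))
    print(effective_accumulation_threshold_s(5.0, {"engine_config": {"accumulation_threshold_s": 2.5}}))
```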

Worker Lifecycle

Controls crash recovery, health probing, and warmup.

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_WORKER_MAX_CRASHES | 3 | int | 1--100 | Max crashes before giving up on a worker |
| MACAW_WORKER_CRASH_WINDOW_S | 60.0 | float | > 0, <= 3600 | Time window for counting crashes |
| MACAW_WORKER_HEALTH_PROBE_INITIAL_DELAY_S | 0.5 | float | > 0, <= 60 | First health probe delay after spawn |
| MACAW_WORKER_HEALTH_PROBE_MAX_DELAY_S | 5.0 | float | > 0, <= 300 | Maximum backoff between health probes |
| MACAW_WORKER_HEALTH_PROBE_TIMEOUT_S | 120.0 | float | > 0, <= 600 | Total timeout waiting for healthy worker |
| MACAW_WORKER_HEALTH_PROBE_RPC_TIMEOUT_S | 2.0 | float | > 0, <= 30 | Timeout per individual health probe RPC |
| MACAW_WORKER_MONITOR_INTERVAL_S | 1.0 | float | > 0, <= 60 | Process-alive check interval |
| MACAW_WORKER_STOP_GRACE_PERIOD_S | 5.0 | float | > 0, <= 60 | Grace period before SIGKILL on stop |
| MACAW_WORKER_WARMUP_STEPS | 3 | int | 0--20 | Warmup inference passes at startup. 0 = skip |

Increase MACAW_WORKER_HEALTH_PROBE_TIMEOUT_S for large models (e.g., Whisper large-v3) that take longer to load into memory.
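
The sketch below shows how the crash window and the health probe delays could interact with their defaults. It rests on assumptions the reference does not confirm: that the probe delay doubles after each attempt, that a worker is abandoned when the crash count inside the window reaches the maximum, and that time spent in the probe RPC itself is ignored. `CrashTracker` and `probe_delays` are hypothetical names.

```python
from __future__ import annotations

from collections import deque
from typing import Iterator


class CrashTracker:
    """Counts crashes inside a sliding time window (illustrative only)."""

    def __init__(self, max_crashes: int = 3, window_s: float = 60.0) -> None:
        self.max_crashes = max_crashes
        self.window_s = window_s
        self._crash_times = deque()  # crash timestamps in seconds, oldest first

    def should_restart(self, now_s: float) -> bool:
        """Record a crash at `now_s`; return False once the limit is reached."""
        self._crash_times.append(now_s)
        # Forget crashes that have fallen out of the window.
        while now_s - self._crash_times[0] > self.window_s:
            self._crash_times.popleft()
        return len(self._crash_times) < self.max_crashes


def probe_delays(
    initial_delay_s: float = 0.5,
    max_delay_s: float = 5.0,
    timeout_s: float = 120.0,
    factor: float = 2.0,  # assumed growth factor, not documented
) -> Iterator[float]:
    """Yield the waits between probes until the total timeout would be exceeded."""
    elapsed = 0.0
    delay = initial_delay_s
    while elapsed + delay <= timeout_s:
        yield delay
        elapsed += delay
        delay = min(delay * factor, max_delay_s)


if __name__ == "__main__":
    print(list(probe_delays())[:6])
    tracker = CrashTracker()
    print([tracker.should_restart(t) for t in (0.0, 10.0, 20.0)])
```

With a capped delay, raising MACAW_WORKER_HEALTH_PROBE_TIMEOUT_S simply allows more probes at the maximum delay before the worker is declared unhealthy.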


VAD (Voice Activity Detection)

Controls the two-stage VAD pipeline: energy pre-filter (fast, CPU) followed by Silero neural classifier.

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_VAD_SENSITIVITY | normal | string | high, normal, low | Sensitivity preset (case-insensitive) |
| MACAW_VAD_MIN_SPEECH_DURATION_MS | 250 | int | 50--5000 | Minimum speech before SPEECH_START event |
| MACAW_VAD_MIN_SILENCE_DURATION_MS | 300 | int | 50--5000 | Minimum silence before SPEECH_END event |
| MACAW_VAD_MAX_SPEECH_DURATION_MS | 30000 | int | 1000--600000 | Maximum continuous speech before forced end |
| MACAW_VAD_ENERGY_THRESHOLD_DBFS | (unset) | float | -80 to 0 | Override energy pre-filter threshold (dBFS) |
| MACAW_VAD_SILERO_THRESHOLD | (unset) | float | 0.0--1.0 (exclusive) | Override Silero speech probability threshold |

Sensitivity Presets

The MACAW_VAD_SENSITIVITY preset controls both the energy pre-filter and Silero thresholds together:

| Preset | Energy Threshold | Silero Threshold | Use Case |
| --- | --- | --- | --- |
| high | -50 dBFS | 0.3 | Whispers, quiet rooms, banking |
| normal | -40 dBFS | 0.5 | Standard conversation |
| low | -30 dBFS | 0.7 | Noisy environments, call centers |

Fine-Tuning Beyond Presets

Use the override variables to decouple individual thresholds from the preset:

Custom: sensitive energy + strict Silero

```bash
MACAW_VAD_SENSITIVITY=normal
MACAW_VAD_ENERGY_THRESHOLD_DBFS=-45.0
MACAW_VAD_SILERO_THRESHOLD=0.6
```

When set, override variables take precedence over the sensitivity preset for their respective stage.
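
The resolution order can be sketched as follows, using the preset values from the table above. The function `resolve_vad_thresholds` and the `VAD_PRESETS` mapping are hypothetical names for illustration; range checks are omitted.

```python
from __future__ import annotations

from typing import Mapping

# Preset -> (energy threshold in dBFS, Silero probability threshold).
VAD_PRESETS = {
    "high": (-50.0, 0.3),
    "normal": (-40.0, 0.5),
    "low": (-30.0, 0.7),
}


def resolve_vad_thresholds(env: Mapping[str, str]) -> tuple[float, float]:
    """Start from the preset, then apply any per-stage override that is set."""
    preset = env.get("MACAW_VAD_SENSITIVITY", "normal").lower()  # case-insensitive
    if preset not in VAD_PRESETS:
        raise ValueError(f"unknown sensitivity preset: {preset!r}")
    energy_dbfs, silero = VAD_PRESETS[preset]
    if "MACAW_VAD_ENERGY_THRESHOLD_DBFS" in env:
        energy_dbfs = float(env["MACAW_VAD_ENERGY_THRESHOLD_DBFS"])
    if "MACAW_VAD_SILERO_THRESHOLD" in env:
        silero = float(env["MACAW_VAD_SILERO_THRESHOLD"])
    return energy_dbfs, silero


if __name__ == "__main__":
    custom = {
        "MACAW_VAD_SENSITIVITY": "normal",
        "MACAW_VAD_ENERGY_THRESHOLD_DBFS": "-45.0",
        "MACAW_VAD_SILERO_THRESHOLD": "0.6",
    }
    print(resolve_vad_thresholds(custom))
```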


Session (Streaming STT)

Controls the streaming session state machine, ring buffer, backpressure, and cross-segment context.

Timeouts

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_SESSION_INIT_TIMEOUT_S | 30.0 | float | 1--600 | Timeout waiting for first audio in INIT state |
| MACAW_SESSION_SILENCE_TIMEOUT_S | 30.0 | float | 1--600 | Silence duration before transitioning to HOLD |
| MACAW_SESSION_HOLD_TIMEOUT_S | 300.0 | float | 1--3600 | Time in HOLD before closing. Increase for long pauses |
| MACAW_SESSION_CLOSING_TIMEOUT_S | 2.0 | float | 1--60 | Timeout for flushing pending transcripts |

Ring Buffer & Recovery

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_SESSION_RING_BUFFER_DURATION_S | 60.0 | float | > 0, <= 600 | Audio retention for crash recovery |
| MACAW_SESSION_RING_BUFFER_FORCE_COMMIT_THRESHOLD | 0.90 | float | 0.5--1.0 | Buffer fullness that triggers forced commit |
| MACAW_SESSION_RECOVERY_TIMEOUT_S | 10.0 | float | > 0, <= 120 | Timeout for WAL-based recovery after crash |
| MACAW_SESSION_DRAIN_STREAM_TIMEOUT_S | 5.0 | float | > 0, <= 60 | Timeout for draining final transcripts |
| MACAW_SESSION_FLUSH_AND_CLOSE_TIMEOUT_S | 2.0 | float | > 0, <= 30 | Timeout for the final flush-and-close |

Backpressure

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_SESSION_BACKPRESSURE_MAX_BACKLOG_S | 10.0 | float | > 0, <= 120 | Maximum audio backlog before dropping frames |
| MACAW_SESSION_BACKPRESSURE_RATE_LIMIT_THRESHOLD | 1.2 | float | > 1.0, <= 5.0 | Rate multiplier before rate-limiting kicks in |
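
One plausible reading of these two settings is sketched below. It is an interpretation, not documented behavior: the rate multiplier is assumed to be the ratio of audio seconds received to wall-clock seconds elapsed, and the backlog check is assumed to take priority. The function name and return values are hypothetical.

```python
def backpressure_action(
    audio_received_s: float,
    wall_elapsed_s: float,
    backlog_s: float,
    max_backlog_s: float = 10.0,
    rate_limit_threshold: float = 1.2,
) -> str:
    """Classify the session as "ok", "rate_limit", or "drop" (illustrative only)."""
    # Too much unprocessed audio queued up: start dropping frames.
    if backlog_s > max_backlog_s:
        return "drop"
    # Client is sending faster than real time by more than the allowed multiplier.
    if wall_elapsed_s > 0 and audio_received_s / wall_elapsed_s > rate_limit_threshold:
        return "rate_limit"
    return "ok"


if __name__ == "__main__":
    print(backpressure_action(10.0, 10.0, 0.5))   # real-time sender
    print(backpressure_action(13.0, 10.0, 3.0))   # 1.3x real time
    print(backpressure_action(30.0, 10.0, 12.0))  # backlog above 10 s
```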

Cross-Segment Context

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_SESSION_CROSS_SEGMENT_MAX_TOKENS | 224 | int | 1--2048 | Token window for cross-segment conditioning |

The default of 224 tokens is half of Whisper's 448-token context window. Increase it for larger models. This setting only affects encoder-decoder engines (Whisper); CTC engines ignore cross-segment context.
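
As a rough illustration of a token window, the sketch below keeps only the most recent tokens of the previous segment as conditioning context. Truncating from the front is an assumption about how the window is applied, and `cross_segment_context` is a hypothetical name.

```python
from __future__ import annotations


def cross_segment_context(previous_tokens: list[int], max_tokens: int = 224) -> list[int]:
    """Return at most `max_tokens` trailing tokens from the previous segment."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be >= 1")
    # Negative slicing keeps the tail; shorter inputs are returned unchanged.
    return previous_tokens[-max_tokens:]


if __name__ == "__main__":
    tokens = list(range(300))
    context = cross_segment_context(tokens)
    print(len(context), context[0], context[-1])
```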


Scheduler

Controls request prioritization, batching, timeouts, and cancellation.

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_SCHEDULER_MIN_GRPC_TIMEOUT_S | 30.0 | float | > 0, <= 600 | Minimum gRPC timeout regardless of audio length |
| MACAW_SCHEDULER_TIMEOUT_FACTOR | 2.0 | float | > 0, <= 10 | Multiplier: timeout = max(min, duration * factor) |
| MACAW_SCHEDULER_SHUTDOWN_TIMEOUT_S | 10.0 | float | > 0, <= 120 | Grace period for draining in-flight requests |
| MACAW_SCHEDULER_AGING_THRESHOLD_S | 30.0 | float | > 0, <= 300 | Time before BATCH is promoted to REALTIME priority |
| MACAW_SCHEDULER_BATCH_ACCUMULATE_MS | 50.0 | float | > 0, <= 5000 | Window to accumulate requests into a batch |
| MACAW_SCHEDULER_BATCH_MAX_SIZE | 8 | int | 1--64 | Maximum requests per batch |
| MACAW_SCHEDULER_NO_WORKER_BACKOFF_S | 0.1 | float | > 0, <= 10 | Backoff when no worker is available |
| MACAW_SCHEDULER_DEQUEUE_POLL_INTERVAL_S | 0.5 | float | > 0, <= 10 | Priority queue polling interval |
| MACAW_SCHEDULER_LATENCY_CLEANUP_INTERVAL_S | 30.0 | float | > 0, <= 600 | Interval for cleaning expired latency entries |
| MACAW_SCHEDULER_LATENCY_TTL_S | 60.0 | float | > 0, <= 600 | TTL for latency tracker entries |
| MACAW_SCHEDULER_CANCEL_PROPAGATION_TIMEOUT_S | 0.1 | float | > 0, <= 10 | Timeout for cancel propagation to workers |

Streaming WebSocket requests bypass the scheduler entirely and use a direct gRPC streaming connection for minimum latency. The scheduler settings only affect batch (REST API) requests.
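
The timeout formula from the table, timeout = max(min, duration * factor), works out as follows with the defaults. The function name `grpc_timeout_s` is hypothetical; only the formula comes from the reference.

```python
def grpc_timeout_s(
    audio_duration_s: float,
    min_timeout_s: float = 30.0,  # MACAW_SCHEDULER_MIN_GRPC_TIMEOUT_S
    factor: float = 2.0,          # MACAW_SCHEDULER_TIMEOUT_FACTOR
) -> float:
    """Scale the timeout with audio length, but never go below the minimum."""
    return max(min_timeout_s, audio_duration_s * factor)


if __name__ == "__main__":
    for duration in (5.0, 15.0, 60.0):
        print(f"{duration:>5.1f} s audio -> {grpc_timeout_s(duration):.1f} s timeout")
```

With the defaults, any clip shorter than 15 seconds gets the 30-second floor, and longer clips get twice their duration.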


gRPC

Message size limits for the internal runtime-to-worker gRPC protocol.

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_GRPC_MAX_BATCH_MESSAGE_MB | 30 | int | 1--500 | Max message size for batch requests |
| MACAW_GRPC_MAX_STREAMING_MESSAGE_MB | 10 | int | 1--100 | Max message size for streaming frames |

MACAW_GRPC_MAX_BATCH_MESSAGE_MB must be >= MACAW_MAX_FILE_SIZE_MB to accept the largest uploads.


Preprocessing

Audio preprocessing pipeline (runs before VAD and engine).

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_PREPROCESSING_DC_CUTOFF_HZ | 20 | int | 1--500 | High-pass filter cutoff. 20 Hz for full-band, 80+ Hz for telephony |
| MACAW_PREPROCESSING_TARGET_DBFS | -3.0 | float | -60 to 0 | Gain normalization target. -3.0 dBFS = industry standard headroom |

Post-Processing

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_ITN_LANGUAGE | pt | string | -- | Language for Inverse Text Normalization (NeMo) |

CLI

Settings for the macaw command-line client.

| Variable | Default | Type | Range | Description |
| --- | --- | --- | --- | --- |
| MACAW_SERVER_URL | http://localhost:8000 | string | -- | Target server URL |
| MACAW_HTTP_TIMEOUT_S | 120.0 | float | > 0 | HTTP request timeout for all CLI commands |

Logging

Logging variables are read directly by macaw.logging at import time, before pydantic-settings is initialized. They are NOT part of MacawSettings.

| Variable | Default | Type | Options | Description |
| --- | --- | --- | --- | --- |
| MACAW_LOG_FORMAT | console | string | console, json | Log output format |
| MACAW_LOG_LEVEL | INFO | string | DEBUG, INFO, WARNING, ERROR | Log verbosity |

Production logging

```bash
MACAW_LOG_FORMAT=json
MACAW_LOG_LEVEL=WARNING
```

Docker / Kubernetes

Example configuration for containerized deployments:

docker-compose.yml (excerpt)

```yaml
services:
  macaw:
    image: macaw-openvoice:latest
    environment:
      MACAW_HOST: "0.0.0.0"
      MACAW_PORT: "8000"
      MACAW_MODELS_DIR: "/models"
      MACAW_WORKER_HOST: "0.0.0.0"
      MACAW_LOG_FORMAT: "json"
      MACAW_LOG_LEVEL: "INFO"
      MACAW_CORS_ORIGINS: "https://app.example.com"
      MACAW_WORKER_HEALTH_PROBE_TIMEOUT_S: "180"
      MACAW_VAD_SENSITIVITY: "normal"
    volumes:
      - ./models:/models
    ports:
      - "8000:8000"
```

Kubernetes ConfigMap

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: macaw-config
data:
  MACAW_HOST: "0.0.0.0"
  MACAW_PORT: "8000"
  MACAW_MODELS_DIR: "/models"
  MACAW_WORKER_HOST: "0.0.0.0"
  MACAW_LOG_FORMAT: "json"
  MACAW_CORS_ORIGINS: "https://app.example.com"
  MACAW_VAD_SENSITIVITY: "high"
  MACAW_SESSION_SILENCE_TIMEOUT_S: "60"
```

Validation Rules

Macaw validates all environment variables at startup. Cross-field constraints:

| Constraint | Rule |
| --- | --- |
| Heartbeat < Inactivity | MACAW_WS_HEARTBEAT_INTERVAL_S must be less than MACAW_WS_INACTIVITY_TIMEOUT_S |
| Health probe delays | MACAW_WORKER_HEALTH_PROBE_INITIAL_DELAY_S must be less than MACAW_WORKER_HEALTH_PROBE_MAX_DELAY_S |
| Speech duration | MACAW_VAD_MIN_SPEECH_DURATION_MS must be less than MACAW_VAD_MAX_SPEECH_DURATION_MS |
| gRPC vs upload | MACAW_GRPC_MAX_BATCH_MESSAGE_MB should be >= MACAW_MAX_FILE_SIZE_MB |

If validation fails, Macaw prints the exact field and constraint that was violated and exits immediately.
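
The cross-field rules can be checked before deployment with a small standalone script such as the sketch below. It does not use Macaw's own pydantic-settings models; `cross_field_violations` and the `DEFAULTS` mapping are hypothetical, and the gRPC rule is treated as a hard failure here even though the table words it as "should".

```python
from __future__ import annotations

from typing import Mapping

# Documented defaults for the variables involved in cross-field rules.
DEFAULTS = {
    "MACAW_WS_HEARTBEAT_INTERVAL_S": 10.0,
    "MACAW_WS_INACTIVITY_TIMEOUT_S": 60.0,
    "MACAW_WORKER_HEALTH_PROBE_INITIAL_DELAY_S": 0.5,
    "MACAW_WORKER_HEALTH_PROBE_MAX_DELAY_S": 5.0,
    "MACAW_VAD_MIN_SPEECH_DURATION_MS": 250,
    "MACAW_VAD_MAX_SPEECH_DURATION_MS": 30000,
    "MACAW_GRPC_MAX_BATCH_MESSAGE_MB": 30,
    "MACAW_MAX_FILE_SIZE_MB": 25,
}

# (left variable, comparison, right variable)
RULES = [
    ("MACAW_WS_HEARTBEAT_INTERVAL_S", "<", "MACAW_WS_INACTIVITY_TIMEOUT_S"),
    ("MACAW_WORKER_HEALTH_PROBE_INITIAL_DELAY_S", "<", "MACAW_WORKER_HEALTH_PROBE_MAX_DELAY_S"),
    ("MACAW_VAD_MIN_SPEECH_DURATION_MS", "<", "MACAW_VAD_MAX_SPEECH_DURATION_MS"),
    ("MACAW_GRPC_MAX_BATCH_MESSAGE_MB", ">=", "MACAW_MAX_FILE_SIZE_MB"),
]


def cross_field_violations(overrides: Mapping[str, float]) -> list[str]:
    """Return one message per violated rule; an empty list means all rules hold."""
    settings = {**DEFAULTS, **overrides}
    violations = []
    for left, op, right in RULES:
        ok = settings[left] < settings[right] if op == "<" else settings[left] >= settings[right]
        if not ok:
            violations.append(f"{left} must be {op} {right}")
    return violations


if __name__ == "__main__":
    # Raising the upload limit without raising the gRPC limit trips the last rule.
    print(cross_field_violations({"MACAW_MAX_FILE_SIZE_MB": 100}))
```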

