Voice AI
PolarGrid provides low-latency voice capabilities at the edge.
Text-to-Speech (TTS)
Convert text to natural-sounding speech.
Basic Usage
Available Voices
Standard voices (all models):
| Voice | Style |
|---|---|
| alloy | Neutral, balanced |
| echo | Warm, conversational |
| fable | Expressive, storytelling |
| onyx | Deep, authoritative |
| nova | Friendly, upbeat |
| shimmer | Soft, gentle |
Additional voices (kokoro-82m model):
| Voice | Description |
|---|---|
| af_bella, af_sarah | Female American |
| am_adam, am_michael | Male American |
| bf_emma, bf_isabella | Female British |
| bm_george, bm_lewis | Male British |
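The two voice tables can be turned into a small client-side check that catches an unsupported voice before a request is sent. This is a minimal sketch, not part of any PolarGrid SDK: the helper names `voices_for_model` and `check_voice` are hypothetical, and it assumes the second table applies only to the `kokoro-82m` model while the standard voices work everywhere.

```python
# Voice names copied from the two tables above.
STANDARD_VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
KOKORO_VOICES = {
    "af_bella", "af_sarah", "am_adam", "am_michael",
    "bf_emma", "bf_isabella", "bm_george", "bm_lewis",
}


def voices_for_model(model: str) -> set[str]:
    """Return the voice names the tables list for a model (hypothetical helper)."""
    if model == "kokoro-82m":
        # Assumption: kokoro-82m accepts the standard voices plus its own.
        return STANDARD_VOICES | KOKORO_VOICES
    return set(STANDARD_VOICES)


def check_voice(model: str, voice: str) -> str:
    """Return the voice unchanged, or raise ValueError if the tables do not list it."""
    allowed = voices_for_model(model)
    if voice not in allowed:
        raise ValueError(
            f"voice {voice!r} is not listed for model {model!r}; "
            f"choose one of {sorted(allowed)}"
        )
    return voice
```

Calling `check_voice("kokoro-82m", "bf_emma")` returns the voice name, while the same voice with any other model name raises `ValueError`.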
Speed Control
Adjust playback speed from 0.25x to 4.0x.
Audio Formats
| Format | Use Case |
|---|---|
| mp3 | General purpose, widely compatible |
| opus | Efficient streaming, low bandwidth |
| aac | Apple ecosystem |
| flac | Lossless, archival |
| wav | Uncompressed, editing |
| pcm | Raw audio, processing |
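The speed range and the format table can be enforced together when assembling a synthesis request. The sketch below only builds and validates a plain dictionary. The function name `build_tts_payload`, the field names (`input`, `voice`, `speed`, `response_format`), and the defaults of `1.0` and `mp3` are assumptions for illustration, not the documented PolarGrid request schema.

```python
# Limits and format names taken from the Speed Control and Audio Formats sections.
MIN_SPEED = 0.25
MAX_SPEED = 4.0
AUDIO_FORMATS = ("mp3", "opus", "aac", "flac", "wav", "pcm")


def build_tts_payload(
    text: str,
    voice: str,
    speed: float = 1.0,          # assumed default: normal playback speed
    audio_format: str = "mp3",   # assumed default: the general-purpose format
) -> dict:
    """Validate speed and format, then return a request body as a dict."""
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(
            f"speed must be between {MIN_SPEED} and {MAX_SPEED}, got {speed}"
        )
    if audio_format not in AUDIO_FORMATS:
        raise ValueError(
            f"unsupported format {audio_format!r}; expected one of {AUDIO_FORMATS}"
        )
    # Field names below are hypothetical placeholders.
    return {
        "input": text,
        "voice": voice,
        "speed": speed,
        "response_format": audio_format,
    }
```

For a low-bandwidth streaming case, `build_tts_payload("Hello", "nova", speed=1.5, audio_format="opus")` would pass validation, while a speed of `5.0` would be rejected locally.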
Speech-to-Text (STT)
Transcribe audio to text.
Basic Transcription
Verbose Output with Timestamps
Get word-level timestamps.
Subtitle Formats
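When only word-level timestamps are at hand, subtitle cues can also be assembled on the client. The following is a rough sketch of that idea using the SubRip (SRT) cue layout. It assumes each word arrives as a dictionary with `word`, `start`, and `end` keys (times in seconds). That shape, the helper names, and the `max_words` grouping rule are illustrative assumptions rather than the documented response format.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[dict], max_words: int = 7) -> str:
    """Group timestamped words into numbered SRT cues of at most max_words each."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = srt_timestamp(chunk[0]["start"])   # cue starts with its first word
        end = srt_timestamp(chunk[-1]["end"])      # and ends with its last word
        text = " ".join(w["word"] for w in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    # Cues are separated by a blank line, as the SRT layout expects.
    return "\n".join(cues)
```

Grouping by a fixed word count keeps the sketch short. A production version would more likely break cues on pauses or punctuation.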
Generate subtitles directly.
Translation
Translate audio from any language to English.
Real-Time Voice Chat
Combine TTS and STT for voice conversations.
Supported Audio Formats
For transcription and translation:
- MP3
- WAV
- M4A
- OGG
- FLAC
- WebM
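The list above lends itself to a quick pre-upload check. This is a minimal sketch that assumes each listed format maps to its usual file extension. The function name `is_supported_audio` is hypothetical, and the check looks only at the file name, not at the actual audio encoding.

```python
from pathlib import Path

# Extensions assumed to correspond to the formats listed above.
SUPPORTED_INPUT_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".webm"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension matches a listed input format."""
    return Path(path).suffix.lower() in SUPPORTED_INPUT_EXTENSIONS
```

Because the comparison is case-insensitive, `is_supported_audio("meeting.M4A")` is accepted, while a file without an extension is rejected.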
