
Speech-to-Text

Transcribe audio files to text, with optional translation to English.

Transcribe Audio

POST /v1/audio/transcriptions
Transcribe audio to text in the original language.

Request Body (multipart/form-data)

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| file | file | Yes | | Audio file (mp3, wav, m4a, ogg, flac, webm) |
| model | string | Yes | | STT model (see below) |
| language | string | No | | ISO-639-1 language code (e.g., en, fr) |
| prompt | string | No | | Context hint to guide transcription |
| response_format | string | No | json | Output format |
| temperature | number | No | 0 | Sampling temperature (0.0-1.0) |
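
The constraints in the parameter table can be checked on the client before uploading a file. The following is a minimal sketch rather than part of the API: `validateTranscriptionOptions` is a hypothetical helper, the format names are the ones listed under Response Formats, and the lowercase two-letter language pattern is an assumption about how strictly the server reads ISO-639-1 codes.

```javascript
// Hypothetical client-side check of the documented request parameters.
const RESPONSE_FORMATS = ['json', 'text', 'srt', 'vtt', 'verbose_json'];

function validateTranscriptionOptions(opts) {
  const errors = [];

  // file and model are the two required parameters.
  if (!opts.file) errors.push('file is required');
  if (typeof opts.model !== 'string' || opts.model === '') {
    errors.push('model is required');
  }

  // Assumption: language codes are two lowercase letters (e.g., en, fr).
  if (opts.language !== undefined && !/^[a-z]{2}$/.test(opts.language)) {
    errors.push('language must be a two-letter ISO-639-1 code');
  }

  // Fall back to the documented defaults: json and 0.
  const format = opts.response_format ?? 'json';
  if (!RESPONSE_FORMATS.includes(format)) {
    errors.push(`unsupported response_format: ${format}`);
  }

  const temperature = opts.temperature ?? 0;
  const inRange = typeof temperature === 'number' && temperature >= 0 && temperature <= 1;
  if (!inRange) errors.push('temperature must be between 0.0 and 1.0');

  return errors;
}
```

An empty array means the options match the table. The server remains the authority on what it accepts, so this only catches obvious mistakes early.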

Available Models

| Model | Description |
|---|---|
| whisper-1 | OpenAI Whisper compatible |
| whisper-large-v3-turbo | Fast, accurate |
| stt-1b-en_fr | English/French optimized |

Response Formats

| Format | Description |
|---|---|
| json | Simple JSON with text |
| text | Plain text |
| srt | SubRip subtitle format |
| vtt | WebVTT subtitle format |
| verbose_json | JSON with word timestamps |
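
Because only two of the five formats are JSON, client code has to decide how to read the response body. The sketch below dispatches on the format that was requested. `readTranscriptionBody` is a hypothetical helper, and it assumes that `text`, `srt`, and `vtt` responses arrive as a raw text body, which this page does not state explicitly.

```javascript
// Hypothetical helper: interpret a response body according to the
// response_format that was sent with the request.
function readTranscriptionBody(responseFormat, body) {
  if (responseFormat === 'json' || responseFormat === 'verbose_json') {
    // Both JSON formats carry the transcript in a top-level "text" field.
    return JSON.parse(body).text;
  }
  // Assumption: text, srt, and vtt are returned as plain text, unchanged.
  return body;
}
```

Dispatching on the requested format avoids relying on the response Content-Type header, which is not documented here.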

Example Request

curl -X POST https://api.ymq-01.edge.polargrid.ai:55111/v1/audio/transcriptions \
  -H "Authorization: Bearer pg_your_api_key" \
  -F "file=@recording.mp3" \
  -F "model=whisper-1" \
  -F "language=en" \
  -F "response_format=json"
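
The same request can be made from JavaScript with `fetch` and `FormData` (available globally in browsers and in Node 18 or later). This is a sketch under assumptions, not the official SDK: the `transcribe` function, its `audioBlob` and `apiKey` arguments, and the injectable `fetchImpl` parameter are all illustrative names.

```javascript
// Sketch: the curl request above, expressed with fetch and FormData.
// fetchImpl is injectable so the function can be exercised without a network.
async function transcribe(audioBlob, apiKey, fetchImpl = fetch) {
  const form = new FormData();
  form.append('file', audioBlob, 'recording.mp3');
  form.append('model', 'whisper-1');
  form.append('language', 'en');
  form.append('response_format', 'json');

  const url = 'https://api.ymq-01.edge.polargrid.ai:55111/v1/audio/transcriptions';
  const res = await fetchImpl(url, {
    method: 'POST',
    // Do not set Content-Type manually: fetch adds the multipart boundary itself.
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });

  if (!res.ok) {
    throw new Error(`Transcription failed with HTTP ${res.status}`);
  }
  // With response_format=json the body is an object with a "text" field.
  return (await res.json()).text;
}
```

In a browser, `audioBlob` could be a `File` taken from a file input; in Node, it could be a `Blob` built from the bytes of a file read from disk.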

Response (json)

{
  "text": "Hello, this is a test recording."
}

Response (verbose_json)

{
  "task": "transcribe",
  "language": "en",
  "duration": 5.2,
  "text": "Hello, this is a test recording.",
  "segments": [
    {
      "id": 0,
      "seek": 0,
      "start": 0.0,
      "end": 2.5,
      "text": "Hello, this is",
      "tokens": [1234, 5678],
      "temperature": 0.0,
      "avg_logprob": -0.25,
      "compression_ratio": 1.2,
      "no_speech_prob": 0.01
    },
    {
      "id": 1,
      "seek": 250,
      "start": 2.5,
      "end": 5.2,
      "text": "a test recording.",
      "tokens": [9012, 3456],
      "temperature": 0.0,
      "avg_logprob": -0.3,
      "compression_ratio": 1.1,
      "no_speech_prob": 0.02
    }
  ]
}
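
The `start` and `end` fields of each segment are offsets in seconds, which is enough to build subtitle cues on the client. The sketch below turns the `segments` array into SubRip text. `toSrtTime` and `segmentsToSrt` are hypothetical helpers written for illustration; requesting `response_format=srt` is the direct way to get subtitles from the API.

```javascript
// Format a number of seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width) => String(n).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Build numbered SRT cues (starting at 1) from verbose_json segments.
function segmentsToSrt(segments) {
  return segments
    .map((seg, i) => {
      const range = `${toSrtTime(seg.start)} --> ${toSrtTime(seg.end)}`;
      return `${i + 1}\n${range}\n${seg.text.trim()}`;
    })
    .join('\n\n');
}
```

For the two segments in the response above, this yields one cue running from 00:00:00,000 to 00:00:02,500 and a second from 00:00:02,500 to 00:00:05,200.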

Translate Audio

POST /v1/audio/translations
Translate audio to English text (from any supported language).

Example

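// Assumes `client` is an already-initialized SDK client and
// `spanishAudioFile` holds the Spanish audio to translate.
const translation = await client.translate({
  file: spanishAudioFile,
  model: 'whisper-1',
  responseFormat: 'json',
});

console.log(translation.text); // English text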

Supported Audio Formats

  • MP3
  • WAV
  • M4A
  • OGG
  • FLAC
  • WebM
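
A quick extension check against this list can reject obviously unsupported files before they are uploaded. This is a minimal sketch: `isSupportedAudioFile` is a hypothetical helper, and matching on the file extension is only a heuristic, since this page does not say how the server detects the audio format.

```javascript
// Extensions taken from the supported formats list above.
const SUPPORTED_EXTENSIONS = ['mp3', 'wav', 'm4a', 'ogg', 'flac', 'webm'];

// Hypothetical helper: true when the filename ends in a supported extension.
function isSupportedAudioFile(filename) {
  const dot = filename.lastIndexOf('.');
  if (dot === -1) return false; // no extension at all
  const ext = filename.slice(dot + 1).toLowerCase();
  return SUPPORTED_EXTENSIONS.includes(ext);
}
```

The comparison is case-insensitive, so `recording.MP3` passes while `notes.txt` does not.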