SPEECH-TO-TEXT ROUTING API

Transcription routing, normalized.

One API for production speech-to-text across providers, features, and pricing rules.


Provider fallback & load balancing
Feature-aware routing
Normalized stt.v1 output

Routing diagram: audio.wav enters the router, with Deepgram (p95 8.4s), AssemblyAI (p95 9.1s), and Gladia (p95 7.9s) as candidate providers.

response.json (stt.v1)

{
  "text": "The quick brown fox...",
  "language": "en",
  "segments": [
    { "speaker": "speaker_0", "start": 0.00 }
  ]
}

Popular STT Models

Deepgram Nova-3: $0.0043 / min · p95 8.4s
AssemblyAI Best: $0.0026 / min · p95 9.1s
Gladia Standard: $0.0039 / min · p95 7.9s
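
As a rough illustration of how the per-minute prices listed above translate into a bill, the sketch below multiplies each rate by the audio duration. It assumes billing is strictly proportional to duration, with no rounding, no feature surcharges, and no routing fee; none of that is confirmed by this page, and `estimate_cost` is a hypothetical helper, not part of the API.

```python
from decimal import Decimal

# Per-minute prices copied from the model list on this page.
PRICE_PER_MINUTE = {
    "Deepgram Nova-3": Decimal("0.0043"),
    "AssemblyAI Best": Decimal("0.0026"),
    "Gladia Standard": Decimal("0.0039"),
}


def estimate_cost(model: str, duration_seconds: int) -> Decimal:
    """Estimate the base transcription cost in dollars.

    Assumption: cost scales linearly with duration. Actual billing
    (rounding, per-feature pricing, routing fees) may differ.
    """
    minutes = Decimal(duration_seconds) / Decimal(60)
    return PRICE_PER_MINUTE[model] * minutes


# Compare the three models for one hour (3600 seconds) of audio.
for model in PRICE_PER_MINUTE:
    print(f"{model}: ${estimate_cost(model, 3600):.4f} per hour of audio")
```

Using `Decimal` rather than floats keeps small per-minute rates exact when they are multiplied out.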


Developer experience

One request. One schema. Any provider.

Submit audio with your model and feature preferences. AudioRouter handles provider-specific options and returns a normalized stt.v1 transcript every time.

Managed or BYOK: Use credits or bring your own provider keys with a transparent routing fee.
Feature matrix: Diarization, word timestamps, and language detection, each priced and resolved per model.
Webhooks: Signed delivery with your team webhook secret when jobs complete.
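
The page says only that webhook deliveries are signed with the team webhook secret; it does not specify the scheme. The sketch below shows one common approach, assuming an HMAC-SHA256 hex digest computed over the raw request body. The function names, the example secret, and the payload fields are all hypothetical, so check the actual documentation for the real header name and algorithm.

```python
import hashlib
import hmac


def sign_payload(secret: str, payload: bytes) -> str:
    """Compute an HMAC-SHA256 hex digest of the raw body (assumed scheme)."""
    return hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()


def verify_webhook(secret: str, payload: bytes, signature: str) -> bool:
    """Return True if the signature matches the payload.

    compare_digest runs in constant time, which avoids leaking
    information about the expected signature through timing.
    """
    expected = sign_payload(secret, payload)
    return hmac.compare_digest(expected, signature)


# Hypothetical secret and payload, for illustration only.
secret = "example_webhook_secret"
body = b'{"id": "job_123", "status": "completed"}'
signature = sign_payload(secret, body)

print(verify_webhook(secret, body, signature))         # True
print(verify_webhook(secret, body + b" ", signature))  # False
```

Verification should run against the raw bytes of the request body, before any JSON parsing, because re-serializing the payload can change whitespace and invalidate the signature.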

create-transcription.sh

$ curl -X POST https://api.audiorouter.dev/v1/transcriptions \
  -H "Authorization: Bearer ar_..." \
  -F "model=deepgram/nova-3" \
  -F "audio_url=https://example.com/call.wav" \
  -F "features[]=diarization" \
  -F "features[]=word_timestamps"
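
For callers who prefer Python, the same request could be assembled with the standard library along these lines. This is a sketch that mirrors the curl form fields above and only builds the request without sending it; the `AUDIOROUTER_API_KEY` environment variable and the helper names are assumptions, not part of the documented API.

```python
import os
import urllib.request
import uuid

API_URL = "https://api.audiorouter.dev/v1/transcriptions"


def encode_multipart(fields, boundary: str) -> bytes:
    """Encode (name, value) text fields as a multipart/form-data body."""
    lines = []
    for name, value in fields:
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")
        lines.append(value)
    lines.append(f"--{boundary}--")
    lines.append("")
    return "\r\n".join(lines).encode("utf-8")


def build_request(api_key, model, audio_url, features):
    """Build the POST request that corresponds to the curl example."""
    fields = [("model", model), ("audio_url", audio_url)]
    # Repeated "features[]" fields, one per requested feature.
    fields += [("features[]", feature) for feature in features]
    boundary = uuid.uuid4().hex
    return urllib.request.Request(
        API_URL,
        data=encode_multipart(fields, boundary),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Hypothetical environment variable; the key itself is elided on this page.
request = build_request(
    os.environ.get("AUDIOROUTER_API_KEY", "ar_..."),
    "deepgram/nova-3",
    "https://example.com/call.wav",
    ["diarization", "word_timestamps"],
)
# To send it: urllib.request.urlopen(request)
```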

stt.v1 response (normalized)

{
  "text": "Hello there. How are you doing today?",
  "language": {
    "detected": "en",
    "confidence": 0.98
  },
  "segments": [
    {
      "text": "Hello there.",
      "speaker": "speaker_0",
      "start": 0.12,
      "end": 1.80,
      "words": [ ... ]
    }
  ]
}
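
Because every provider's output arrives in the same shape, downstream code can be written once. The sketch below parses the sample response above and totals speaking time per speaker. The `words` array is elided on this page, so it is left empty here, and `speaker_talk_time` is a hypothetical helper that relies only on the fields shown in the sample.

```python
import json
from collections import defaultdict

# Sample stt.v1 response from this page; "words" is elided there,
# so an empty list stands in for it.
RESPONSE = """
{
  "text": "Hello there. How are you doing today?",
  "language": {"detected": "en", "confidence": 0.98},
  "segments": [
    {
      "text": "Hello there.",
      "speaker": "speaker_0",
      "start": 0.12,
      "end": 1.80,
      "words": []
    }
  ]
}
"""


def speaker_talk_time(transcript: dict) -> dict:
    """Sum segment durations (end - start, in seconds) per speaker."""
    totals = defaultdict(float)
    for segment in transcript.get("segments", []):
        totals[segment["speaker"]] += segment["end"] - segment["start"]
    return dict(totals)


transcript = json.loads(RESPONSE)
talk_time = speaker_talk_time(transcript)

print(transcript["language"]["detected"])
for speaker, seconds in talk_time.items():
    print(f"{speaker}: {seconds:.2f}s")
```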

Start transcribing in minutes

Create an API key, pick a model, and send your first job. Pay only for what you use.
