SPEECH-TO-TEXT ROUTING API

Transcription routing, normalized.

One API for production speech-to-text across providers, features, and pricing rules.

  • Provider fallback & load balancing
  • Feature-aware routing
  • Normalized stt.v1 output

audio.wav → Route → provider → response.json

  • Deepgram (p95 8.4s)
  • AssemblyAI (p95 9.1s)
  • Gladia (p95 7.9s)

response.json (stt.v1)
{
  "text": "The quick brown fox...",
  "language": { "detected": "en" },
  "segments": [
    { "speaker": "speaker_0", "start": 0.00 }
  ]
}
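
To make "feature-aware routing" with fallback concrete, here is a minimal Python sketch. It is not AudioRouter's actual routing logic. The p95 figures are the ones shown above; the per-provider feature sets, the `Provider` class, and the `candidates` and `route` helpers are hypothetical placeholders chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    p95_seconds: float
    features: frozenset


# p95 values come from the page; the feature sets are made-up placeholders
# and say nothing about what each provider really supports.
PROVIDERS = [
    Provider("deepgram", 8.4, frozenset({"diarization", "word_timestamps", "language_detection"})),
    Provider("assemblyai", 9.1, frozenset({"diarization", "word_timestamps"})),
    Provider("gladia", 7.9, frozenset({"diarization", "language_detection"})),
]


def candidates(required):
    """Return providers that support every required feature, lowest p95 first."""
    need = frozenset(required)
    eligible = [p for p in PROVIDERS if need <= p.features]
    return sorted(eligible, key=lambda p: p.p95_seconds)


def route(required, transcribe):
    """Try each candidate in order and fall back to the next one on failure."""
    errors = []
    for provider in candidates(required):
        try:
            return provider.name, transcribe(provider)
        except Exception as exc:  # a real router would catch narrower errors
            errors.append((provider.name, str(exc)))
    raise RuntimeError(f"no provider succeeded: {errors}")
```

Here `transcribe` is any callable that takes a `Provider` and returns a transcript; if it raises, the next eligible provider is tried.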

Developer experience

One request. One schema. Any provider.

Submit audio with your model and feature preferences. AudioRouter handles provider-specific options and returns a normalized stt.v1 transcript every time.

Managed or BYOK
Use managed credits, or bring your own provider keys (BYOK) and pay a transparent routing fee.

Feature matrix
Diarization, word timestamps, and language detection, each priced and resolved per model.

Webhooks
When a job completes, AudioRouter delivers a webhook signed with your team's webhook secret.
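
The page does not say how webhook deliveries are signed, so the following is only a sketch of what verification could look like on the receiving side. It assumes an HMAC-SHA256 hex digest computed over the raw request body with the team webhook secret; the actual algorithm, encoding, and header name may differ, and `sign` and `verify` are hypothetical helpers.

```python
import hashlib
import hmac


def sign(secret: str, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed scheme, not confirmed by the page)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify(secret: str, body: bytes, received_signature: str) -> bool:
    """Compare in constant time so the check does not leak timing information."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected, received_signature)
```

Verify against the exact bytes received, before any JSON parsing, since re-serializing the payload can change the digest.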
create-transcription.sh
$ curl -X POST https://api.audiorouter.dev/v1/transcriptions \
  -H "Authorization: Bearer ar_..." \
  -F "model=deepgram/nova-3" \
  -F "audio_url=https://example.com/call.wav" \
  -F "features[]=diarization" \
  -F "features[]=word_timestamps"
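
For callers who prefer Python, the same request might be built with the standard library roughly as follows. This is a sketch that mirrors the curl form fields above; `build_request` is a hypothetical helper, and the API key is passed in rather than hard-coded.

```python
import urllib.request
import uuid

API_URL = "https://api.audiorouter.dev/v1/transcriptions"


def build_request(api_key, model, audio_url, features):
    """Build (but do not send) a multipart/form-data POST like the curl example."""
    fields = [("model", model), ("audio_url", audio_url)]
    # Repeated "features[]" fields, one per requested feature.
    fields += [("features[]", feature) for feature in features]

    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields:
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    body = "".join(parts).encode("utf-8")

    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the result to `urllib.request.urlopen` would send it; the helper stops short of that so the request can be inspected first.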
stt.v1 response (normalized)
{
  "text": "Hello there. How are you doing today?",
  "language": {
    "detected": "en",
    "confidence": 0.98
  },
  "segments": [
    {
      "text": "Hello there.",
      "speaker": "speaker_0",
      "start": 0.12,
      "end": 1.80,
      "words": [ ... ]
    }
  ]
}
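
Because every provider's output arrives in the same shape, client code can stay small. The sketch below turns the sample response above into speaker-labelled lines; `format_segments` is a hypothetical helper, and the `words` array is left empty because its contents are elided in the sample.

```python
import json

SAMPLE = """
{
  "text": "Hello there. How are you doing today?",
  "language": {"detected": "en", "confidence": 0.98},
  "segments": [
    {"text": "Hello there.", "speaker": "speaker_0",
     "start": 0.12, "end": 1.80, "words": []}
  ]
}
"""


def format_segments(transcript):
    """Render each segment as '[start-end] speaker: text'."""
    lines = []
    for seg in transcript.get("segments", []):
        speaker = seg.get("speaker", "unknown")
        lines.append(f"[{seg['start']:.2f}-{seg['end']:.2f}] {speaker}: {seg['text']}")
    return lines


transcript = json.loads(SAMPLE)
print("\n".join(format_segments(transcript)))
# prints: [0.12-1.80] speaker_0: Hello there.
```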

Start transcribing in minutes

Create an API key, pick a model, and send your first job. Pay only for what you use.