Overview — Prosody

What Prosody does

Prosody scores spoken audio against known text. It returns per-word pronunciation scores, phoneme detail, timing, and script adherence.

It is intended for read-aloud exercises, assessment, coaching, QA, and other workflows where the expected utterance is known in advance. It is not a general-purpose speech-to-text API.

API shape

Base URL: https://api.prosody.studio
Endpoint: POST /v1/scores
Input: base64-encoded audio, sample rate, language, reference text
Output: overall score, word list, phonemes, timings, warnings
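
As a rough illustration of the shape described above, here is a minimal Python sketch that assembles a request for POST /v1/scores without sending it. Only the base URL, the path, and the four kinds of input come from this overview. The JSON field names (audio, sample_rate, language, reference_text), the JSON body format, the Bearer authorization header, and the helper name build_score_request are assumptions made for illustration, not the documented Prosody schema.

```python
import base64
import json
import urllib.request

BASE_URL = "https://api.prosody.studio"  # from the overview
SCORES_PATH = "/v1/scores"               # from the overview


def build_score_request(audio_bytes, sample_rate, language, reference_text, api_key=None):
    """Build (but do not send) a POST /v1/scores request.

    ASSUMPTION: the JSON field names and the Bearer auth header below are
    hypothetical; the overview only lists the kinds of input, not their names.
    """
    payload = {
        "audio": base64.b64encode(audio_bytes).decode("ascii"),  # base64 audio
        "sample_rate": sample_rate,                               # e.g. 16000
        "language": language,                                     # e.g. "en-US"
        "reference_text": reference_text,                         # the known text
    }
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # ASSUMPTION: the real authentication scheme may differ.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        BASE_URL + SCORES_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen and the response body parsed with json.loads. The overview lists the output as an overall score, word list, phonemes, timings, and warnings, but does not name the response fields, so the sketch stops short of parsing them.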

Access

The playground works without a key for a small daily trial. For API access, email [email protected].