Pronunciation API overview
Send audio and reference text. Get structured pronunciation results.
What Prosody does
Prosody scores spoken audio against known text. It returns per-word pronunciation scores, phoneme detail, timing, and script adherence.
It is for read-aloud, assessment, coaching, QA, and other workflows where the expected utterance is known. It is not a generic speech-to-text API.
API shape
- Endpoint: POST /v1/scores
- Input: base64 audio, sample rate, language, reference text
- Output: overall score, word list, phonemes, timings, warnings
- Base URL: https://api.prosody.studio
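To make the request shape concrete, here is a minimal Python sketch that builds a scoring request from the inputs listed above. Only the base URL and the POST /v1/scores path come from this overview. The JSON field names (audio, sample_rate, language, reference_text), the JSON content type, and the "en-US" language code format are assumptions for illustration, and the helper name build_score_request is hypothetical. Authentication is left out because the header scheme is not described here.

```python
import base64
import json
import urllib.request

BASE_URL = "https://api.prosody.studio"  # from the overview
SCORES_PATH = "/v1/scores"               # from the overview


def build_score_request(audio_bytes, sample_rate, language, reference_text):
    """Build (but do not send) a POST request for the scoring endpoint.

    The JSON keys below are assumed for illustration: the overview lists
    the inputs but not their exact field names.
    """
    payload = {
        # Audio is sent base64-encoded, as the overview states.
        "audio": base64.b64encode(audio_bytes).decode("ascii"),  # assumed key
        "sample_rate": sample_rate,                               # assumed key
        "language": language,                                     # assumed key
        "reference_text": reference_text,                         # assumed key
    }
    return urllib.request.Request(
        BASE_URL + SCORES_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


# Placeholder bytes stand in for real audio; "en-US" is an assumed code format.
request = build_score_request(b"\x00\x01" * 8000, 16000, "en-US", "The quick brown fox")
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the payload easy to inspect before any network call. Passing the result to urllib.request.urlopen would send it, once the real field names and authentication details are confirmed.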
Access
The playground works without a key for a small daily trial. For API access, email [email protected].