Docs
Prosody scores pronunciation from an audio recording and the reference text the speaker was expected to say.
Quickstart
Try the playground without a key, or call the API with an API key.
curl -X POST https://api.prosody.studio/v1/scores \
  -H "X-API-Key: $PROSODY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_data": "<base64 audio>",
    "sample_rate": 16000,
    "language": "en-US",
    "reference_text": "The quick brown fox"
  }'
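The same request can be sketched in TypeScript with the standard `fetch` API instead of curl. This is a minimal sketch, not the SDK: the helper names `buildScoreRequest` and `scoreRecording` are hypothetical, and the error handling is an assumption because this page does not describe error responses.

```typescript
// Request body fields, taken from the request table on this page.
interface ScoreRequest {
  audio_data: string; // base64-encoded audio
  sample_rate: number; // Hz
  language: string; // "en-US" today
  reference_text: string;
}

interface FetchArgs {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

const SCORES_URL = "https://api.prosody.studio/v1/scores";

// Hypothetical helper: builds the fetch arguments without sending anything.
function buildScoreRequest(apiKey: string, body: ScoreRequest): FetchArgs {
  return {
    url: SCORES_URL,
    init: {
      method: "POST",
      headers: {
        "X-API-Key": apiKey,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Hypothetical helper: sends the request and returns the parsed JSON.
// Treating any non-2xx status as an error is an assumption.
async function scoreRecording(apiKey: string, body: ScoreRequest): Promise<unknown> {
  const { url, init } = buildScoreRequest(apiKey, body);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`Scoring request failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

Keeping request construction separate from the network call makes the headers and body easy to inspect before anything is sent.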
Authentication
Send your key in the X-API-Key header. The playground
has a small no-key trial for evaluation.
POST /v1/scores
Score one recording against one reference text.
| Field | Required | Description |
|---|---|---|
| audio_data | Yes | Base64-encoded audio. |
| sample_rate | Yes | Audio sample rate in Hz. |
| language | Yes | Use en-US today. |
| reference_text | Yes | The text the speaker was expected to say. |
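As an illustration of how the four fields fit together, the sketch below builds a request body from raw audio bytes. The names `toBase64` and `buildScoreBody` are hypothetical, and the client-side checks are assumptions rather than documented API rules. This page does not state which audio container or encoding is expected, so the bytes are passed through unchanged.

```typescript
interface ScoreRequest {
  audio_data: string;
  sample_rate: number;
  language: string;
  reference_text: string;
}

// Base64-encode bytes using btoa, which is available in browsers and recent Node.js.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  bytes.forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return btoa(binary);
}

// Hypothetical helper: assembles the body and rejects obviously empty input.
// These checks are assumptions made for this sketch, not rules from the API.
function buildScoreBody(
  audio: Uint8Array,
  sampleRate: number,
  referenceText: string
): ScoreRequest {
  if (audio.length === 0) {
    throw new Error("audio is empty");
  }
  if (!Number.isFinite(sampleRate) || sampleRate <= 0) {
    throw new Error("sample_rate must be a positive number of Hz");
  }
  if (referenceText.trim() === "") {
    throw new Error("reference_text is empty");
  }
  return {
    audio_data: toBase64(audio),
    sample_rate: sampleRate,
    language: "en-US", // the only value this page lists
    reference_text: referenceText,
  };
}
```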
Response
The response includes aggregate scores and per-word detail. Full responses may also include phoneme timing, insertions, warnings, and mismatch diagnostics.
{
  "scores": {
    "pronunciation": 72.4,
    "script_adherence": 100.0,
    "overall": 72.4
  },
  "words": [
    {
      "word": "the",
      "status": "match",
      "acoustic_match": 68.1,
      "timing": { "start": 0.12, "end": 0.24, "duration_ms": 120 },
      "phonemes": [
        { "detected": "DH", "acoustic_match": 71.2 }
      ]
    }
  ]
}
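One way to work with this shape in TypeScript is sketched below. The interfaces are inferred from the sample response only, so they are assumptions: real responses may carry extra fields (phoneme timing, insertions, warnings, mismatch diagnostics), and some fields may be optional. The helper `wordsBelow` and the threshold of 70 are hypothetical, chosen only to illustrate reading per-word detail.

```typescript
// Types inferred from the sample response; not an official schema.
interface PhonemeDetail {
  detected: string;
  acoustic_match: number;
}

interface WordDetail {
  word: string;
  status: string;
  acoustic_match: number;
  timing: { start: number; end: number; duration_ms: number };
  phonemes: PhonemeDetail[];
}

interface ScoreResponse {
  scores: { pronunciation: number; script_adherence: number; overall: number };
  words: WordDetail[];
}

// Hypothetical helper: lists words whose acoustic_match is under a threshold.
function wordsBelow(response: ScoreResponse, threshold: number): string[] {
  return response.words
    .filter((w) => w.acoustic_match < threshold)
    .map((w) => w.word);
}

// The sample response from this page, used as test data.
const sample: ScoreResponse = {
  scores: { pronunciation: 72.4, script_adherence: 100.0, overall: 72.4 },
  words: [
    {
      word: "the",
      status: "match",
      acoustic_match: 68.1,
      timing: { start: 0.12, end: 0.24, duration_ms: 120 },
      phonemes: [{ detected: "DH", acoustic_match: 71.2 }],
    },
  ],
};
```

With the sample above, `wordsBelow(sample, 70)` returns `["the"]`, because the word's `acoustic_match` of 68.1 is under the threshold.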
TypeScript SDK
Use @prosody/sdk if you want typed requests and
response validation.
npm install @prosody/sdk
Privacy
- Audio is processed in memory and discarded after scoring.
- Audio is not sent to third-party speech APIs.
- Submitted audio and text are not used for training.
Request an API key
Email [email protected] with what you are building and roughly how much audio you expect to score.