Pricing
Usage-based pricing for guided speech workflows. Start free, then pay per audio-minute scored.
Developer
- Pay per audio-minute
- API key access
- Batch + single endpoints
- Full scoring pipeline
- TypeScript SDK available
Growth
- Volume pricing
- Priority support
- Dedicated infrastructure + warm capacity
- SLA available
- SDK support + roadmap access
Cost breakdown
Billed per audio-minute scored, at $0.004 per minute. A typical 30-second session costs $0.002.
| Volume (sessions/month) | Cost per 30 s session | Monthly cost | Annual cost |
|---|---|---|---|
| 1,000 | $0.002 | $2 | $24 |
| 10,000 | $0.002 | $20 | $240 |
| 100,000 | $0.002 | $200 | $2,400 |
| 1,000,000 | $0.002 | $2,000 | $24,000 |
Billed in 1-second increments. Minimum charge: 1 second per request. Volume pricing is available on the Growth tier.
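The table figures can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of the SDK: the function names are hypothetical, the $0.004 per-minute rate is the published rate quoted on this page, and rounding partial seconds up to the next whole second is an assumption, since the page only states 1-second increments and a 1-second minimum.

```typescript
// Published rate from this page: $0.004 per audio-minute.
const RATE_PER_MINUTE_USD = 0.004;

// Seconds billed for one request.
// Assumption: partial seconds round UP to the next whole second.
// The 1-second minimum per request is stated on this page.
function billedSeconds(durationSeconds: number): number {
  return Math.max(1, Math.ceil(durationSeconds));
}

// Cost of a single scored request, in USD.
function sessionCostUsd(
  durationSeconds: number,
  ratePerMinuteUsd: number = RATE_PER_MINUTE_USD
): number {
  return (billedSeconds(durationSeconds) / 60) * ratePerMinuteUsd;
}

// Estimated monthly cost for a given volume and average session length.
function monthlyCostUsd(sessionsPerMonth: number, avgDurationSeconds: number): number {
  return sessionsPerMonth * sessionCostUsd(avgDurationSeconds);
}

console.log(sessionCostUsd(30).toFixed(4));          // "0.0020"
console.log(monthlyCostUsd(100000, 30).toFixed(2));  // "200.00"
```

Multiplying the monthly figure by 12 gives the annual column. Growth-tier volume pricing is not modeled here because no discounted rates are published.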
Pricing context
We include Azure Speech as a reference point because it is the pronunciation assessment API that teams most commonly evaluate against. Azure pricing changes over time, so verify current rates on their pricing page before making decisions.
| | Prosody | Azure Speech (as of Q1 2026) |
|---|---|---|
| Published rate | $0.004/min | ~$0.022/min list price |
| Per 30s session | $0.002 | ~$0.011 |
| 100K sessions/mo (annual) | $2,400/yr | ~$13,200/yr |
| Integration posture | REST + SDK | Azure SDK workflow |
| Audio leaves your stack | No | Yes |
| Vendor lock-in | None | Azure |
The Azure rate is derived from the $1.32/hr list price for standard real-time pronunciation assessment ($1.32 / 60 = $0.022/min; last verified Q1 2026). Azure also lists $0.66/hr for short-audio batch. Pricing varies by region, commitment tier, and product bundle, so always check their current page. We use Azure as a reference because teams already buy it, not because it is the only alternative.
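To check the comparison figures yourself, or to rerun them with a different rate, the conversion from an hourly list price to per-minute and annual cost is sketched below. The helper names are hypothetical, and the inputs are the list prices quoted above, which may be out of date by the time you read this.

```typescript
// Convert an hourly list price (USD) to a per-minute rate.
function perMinuteFromHourly(hourlyUsd: number): number {
  return hourlyUsd / 60;
}

// Annual cost for a fixed monthly volume of equal-length sessions.
function annualCostUsd(
  ratePerMinuteUsd: number,
  sessionSeconds: number,
  sessionsPerMonth: number
): number {
  const perSessionUsd = ratePerMinuteUsd * (sessionSeconds / 60);
  return perSessionUsd * sessionsPerMonth * 12;
}

// Azure real-time list price quoted above: $1.32/hr.
const azurePerMinute = perMinuteFromHourly(1.32);
console.log(azurePerMinute.toFixed(3));                            // "0.022"
console.log(annualCostUsd(azurePerMinute, 30, 100000).toFixed(0)); // "13200"

// Prosody published rate quoted above: $0.004/min.
console.log(annualCostUsd(0.004, 30, 100000).toFixed(0));          // "2400"
```

Swapping in the $0.66/hr short-audio batch price gives roughly $0.011/min, which is why the rate you compare against depends on which Azure product you would actually use.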
Questions
What is Prosody priced for?
Guided speech workflows: scripted assessment, pronunciation coaching, QA, and product flows where you already know the expected utterance.
Does my audio leave your infrastructure?
No. Audio is processed in-memory within Prosody infrastructure and discarded after scoring. No third-party API calls, no storage, no training on your data.
What audio formats are supported?
WAV, WebM, MP3, FLAC, and OGG. All audio is converted to 16kHz mono internally. See the API docs for details.
How is audio-time calculated?
You are billed for the duration of audio submitted, in 1-second increments, at $0.004 per audio-minute. A 30-second recording costs $0.002. A 10-second recording costs about $0.00067.
Does Developer pricing include streaming?
No. Developer pricing covers batch and single-request scoring. Streaming, warm capacity, and SLA-backed production support are handled through Growth plans.
Do you offer enterprise SLAs?
Yes. Contact us for volume pricing, uptime guarantees, dedicated support, or a dedicated deployment footprint.