Missing Reserved PTUs for Steady-State Azure OpenAI Workloads
Ariel Lichterman
CER: Azure-AI-2318
Service Category: AI
Cloud Provider: Azure
Service Name: Azure Cognitive Services
Inefficiency Type: Unoptimized Pricing Model
Explanation

Many production Azure OpenAI workloads, such as chatbots, inference services, and retrieval-augmented generation (RAG) pipelines, consume provisioned throughput units (PTUs) at a consistent level throughout the day. Once usage stabilizes after initial experimentation, continuing to pay the on-demand (hourly) PTU rate results in ongoing, avoidable spend. These workloads are strong candidates for reserved PTUs, which provide identical performance guarantees at a substantially lower effective hourly rate. Migrating to reservations usually requires no architectural changes, and the savings take effect as soon as the reservation applies to the deployed capacity.

Relevant Billing Model

PTUs are billed hourly based on provisioned throughput, regardless of how many requests are actually served. On-demand PTUs are charged at the standard hourly rate, whereas reserved PTUs offer significant discounts (up to roughly 80%) when capacity is committed for a month or a year. Workloads that run continuously on on-demand PTUs therefore pay an avoidable premium.
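
To make the billing difference concrete, the following is a minimal sketch of the arithmetic, not a pricing calculator. The hourly rate, discount, and PTU count are hypothetical placeholders rather than Azure list prices, and the function names are invented for illustration. It assumes a reservation is paid for the whole month whether or not the capacity is deployed.

```python
HOURS_PER_MONTH = 730  # average month: 8760 hours / 12


def on_demand_monthly_cost(ptus, hourly_rate, hours_deployed=HOURS_PER_MONTH):
    """Cost of keeping `ptus` deployed at the hourly rate for `hours_deployed` hours."""
    return ptus * hourly_rate * hours_deployed


def reserved_monthly_cost(ptus, hourly_rate, discount):
    """Cost of reserving `ptus` for a full month at a fractional `discount` (0..1).

    Assumption: the reservation is charged for every hour of the month,
    whether or not the capacity is actually deployed.
    """
    return ptus * hourly_rate * (1 - discount) * HOURS_PER_MONTH


def break_even_hours(discount):
    """Deployed hours per month above which the reservation is cheaper."""
    return HOURS_PER_MONTH * (1 - discount)


if __name__ == "__main__":
    # Hypothetical inputs: 100 PTUs, $1.00 per PTU-hour, 60% reservation discount.
    ptus, rate, discount = 100, 1.00, 0.60
    print(f"on-demand: ${on_demand_monthly_cost(ptus, rate):,.0f}")
    print(f"reserved:  ${reserved_monthly_cost(ptus, rate, discount):,.0f}")
    print(f"break-even at {break_even_hours(discount):.0f} deployed hours per month")
```

With these placeholder inputs, a deployment that runs all month costs about $73,000 on demand versus about $29,200 reserved, and the reservation pays off once the deployment is up for more than about 292 hours (40% of the month). A workload that runs around the clock sits well past that point.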

Detection
  • Review PTU deployments supporting production workloads that operate continuously throughout the day
  • Assess whether throughput demand remains stable enough to justify reserved capacity
  • Identify deployments that have moved beyond experimentation but still use on-demand PTUs
  • Evaluate the cost difference between on-demand PTUs and reserved PTUs for these workloads
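
The stability checks above can be sketched as a simple screening function. This is an illustrative heuristic only: it assumes you have already exported hourly utilization samples (as fractions between 0 and 1) for a deployment from your monitoring system, and the thresholds and the function name `is_reservation_candidate` are hypothetical defaults that should be tuned to your own workloads.

```python
from statistics import mean, pstdev


def is_reservation_candidate(
    hourly_utilization,
    min_active_fraction=0.9,   # hypothetical: share of hours with real traffic
    max_cv=0.25,               # hypothetical: allowed coefficient of variation
    active_threshold=0.05,     # hypothetical: utilization counted as "in use"
):
    """Return True if utilization looks continuous and stable.

    `hourly_utilization` is a sequence of fractions (0.0 to 1.0), one per hour.
    """
    if not hourly_utilization:
        return False

    # Continuity: how many hours actually carried traffic?
    active_hours = sum(1 for u in hourly_utilization if u > active_threshold)
    active_fraction = active_hours / len(hourly_utilization)

    # Stability: spread of utilization relative to its mean.
    avg = mean(hourly_utilization)
    if avg == 0:
        return False
    cv = pstdev(hourly_utilization) / avg

    return active_fraction >= min_active_fraction and cv <= max_cv


if __name__ == "__main__":
    steady = [0.60, 0.65, 0.70, 0.62] * 6   # busy all 24 hours
    bursty = [0.0] * 20 + [0.9] * 4         # idle most of the day
    print(is_reservation_candidate(steady))
    print(is_reservation_candidate(bursty))
```

A deployment that passes this screen is worth a closer cost comparison; one that fails may still justify a smaller reservation that covers only its baseline.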
Remediation
  • Purchase monthly or annual reserved PTUs for workloads with sustained, predictable throughput needs
  • Establish governance criteria defining when production PTU deployments should transition to reservations
  • Periodically reassess workload stability to ensure PTU reservation commitments remain aligned with demand
  • Use cost modeling to evaluate reservation options as part of production readiness reviews
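
As one possible form of the cost modeling mentioned above, the sketch below sizes a reservation against an hourly PTU demand profile. It assumes that deployed PTUs beyond the reserved quantity are billed at the on-demand hourly rate and that reserved PTUs are paid for every hour in the profile. The rate, discount, demand values, and function names are all hypothetical.

```python
def blended_cost(hourly_ptu_demand, reserved_ptus, hourly_rate, discount):
    """Total cost over the profile for a given reservation size.

    Reserved PTUs are paid for every hour at the discounted rate;
    any demand above the reservation is paid at the on-demand rate.
    """
    hours = len(hourly_ptu_demand)
    reserved_cost = reserved_ptus * hourly_rate * (1 - discount) * hours
    overflow_ptu_hours = sum(max(d - reserved_ptus, 0) for d in hourly_ptu_demand)
    return reserved_cost + overflow_ptu_hours * hourly_rate


def best_reservation_size(hourly_ptu_demand, hourly_rate, discount, step=1):
    """Brute-force search for the reservation size with the lowest blended cost."""
    if not hourly_ptu_demand:
        return 0
    candidates = range(0, max(hourly_ptu_demand) + 1, step)
    return min(
        candidates,
        key=lambda r: blended_cost(hourly_ptu_demand, r, hourly_rate, discount),
    )


if __name__ == "__main__":
    # Hypothetical day: 100 PTUs for 20 hours, 150 PTUs during a 4-hour peak.
    demand = [100] * 20 + [150] * 4
    size = best_reservation_size(demand, hourly_rate=1.0, discount=0.5)
    print(size, blended_cost(demand, size, hourly_rate=1.0, discount=0.5))
```

In this example the cheapest option is to reserve the 100-PTU baseline and leave the short peak on the hourly rate, which illustrates why periodic reassessment matters: if the peak lengthens or the baseline shifts, the optimal commitment moves with it.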