Suboptimal Azure OpenAI Model Type
Ariel Lichterman
CER: Azure-AI-4754
Service Category: AI
Cloud Provider: Azure
Service Name: Azure Cognitive Services
Inefficiency Type: Outdated Model Selection
Explanation

Azure periodically releases newer OpenAI models that often offer better performance and cost characteristics than older generations. Workloads left on outdated model versions may consume more tokens to produce equivalent output, run slower, or miss out on quality improvements. Because on-demand usage is billed per token, staying on an older model can lead to unnecessary spending and reduced value. Aligning deployments with the most current, efficient model that suits the workload helps reduce spend and improve application performance.

Relevant Billing Model

On-demand Azure OpenAI deployments are billed per input and output token. Newer models often offer a lower cost per processed token, higher throughput, and reduced latency. A workload that stays on an older model may therefore pay a higher per-token rate, use more tokens for the same result, or both.
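
To make the per-token arithmetic concrete, the following is a minimal sketch that compares the monthly cost of the same token volume under two price points. The prices, token volumes, and the names `TokenPricing` and `monthly_cost` are illustrative assumptions, not actual Azure rates or APIs; substitute the figures from the current Azure OpenAI pricing page and your own usage data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Per-1,000-token prices in USD (placeholder values, not Azure rates)."""
    input_per_1k: float
    output_per_1k: float


def monthly_cost(pricing: TokenPricing, input_tokens: int, output_tokens: int) -> float:
    """Return the on-demand cost for a given volume of input and output tokens."""
    return (
        (input_tokens / 1000) * pricing.input_per_1k
        + (output_tokens / 1000) * pricing.output_per_1k
    )


# Hypothetical price points for an older and a newer model generation.
older = TokenPricing(input_per_1k=0.03, output_per_1k=0.06)
newer = TokenPricing(input_per_1k=0.005, output_per_1k=0.015)

# Hypothetical monthly token volume for one workload.
input_tokens = 50_000_000
output_tokens = 10_000_000

older_cost = monthly_cost(older, input_tokens, output_tokens)
newer_cost = monthly_cost(newer, input_tokens, output_tokens)
savings = older_cost - newer_cost

print(f"Older model: ${older_cost:,.2f}")
print(f"Newer model: ${newer_cost:,.2f}")
print(f"Potential monthly savings: ${savings:,.2f}")
```

This sketch holds token volume constant. If a newer model also needs fewer tokens to produce equivalent output, the gap would widen; if it needs more, the gap would narrow, so measured token counts should replace the constants where available.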

Detection
  • Review Azure OpenAI deployments to identify workloads using older or deprecated model versions
  • Assess token consumption patterns to determine whether newer models could achieve the same results more efficiently
  • Evaluate latency or performance issues that may be linked to older model behavior
  • Check Azure’s model lifecycle and release notes to confirm whether a newer recommended model family exists
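
The detection steps above can be sketched as a simple check of a deployment inventory against a lifecycle table. This is an illustration under stated assumptions: the inventory and the `LIFECYCLE` table are hand-written placeholders (in practice they would come from an export of your deployments and from Azure's model lifecycle documentation), and the model names, dates, and the `flag_outdated` helper are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    name: str     # deployment name chosen by the team
    model: str    # model name as deployed
    version: str  # model version as deployed


@dataclass(frozen=True)
class LifecycleEntry:
    retirement_date: Optional[date]  # None if no retirement is announced
    replacement: Optional[str]       # recommended successor, if any


# Hypothetical lifecycle table; populate it from Azure's published lifecycle notes.
LIFECYCLE = {
    ("example-model-a", "v1"): LifecycleEntry(date(2025, 1, 31), "example-model-b"),
    ("example-model-b", "v1"): LifecycleEntry(None, None),
}


def flag_outdated(deployments, lifecycle, today, warn_days=90):
    """Return (deployment name, reason) pairs for deployments that need review."""
    findings = []
    for d in deployments:
        entry = lifecycle.get((d.model, d.version))
        if entry is None:
            findings.append((d.name, "not in lifecycle table; check manually"))
            continue
        if entry.retirement_date is None:
            continue  # no retirement announced, nothing to flag
        days_left = (entry.retirement_date - today).days
        if days_left <= warn_days:
            status = "retired" if days_left < 0 else f"retires in {days_left} days"
            findings.append((d.name, f"{status}; consider {entry.replacement}"))
    return findings


inventory = [
    Deployment("chat-prod", "example-model-a", "v1"),
    Deployment("chat-staging", "example-model-b", "v1"),
    Deployment("legacy-batch", "example-model-z", "v0"),
]

for name, reason in flag_outdated(inventory, LIFECYCLE, today=date(2025, 1, 1)):
    print(f"{name}: {reason}")
```

A check like this only covers the lifecycle step. Token consumption and latency still need to be reviewed from usage metrics before deciding whether a newer model is the better fit.
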
Remediation
  • Migrate workloads to the latest suitable Azure OpenAI model that provides improved efficiency and performance
  • Establish a periodic review process to ensure deployed models are aligned with current Azure model offerings
  • Incorporate model lifecycle awareness into architecture standards so workloads are upgraded as new versions become available
  • Validate compatibility and output quality after migration to ensure a smooth transition to newer models
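
One way to approach the validation step is to replay a sample of representative prompts against both the current and the candidate model, then compare token usage and output acceptability before cutting over. The sketch below assumes those measurements have already been collected; the `SampleResult` fields, the sample values, and the 95% pass-rate threshold are hypothetical and should be adapted to your own acceptance criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleResult:
    prompt_id: str
    old_total_tokens: int        # input + output tokens on the current model
    new_total_tokens: int        # input + output tokens on the candidate model
    new_output_acceptable: bool  # outcome of your own quality check


def summarize(results, min_pass_rate=0.95):
    """Summarize a side-by-side test run and report whether it meets the thresholds."""
    if not results:
        raise ValueError("at least one sample result is required")
    old_total = sum(r.old_total_tokens for r in results)
    new_total = sum(r.new_total_tokens for r in results)
    pass_rate = sum(r.new_output_acceptable for r in results) / len(results)
    return {
        "pass_rate": pass_rate,
        # Negative values mean the candidate model used fewer tokens overall.
        "token_change": (new_total - old_total) / old_total,
        "ready": pass_rate >= min_pass_rate and new_total <= old_total,
    }


# Hypothetical measurements from a small replay of production-like prompts.
samples = [
    SampleResult("p1", old_total_tokens=1000, new_total_tokens=800, new_output_acceptable=True),
    SampleResult("p2", old_total_tokens=500, new_total_tokens=450, new_output_acceptable=True),
    SampleResult("p3", old_total_tokens=700, new_total_tokens=650, new_output_acceptable=False),
]

report = summarize(samples)
print(f"Pass rate: {report['pass_rate']:.0%}")
print(f"Token change: {report['token_change']:+.1%}")
print(f"Ready to migrate: {report['ready']}")
```

Running the same summary as part of the periodic review keeps the comparison repeatable each time a newer model becomes available.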