Missing Reserved PTUs for Steady-State Azure OpenAI Workloads
AI
Cloud Provider
Azure
Service Name
Azure Cognitive Services
Inefficiency Type
Unoptimized Pricing Model

Many production Azure OpenAI workloads—such as chatbots, inference services, and retrieval-augmented generation (RAG) pipelines—use PTUs consistently throughout the day. When usage stabilizes after initial experimentation, continuing to rely on on-demand PTUs results in ongoing unnecessary spend. These workloads are strong candidates for reserved PTUs, which provide identical performance guarantees at a substantially reduced hourly rate. Migrating to reservations usually requires no architectural changes and delivers immediate cost savings.
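
The comparison described above can be sketched as simple arithmetic. This is a minimal illustration, not a pricing tool: the function name, the 730-hour month, and the rates in the example call are all placeholder assumptions, and real rates should be taken from current Azure pricing.

```python
def reservation_savings(ptus, hourly_rate, reserved_monthly_rate, hours_per_month=730.0):
    """Compare one month of hourly PTU billing with a monthly reservation.

    All rates are per PTU and are hypothetical inputs, not published prices.
    """
    on_demand = ptus * hourly_rate * hours_per_month
    reserved = ptus * reserved_monthly_rate
    return {
        "on_demand_monthly": on_demand,
        "reserved_monthly": reserved,
        "savings": on_demand - reserved,
        # Hours of deployment per month above which the reservation is cheaper.
        "break_even_hours": reserved_monthly_rate / hourly_rate,
    }


# Placeholder numbers only: 100 PTUs, 1.00 per PTU-hour, 260.00 per PTU-month.
result = reservation_savings(ptus=100, hourly_rate=1.0, reserved_monthly_rate=260.0)
print(result)
```

A deployment that runs around the clock sits far above the break-even hours, which is why steady-state workloads are the natural candidates.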

Suboptimal Azure OpenAI Model Type
AI
Cloud Provider
Azure
Service Name
Azure Cognitive Services
Inefficiency Type
Outdated Model Selection

Azure releases newer OpenAI models that provide better performance and cost characteristics compared to older generations. When workloads remain on outdated model versions, they may consume more tokens to produce equivalent output, run slower, or miss out on quality improvements. Because customers pay per token, using an older model can lead to unnecessary spending and reduced value. Aligning deployments to the most current, efficient model types helps reduce spend and improve application performance.

Using High-Cost Models for Low-Complexity Tasks
AI
Cloud Provider
Azure
Service Name
Azure Cognitive Services
Inefficiency Type
Overpowered Model Selection

Some workloads — such as text classification, keyword extraction, intent detection, routing, or lightweight summarization — do not require the capabilities of the most advanced model families. When high-cost models are used for these simple tasks, organizations pay elevated token rates for work that could be handled effectively by more efficient, lower-cost models. This mismatch typically arises from defaulting to a single model for all tasks or not periodically reviewing model usage patterns across applications.
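
One way to avoid defaulting to a single model is to route requests by task type. The sketch below assumes two hypothetical deployment names and a hand-maintained set of task labels; none of these identifiers come from Azure, and a real router would likely use measured quality and cost data.

```python
# Task labels considered simple enough for a lower-cost model (illustrative list).
SIMPLE_TASKS = {
    "classification",
    "keyword_extraction",
    "intent_detection",
    "routing",
    "short_summary",
}


def pick_deployment(task_type, small="small-model-deployment", large="large-model-deployment"):
    """Return the deployment name to call for a given task label.

    Both deployment names are hypothetical placeholders.
    """
    return small if task_type in SIMPLE_TASKS else large


print(pick_deployment("intent_detection"))
print(pick_deployment("multi_step_reasoning"))
```

Keeping the task list in one place also makes periodic reviews of model usage easier, since the mapping is explicit rather than scattered across applications.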

Provisioned Throughput OpenAI Deployment in Non-Production Environments
AI
Cloud Provider
Azure
Service Name
Azure Cognitive Services
Inefficiency Type
Overprovisioned Deployment Model

PTU deployments guarantee dedicated throughput and low latency, but they also require paying for reserved capacity at all times. In non-production environments—such as dev, test, QA, or experimentation—usage patterns are typically sporadic and unpredictable. Deploying PTUs in these environments leads to consistent baseline spend without corresponding value. On-demand deployments scale usage cost with actual consumption, making them more cost-efficient for variable workloads.
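
To make the contrast concrete, the following sketch compares an always-on PTU deployment with token-based billing for a low-traffic environment. The function names, the 730-hour month, and every price and volume in the example are assumptions chosen for illustration.

```python
def ptu_monthly_cost(ptus, hourly_rate, hours_per_month=730.0):
    """Cost of keeping a PTU deployment provisioned for the whole month."""
    return ptus * hourly_rate * hours_per_month


def token_monthly_cost(input_tokens, output_tokens, price_in_per_million, price_out_per_million):
    """Cost of the same traffic when billed per token consumed."""
    return (input_tokens / 1_000_000) * price_in_per_million + (
        output_tokens / 1_000_000
    ) * price_out_per_million


# Hypothetical dev environment: light traffic, placeholder prices.
provisioned = ptu_monthly_cost(ptus=15, hourly_rate=1.0)
consumption = token_monthly_cost(20_000_000, 5_000_000, 2.0, 8.0)
print(provisioned, consumption)
```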

Suboptimal Use of Serverless Compute for Azure SQL Database
Databases
Cloud Provider
Azure
Service Name
Azure SQL
Inefficiency Type
Incorrect Compute Tier Selection

Serverless is attractive for variable or idle workloads, but it can become more expensive than Provisioned compute when database activity is high for long portions of the day. As active time increases, per-second compute accumulation approaches—or exceeds—the fixed monthly cost of a Provisioned tier. This inefficiency arises when teams adopt Serverless as a default without assessing workload patterns. Databases with steady demand, predictable traffic, or long active periods often operate more cost-effectively on Provisioned compute. The economic break-even point depends on workload activity, and when that threshold is consistently exceeded, Provisioned becomes the more efficient option.
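
The break-even point mentioned above can be estimated with a short calculation. This sketch simplifies serverless billing to billed vCores multiplied by a per-vCore-second price and ignores storage; the function name and all input values are assumptions, not published rates.

```python
SECONDS_PER_HOUR = 3600


def break_even_active_hours(provisioned_monthly_cost, avg_billed_vcores, price_per_vcore_second):
    """Active hours per month at which serverless compute cost equals provisioned cost.

    Simplified model: serverless cost = billed vCores x seconds active x unit price.
    """
    serverless_cost_per_active_hour = (
        avg_billed_vcores * price_per_vcore_second * SECONDS_PER_HOUR
    )
    return provisioned_monthly_cost / serverless_cost_per_active_hour


# Placeholder inputs: 365.00 per month provisioned, 2 vCores billed on average.
hours = break_even_active_hours(365.0, 2, 0.0001)
print(round(hours, 1))
```

If a database is typically active for more hours per month than this estimate, the provisioned tier is likely the cheaper choice under these assumptions.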

Suboptimal Use of Provisioned Compute for Azure SQL Database
Databases
Cloud Provider
Azure
Service Name
Azure SQL
Inefficiency Type
Incorrect Compute Tier Selection

Databases deployed on Provisioned compute incur continuous hourly charges even when workload demand is low. For databases that are active only briefly within an hour, or for limited hours per month, Serverless can provide significantly lower cost because it bills only for active compute time. The economic break-even point between Provisioned and Serverless depends on workload activity patterns. If monthly active time falls *below* the conceptual break-even range, Serverless is more cost-effective. If active time regularly exceeds that range, Provisioned may be more appropriate. This inefficiency typically appears when teams default to Provisioned compute without evaluating workload behavior over time.
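
Evaluating workload behavior over time can start from an hourly usage trace. The sketch below assumes a list of average vCores used per hour (0 meaning the database was idle and paused), bills each active hour at no less than a configured minimum, and ignores storage. The function name, the billing simplification, and the prices are all assumptions.

```python
def recommend_tier(hourly_vcores, min_vcores, serverless_price_per_vcore_hour, provisioned_monthly_cost):
    """Return (recommended tier, estimated serverless compute cost) for a usage trace."""
    # Hours with zero usage are treated as paused and not billed for compute.
    billed_vcore_hours = sum(max(v, min_vcores) for v in hourly_vcores if v > 0)
    serverless_cost = billed_vcore_hours * serverless_price_per_vcore_hour
    tier = "serverless" if serverless_cost < provisioned_monthly_cost else "provisioned"
    return tier, serverless_cost


# Placeholder trace: active 30 hours out of a 730-hour month.
mostly_idle = [0.0] * 700 + [1.0] * 30
print(recommend_tier(mostly_idle, 0.5, 0.5, 300.0))
```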

Suboptimal Integration Runtime Region Selection in Azure Data Factory
Compute
Cloud Provider
Azure
Service Name
Azure Data Factory V2
Inefficiency Type
Cross-Region Data Movement

When Integration Runtimes are configured with the default “Auto Resolve” region setting, Azure may automatically provision them in a region different from the data sources or sinks. For example, an environment deployed in West Europe may run pipelines in East US. This causes unnecessary cross-region data transfer, increasing networking costs and pipeline latency. The inefficiency often goes unnoticed because data transfer costs are billed separately from pipeline compute charges.
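
A simple audit can flag runtimes that are not pinned to the data's region. The sketch below works on plain dictionaries shaped like exported Integration Runtime JSON definitions; the nesting under `properties.typeProperties.computeProperties.location`, the `AutoResolve` value, and the choice to treat a missing location as auto-resolved are assumptions to verify against your own exported definitions.

```python
def normalize_region(name):
    """Make 'West Europe' and 'westeurope' compare equal."""
    return name.replace(" ", "").lower()


def misplaced_runtimes(runtimes, data_region):
    """Return (name, location) for runtimes not pinned to data_region."""
    target = normalize_region(data_region)
    flagged = []
    for ir in runtimes:
        compute = (
            ir.get("properties", {})
            .get("typeProperties", {})
            .get("computeProperties", {})
        )
        # Assumption: a missing location is treated as auto-resolved.
        location = compute.get("location", "AutoResolve")
        if normalize_region(location) != target:
            flagged.append((ir.get("name", "<unnamed>"), location))
    return flagged


# Hypothetical definitions for illustration.
runtimes = [
    {"name": "ir-auto", "properties": {"typeProperties": {"computeProperties": {"location": "AutoResolve"}}}},
    {"name": "ir-weu", "properties": {"typeProperties": {"computeProperties": {"location": "West Europe"}}}},
]
print(misplaced_runtimes(runtimes, "westeurope"))
```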

Outdated AWS Glue Version for Python Jobs
Compute
Cloud Provider
AWS
Service Name
AWS Glue
Inefficiency Type
Outdated Runtime Version

Newer AWS Glue versions, such as Glue 5.0, include significant performance optimizations for **Python-based** ETL jobs, often reducing runtime by 10–60%. These improvements do not require any code changes, making version upgrades a simple and impactful optimization. When jobs remain on older runtimes such as Glue 3.0 or 4.0, they execute more slowly, consume more DPUs, and incur unnecessary cost. Glue 5.0 also offers more worker types (larger standard workers and memory-optimized workers), which can provide additional performance gains for some jobs. This inefficiency does not apply to Scala-based jobs, which do not benefit from the same performance uplift.
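
Finding candidate jobs can be sketched as a filter over job definitions. The function below takes dictionaries shaped like the `Jobs` entries returned by the Glue `get_jobs` API (`Name`, `GlueVersion`, `Command`, `DefaultArguments`); it assumes the language is indicated by the `--job-language` argument and defaults to Python when absent, and it flags jobs with no recorded version. The function names are hypothetical, and fetching the list is left out.

```python
def parse_version(version):
    """Turn '4.0' into (4, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def outdated_python_jobs(jobs, target="5.0"):
    """Return names of Python Spark jobs running below the target Glue version."""
    flagged = []
    for job in jobs:
        # Only Spark ETL and streaming jobs are considered here.
        if job.get("Command", {}).get("Name") not in ("glueetl", "gluestreaming"):
            continue
        language = job.get("DefaultArguments", {}).get("--job-language", "python")
        if language.lower() != "python":
            continue  # Scala jobs are out of scope for this check.
        version = job.get("GlueVersion")
        # Assumption: a job with no recorded version is flagged for review.
        if version is None or parse_version(version) < parse_version(target):
            flagged.append(job["Name"])
    return flagged


# Hypothetical job definitions for illustration.
jobs = [
    {"Name": "py-etl", "GlueVersion": "3.0", "Command": {"Name": "glueetl"}},
    {"Name": "scala-etl", "GlueVersion": "3.0", "Command": {"Name": "glueetl"},
     "DefaultArguments": {"--job-language": "scala"}},
    {"Name": "py-current", "GlueVersion": "5.0", "Command": {"Name": "glueetl"}},
]
print(outdated_python_jobs(jobs))
```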

Suboptimal Storage for Logs
Other
Cloud Provider
GCP
Service Name
GCP Cloud Logging
Inefficiency Type
Misaligned Storage Destination

Many organizations retain all logs in Cloud Logging’s standard storage, even when the data is rarely queried or required only for audit or compliance. Logging buckets are priced for active access and are not optimized for low-frequency retrieval, which results in unnecessary expense. Redirecting logs to BigQuery or Cloud Storage can provide better cost efficiency, particularly when coupled with lifecycle policies or table partitioning. Choosing the optimal storage destination based on access frequency and analytics needs is essential to control log retention costs.

Resources Generating Excessive INFO Logs
Other
Cloud Provider
GCP
Service Name
GCP Cloud Logging
Inefficiency Type
Excessive Log Verbosity

Some GCP services and workloads generate INFO-level logs at very high frequencies — for example, load balancers logging every HTTP request or GKE nodes logging system health messages. While valuable for debugging, these logs can flood Cloud Logging with non-critical data. Without log-level tuning or exclusion filters, organizations incur continuous ingestion charges for messages that are seldom analyzed. Over time, this behavior compounds into a persistent waste driver across large-scale environments.
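
An exclusion filter for this kind of traffic can be expressed in the Logging query language. The helper below only builds the filter string; the function name is hypothetical, the resource types in the example are illustrative choices, and the resulting filter should be reviewed before being attached to a sink, since excluded entries are not stored.

```python
def info_exclusion_filter(resource_types):
    """Build a filter matching INFO-and-below entries from the given resource types."""
    types = " OR ".join(f'resource.type="{t}"' for t in resource_types)
    return f"({types}) AND severity<=INFO"


# Example resource types: HTTP(S) load balancers and GKE nodes.
print(info_exclusion_filter(["http_load_balancer", "k8s_node"]))
```

Scoping the filter to named resource types keeps INFO logs from other services intact, which limits the risk of discarding data that is still used for debugging.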
