Amazon Bedrock Provisioned Throughput allows teams to reserve dedicated inference capacity for foundation models by purchasing model units with hourly billing under a commitment term. This capacity is billed continuously — whether or not any tokens are actually processed — making it a fixed cost that only pays off when sustained, high-volume token consumption justifies the premium over on-demand pricing. In practice, teams frequently purchase Provisioned Throughput to avoid on-demand throttling limits, but actual usage often falls well below the committed capacity, resulting in significant overspend compared to what on-demand pricing would have cost for the same workload.
The waste is compounded by the fact that Provisioned Throughput commitments cannot be canceled before the term expires — billing continues hourly until the commitment period ends. This means a team that overestimates its inference needs at the time of purchase is locked into paying for unused capacity for the full duration. The problem is especially common in early-stage AI deployments where usage patterns are not yet well understood, or in workloads with variable or unpredictable token volumes that are poorly suited to fixed-capacity reservations.
The cost impact can be substantial. A single model unit for even a moderately priced model can cost tens of thousands of dollars per month, and if actual token consumption would have cost only a fraction of that amount under on-demand pricing, the difference represents pure waste. Organizations running multiple Provisioned Throughput reservations across different models or environments can multiply this inefficiency significantly.
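To make the tradeoff concrete, here is a minimal sketch comparing a Provisioned Throughput commitment against the on-demand cost of the same token volume. The hourly rate and per-1K-token prices are illustrative placeholders, not actual Bedrock prices.

```python
# Sketch: fixed Provisioned Throughput cost vs. variable on-demand cost
# for the tokens actually processed. All rates are illustrative.

HOURS_PER_MONTH = 730

def provisioned_monthly_cost(model_units: int, hourly_rate_per_unit: float) -> float:
    """Fixed cost: billed every hour whether or not tokens are processed."""
    return model_units * hourly_rate_per_unit * HOURS_PER_MONTH

def on_demand_monthly_cost(input_tokens: int, output_tokens: int,
                           in_price_per_1k: float, out_price_per_1k: float) -> float:
    """Variable cost: billed only for tokens actually processed."""
    return (input_tokens / 1000) * in_price_per_1k + (output_tokens / 1000) * out_price_per_1k

# Example: one model unit at a hypothetical $39.60/hour, vs. a workload that
# actually processed 500M input and 100M output tokens over the month.
pt_cost = provisioned_monthly_cost(1, 39.60)
od_cost = on_demand_monthly_cost(500_000_000, 100_000_000, 0.003, 0.015)
overspend = pt_cost - od_cost   # positive value is pure waste
```

Under these assumed numbers the commitment costs roughly $28,900 per month while the same token volume would have cost about $3,000 on demand, illustrating how far below breakeven a lightly used reservation can sit.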
When organizations purchase third-party software through AWS Marketplace using annual subscriptions, they typically receive meaningful discounts compared to hourly pay-as-you-go (PAYG) pricing. However, when these annual subscriptions expire without active renewal, billing automatically reverts to the default hourly PAYG rate — which can be substantially higher. This is not a renewal at a higher rate; it is the absence of a renewal action that causes the subscription to lapse and the costlier pricing tier to take effect. Because the subscription simply expires silently, many teams do not realize they have lost their discounted rate until the cost increase appears in the next billing cycle.
This inefficiency is especially difficult to manage in enterprise environments where multiple Marketplace subscriptions are purchased at irregular intervals throughout the year, each with its own expiration date. Private offers — which provide custom-negotiated pricing — add further complexity because they cannot auto-renew by design; when a private offer expires, the customer either moves to the product's higher public pricing or loses the subscription entirely. The financial impact can be severe: in some cases, the licensing cost at PAYG rates can exceed the cost of the underlying compute infrastructure itself, as commonly seen with enterprise software such as SUSE Linux for SAP workloads.
Additionally, for AMI-based products, annual subscriptions are tied to specific instance types. Changing instance types during the subscription period causes billing to revert to hourly rates for the new type, creating another avenue for unintended cost increases even before the subscription formally expires.
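A lightweight way to catch the silent-expiry problem is to track expiration dates centrally and flag subscriptions before they lapse. A sketch, where the subscription records, field names, and 45-day lead time are all hypothetical; real data would come from AWS Marketplace or license-management reporting:

```python
# Sketch: flag Marketplace annual subscriptions whose expiration is near,
# before billing silently reverts to PAYG rates.
from datetime import date, timedelta

def expiring_soon(subscriptions, today: date, lead_days: int = 45):
    """Return subscriptions expiring within lead_days, soonest first."""
    cutoff = today + timedelta(days=lead_days)
    at_risk = [s for s in subscriptions if today <= s["expires"] <= cutoff]
    return sorted(at_risk, key=lambda s: s["expires"])

subs = [
    {"product": "suse-sap", "expires": date(2025, 7, 1), "annual_cost": 120_000},
    {"product": "vendor-b", "expires": date(2025, 12, 15), "annual_cost": 40_000},
]
flagged = expiring_soon(subs, today=date(2025, 6, 1))
# only "suse-sap" is flagged: 30 days out, inside the 45-day window
```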
Azure Function apps can persist long after the applications or workflows they supported have been retired — particularly in development, testing, and experimentation environments where cleanup is often overlooked. Even when no functions are deployed or no triggers are active, the underlying infrastructure dependencies continue to generate charges. The nature and severity of this waste depends heavily on the hosting plan type: function apps on Premium or Dedicated (App Service) plans incur continuous compute charges for allocated instances regardless of activity, while even Consumption plan function apps still require an associated storage account that accrues transaction and capacity costs from internal runtime operations.
Each function app is provisioned with a required Azure Storage account used for storing function code, managing triggers, and maintaining execution state. This storage account generates costs through read/write transactions and capacity usage even when the function app is completely idle — driven by the Functions runtime's internal health checks and state management. Additionally, if Application Insights was enabled for monitoring, telemetry data ingestion charges can accumulate silently in the background. Across an organization with dozens of abandoned function apps spanning multiple subscriptions, these individually modest charges compound into meaningful and entirely avoidable waste.
Azure Virtual Machine Scale Sets can operate in two modes: manual scaling with a fixed instance count, or autoscaling with dynamic instance counts that respond to demand. When a scale set is configured with manual scaling, it maintains the same number of VM instances at all times — regardless of whether those instances are actively processing workload. Every provisioned instance continues to incur per-second compute charges, meaning the organization pays for full capacity even during off-peak hours, weekends, or seasonal lulls when only a fraction of that capacity is needed.
This pattern is especially wasteful for workloads with variable demand — web applications with daily traffic cycles, batch processing jobs that run at specific intervals, or services with clear seasonal peaks. If a scale set is sized for peak demand but runs at that capacity around the clock, the gap between provisioned resources and actual utilization translates directly into unnecessary spend. Microsoft explicitly identifies autoscaling as a mechanism to reduce scale set costs by running only the number of instances required to meet current demand.
There are legitimate reasons to maintain fixed capacity — stateful applications that cannot tolerate dynamic instance changes, workloads with licensing constraints tied to specific instance counts, or scenarios where consistent performance without scale-up latency is critical. However, many scale sets running at fixed capacity do so simply because autoscaling was never configured, not because it was deliberately excluded. Identifying and addressing these cases represents a significant cost optimization opportunity.
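The gap described above can be quantified directly from a demand profile. A sketch with an illustrative per-instance rate and a hypothetical daily demand curve:

```python
# Sketch: cost of instance-hours provisioned above what demand required
# in a fixed-count scale set. Rate and demand profile are illustrative.

def fixed_capacity_waste(fixed_count: int, hourly_demand: list[int],
                         rate_per_instance_hour: float) -> float:
    """Sum the idle instance-hours and price them."""
    idle_hours = sum(max(fixed_count - d, 0) for d in hourly_demand)
    return idle_hours * rate_per_instance_hour

# A scale set pinned at 10 instances, where demand needs all 10 for
# 8 business hours but only 2 for the remaining 16 hours of the day.
demand = [10] * 8 + [2] * 16
daily_waste = fixed_capacity_waste(10, demand, rate_per_instance_hour=0.20)
# 16 hours x 8 idle instances x $0.20 = $25.60 per day
```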
Azure Firewall is available in three SKUs — Basic, Standard, and Premium — each designed for different security requirements and priced accordingly. The Premium SKU includes advanced threat protection capabilities such as TLS inspection, signature-based intrusion detection and prevention (IDPS), URL filtering, and web categories. These features are designed for highly sensitive and regulated environments, such as those processing payment card data or requiring PCI DSS compliance. However, many organizations deploy the Premium SKU by default — often during initial provisioning or as a precautionary measure — without actively configuring or requiring any of these Premium-exclusive features.
The cost impact is significant because the Premium SKU carries a substantially higher fixed hourly deployment charge compared to the Standard SKU — approximately 40% more — while the per-gigabyte data processing rate remains the same across both tiers. Since this hourly charge accrues continuously regardless of whether Premium features are enabled or traffic is flowing, every firewall instance running on the Premium SKU without leveraging its advanced capabilities represents a persistent and avoidable cost premium. In organizations with multiple firewall deployments across subscriptions and environments, this waste compounds quickly.
This pattern is especially common in non-production environments such as development and staging, where advanced threat protection features like TLS inspection and IDPS provide little practical value. Microsoft has recognized this as a frequent optimization opportunity and introduced a zero-downtime SKU change feature specifically to simplify the downgrade process from Premium to Standard.
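The downgrade case can be sized with simple arithmetic. A sketch applying the roughly 40% uplift mentioned above to a hypothetical Standard hourly rate:

```python
# Sketch: annualized cost of running Premium SKU firewalls where Standard
# would suffice. The Standard hourly rate is an illustrative placeholder.

HOURS_PER_YEAR = 8760

def sku_premium_waste(standard_hourly: float, uplift: float,
                      firewall_count: int) -> float:
    """Extra annual spend from the Premium uplift across deployments."""
    return standard_hourly * uplift * HOURS_PER_YEAR * firewall_count

# Hypothetical $1.25/hr Standard rate, 40% uplift, 3 non-production firewalls.
waste = sku_premium_waste(1.25, 0.40, 3)   # $13,140/year
```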
Azure Bastion incurs continuous hourly charges from the moment it is deployed until the resource is deleted — regardless of whether any connections are actively being made. This means a Bastion host sitting idle in a development or test environment generates the same cost as one actively serving remote sessions. Because there is no ability to pause or stop a Bastion deployment, the only way to eliminate charges is to delete the resource entirely.
This inefficiency is especially common in non-production environments where Bastion may have been provisioned for occasional troubleshooting or administrative access but then left running indefinitely. Teams often deploy Bastion during initial environment setup and forget about it, or assume it only costs money when sessions are active. Over time, these idle deployments quietly accumulate significant charges — particularly when deployed at the Basic, Standard, or Premium SKU tiers, which use dedicated infrastructure and carry meaningful hourly rates.
The cost impact compounds across an organization with multiple subscriptions or environments. A single idle Bastion host may seem modest in isolation, but dozens of forgotten deployments across dev, test, staging, and sandbox environments can represent a substantial and entirely avoidable expense.
Azure NetApp Files bills based on provisioned capacity pool size — not on the actual data stored within volumes. This means that when a capacity pool is provisioned at a size significantly larger than the sum of volume quotas allocated within it, the organization pays for stranded, unallocated capacity every hour. For example, a 10 TiB capacity pool with only 6 TiB of volume quotas allocated has 4 TiB of capacity that generates cost but serves no purpose.
This overprovisioning commonly occurs for several reasons. Capacity pools do not automatically shrink — since April 2021, pool sizing is entirely a manual customer responsibility. When volumes are deleted, the freed capacity remains in the pool unless an administrator explicitly resizes it downward. Additionally, with auto QoS pools, volume quotas directly determine throughput performance, which incentivizes teams to set larger quotas than their data requires, further inflating pool sizes. Over time, these dynamics create a growing gap between provisioned pool capacity and what is actually needed, resulting in persistent, avoidable charges that compound across multiple pools and regions.
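The stranded-capacity calculation is straightforward. A sketch using the 10 TiB example above and an illustrative per-TiB monthly rate:

```python
# Sketch: stranded capacity in an Azure NetApp Files capacity pool, i.e.
# provisioned pool size minus the volume quotas allocated within it.
# The per-TiB rate is an illustrative placeholder.

def stranded_capacity_cost(pool_tib: float, volume_quotas_tib: list[float],
                           rate_per_tib_month: float) -> float:
    """Monthly cost of pool capacity not allocated to any volume quota."""
    stranded = max(pool_tib - sum(volume_quotas_tib), 0.0)
    return stranded * rate_per_tib_month

# The example from the text: a 10 TiB pool with 6 TiB of quotas allocated.
monthly_waste = stranded_capacity_cost(10, [4, 2], rate_per_tib_month=300.0)
# 4 TiB stranded x $300/TiB-month = $1,200/month
```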
Azure Cache for Redis is billed at a fixed rate determined entirely by the provisioned tier and cache size — not by actual utilization. A cache instance that consumes only a fraction of its available memory and throughput incurs the same cost as one running at full capacity. This means that when a cache is sized larger than the workload demands, the unused memory and throughput headroom represent pure waste with no corresponding benefit.
Overprovisioning commonly occurs when teams size caches for anticipated peak loads that never materialize, or when workload patterns shift over time — such as after a migration, application refactor, or traffic decline — without a corresponding review of cache sizing. Because there is no option to stop or pause billing on a cache instance, and charges accrue continuously from the moment the cache is created until it is deleted, oversized caches quietly accumulate unnecessary costs around the clock.
An important constraint compounds this issue: scaling down between tiers is not supported. An organization that initially provisions a Premium-tier cache but later determines that a Standard tier would suffice cannot simply downgrade in place — it must create a new cache at the appropriate tier and migrate data. This friction often delays right-sizing efforts and prolongs overspend.
Azure Logic Apps can quietly accumulate costs even when no workflows are actively executing, but the mechanism differs significantly depending on the deployment model. In the Consumption (multitenant) plan, Logic Apps with polling triggers continue to generate billable trigger executions every time the trigger checks for events — even when no events are found and no workflow runs are initiated. A polling trigger configured to check every 30 seconds produces nearly 3,000 billable executions per day, all charged at the per-execution rate, regardless of whether any useful work is performed. Webhook or push-based triggers avoid this particular waste, but retained run history and storage operations can still accrue minor costs over time.
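The polling arithmetic can be sketched directly; the per-execution price below is an illustrative placeholder, not the actual Consumption-plan rate:

```python
# Sketch: billable trigger executions from a Consumption-plan polling
# trigger firing on a fixed interval, even when no events are found.

SECONDS_PER_DAY = 86_400

def polling_executions_per_day(interval_seconds: int) -> int:
    return SECONDS_PER_DAY // interval_seconds

def monthly_polling_cost(interval_seconds: int, price_per_execution: float,
                         days: int = 30) -> float:
    return polling_executions_per_day(interval_seconds) * days * price_per_execution

daily = polling_executions_per_day(30)      # 2,880 checks/day at 30 s
cost = monthly_polling_cost(30, 0.000025)   # 86,400 checks/month billed
```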
In the Standard (single-tenant) plan, the cost driver is fundamentally different. Customers pay for reserved compute capacity — vCPU and memory — on an hourly basis, whether or not any workflows execute. An idle Standard Logic App incurs the full hosting plan charges around the clock. Disabling a Standard Logic App prevents triggers from firing but does not stop the hosting plan billing; only deletion or consolidation of the underlying plan reduces costs.
These idle Logic Apps commonly arise after application decommissioning, migration projects, or proof-of-concept work that was never cleaned up. At enterprise scale, where dozens or hundreds of Logic Apps may exist across multiple environments, the cumulative waste from untriggered workflows and unused hosting plans can become substantial — particularly when the resources are spread across teams and subscriptions with no centralized review process.
In November 2025, AWS introduced an Archive storage class for private ECR repositories, marketed as a way to reduce storage costs for large volumes of rarely used container images. However, Archive storage pricing is identical to Standard storage pricing for the first 150 TB per month. Below this threshold, Archive provides no storage savings yet introduces a per-gigabyte retrieval charge, a retrieval delay of up to 20 minutes, and a 90-day minimum storage duration. Adopting the Archive storage class before meeting the 150 TB threshold means paying the same storage price but taking on additional fees and operational overhead.
This inefficiency is easy to miss because the AWS announcement emphasized cost savings for "large volumes" without quantifying "large" or prominently disclosing the retrieval charge or the minimum storage duration. In other AWS services, optional storage classes typically offer a storage price reduction from the first byte, in exchange for access penalties. With ECR, however, access penalties apply as described, but the storage price is unchanged for the first 150 TB, a container storage volume that few organizations achieve.
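A sketch of the below-threshold comparison, with illustrative placeholder rates: since storage pricing is identical under 150 TB, any retrieval activity makes Archive strictly more expensive.

```python
# Sketch: total monthly ECR cost on Archive vs. Standard below the 150 TB
# threshold, where the storage rate is the same but Archive adds a
# per-GB retrieval fee. All rates are illustrative.

def standard_cost(stored_gb: float, storage_rate: float) -> float:
    return stored_gb * storage_rate

def archive_cost(stored_gb: float, retrieved_gb: float,
                 storage_rate: float, retrieval_rate: float) -> float:
    # Below 150 TB the Archive storage rate equals the Standard rate,
    # so retrieval fees are a pure add-on.
    return stored_gb * storage_rate + retrieved_gb * retrieval_rate

std = standard_cost(50_000, 0.10)               # 50 TB stored
arc = archive_cost(50_000, 2_000, 0.10, 0.03)   # same storage + retrievals
extra = arc - std                               # Archive costs strictly more
```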
Organizations often use the Standard - Infrequent Access (Standard-IA) storage class based on documentation and code that predate 2021 updates to the Intelligent Tiering storage class. Intelligent Tiering became suitable as an initial S3 storage class even for objects that are small and/or will be deleted early. It also gained a heavily-discounted access tier. Older internal runbooks, lifecycle policies (including ones specified in infrastructure-as-code templates), scripts, programs, and public examples may still default to Standard-IA, inflating storage costs.
This inefficiency report compares Standard-IA with Intelligent Tiering; it is not intended to cover other storage classes. Note that S3 storage is billed per gibibyte (GiB, powers of 2) rather than per gigabyte (GB, powers of 10); because a GiB is about 7% larger than a GB, the distinction matters when estimating charges for small objects and for large volumes of storage alike.
Relative to the Standard storage class, the Standard-IA storage class offers a moderate, constant storage price discount but imposes a minimum billable object size of 128 KiB, a minimum storage duration of 30 days, and a per-GiB retrieval charge.
In contrast, AWS updated the Intelligent Tiering storage class in September 2021, eliminating the minimum storage duration and exempting small objects from a monthly per-object monitoring and automation charge. Intelligent Tiering has never had retrieval charges. In November 2021, AWS added the heavily-discounted Archive Instant Access tier.
For objects stored beyond a few months, Intelligent Tiering's progressive storage price discounts surpass Standard-IA's constant discount. Storage savings accumulate each month. Objects in the Intelligent Tiering storage class automatically move through progressively cheaper access tiers unless the objects are accessed. Intelligent Tiering also avoids Standard-IA's minimum billable object size and minimum storage duration penalties.
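The small-object penalty is easy to quantify. A sketch with illustrative per-GiB rates (not actual S3 prices), showing how the 128 KiB minimum inflates Standard-IA's billed footprint:

```python
# Sketch: Standard-IA's 128 KiB minimum billable object size vs.
# Intelligent Tiering, which bills actual size. Rates are illustrative.

KIB = 1024
GIB = 1024 ** 3

def ia_billed_bytes(object_bytes: int) -> int:
    """Standard-IA rounds every object up to at least 128 KiB."""
    return max(object_bytes, 128 * KIB)

def monthly_storage_cost(total_billed_bytes: int, rate_per_gib: float) -> float:
    return total_billed_bytes / GIB * rate_per_gib

# One million 16 KiB objects: IA bills 8x the actual data volume.
objects = [16 * KIB] * 1_000_000
ia_cost = monthly_storage_cost(sum(map(ia_billed_bytes, objects)), 0.0125)
it_cost = monthly_storage_cost(sum(objects), 0.023)  # frequent-access tier
# IA bills ~122 GiB despite the bucket holding only ~15 GiB of data,
# so it costs more here even though its per-GiB rate is lower.
```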
Azure App Service Plans define the compute resources allocated to web applications and are billed continuously based on their pricing tier — regardless of whether the hosted apps are actively serving traffic. In non-production environments such as development, testing, or staging, workloads typically follow predictable usage patterns aligned with business hours. When these plans remain provisioned at higher-cost tiers around the clock, organizations pay premium rates for compute capacity that sits idle during evenings, weekends, and holidays.
A common misconception is that stopping the apps within a plan will halt charges. In reality, the App Service Plan itself is the billing container, and charges accrue as long as the plan exists at a dedicated tier — even with all apps stopped or deleted. Simply stopping apps provides no cost relief. Instead, the plan's tier must be actively changed to a lower-cost option during periods of inactivity to realize savings. This temporal tier-switching pattern is distinct from scaling out (adjusting instance count) or right-sizing (choosing a permanently smaller tier), and is particularly effective for non-production workloads where brief interruptions during tier transitions are acceptable.
Because higher tiers such as Premium or Standard carry substantially higher per-hour rates than the Basic tier, leaving these plans unchanged during extended idle periods represents a significant and avoidable expense. Organizations with multiple non-production App Service Plans can accumulate substantial waste if this pattern is not addressed.
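A sketch of the tier-switching savings, assuming roughly 174 business hours per month and illustrative Premium and Basic hourly rates:

```python
# Sketch: monthly cost of a plan held at a Premium tier around the clock
# vs. switching to a Basic tier outside business hours. Rates illustrative.

def always_premium_cost(premium_rate: float, hours: int = 730) -> float:
    return premium_rate * hours

def tier_switched_cost(premium_rate: float, basic_rate: float,
                       business_hours: int, total_hours: int = 730) -> float:
    idle_hours = total_hours - business_hours
    return premium_rate * business_hours + basic_rate * idle_hours

# ~40 business hours/week, about 174 hours/month at Premium, rest at Basic.
flat = always_premium_cost(0.40)
switched = tier_switched_cost(0.40, 0.075, business_hours=174)
savings = flat - switched   # per plan, per month
```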
Amazon SQS does not charge for queue existence, message storage, or the number of queues — cost is driven entirely by API requests and data transfer. When consumers continue polling a queue that no longer receives messages, every ReceiveMessage call that returns empty is billed at the same rate as a call that returns data. These "empty receives" are the most common source of unexpected SQS charges and represent pure waste when the queue serves no active purpose.
This pattern is especially prevalent in serverless architectures where Lambda functions are configured as SQS event sources. In this setup, AWS automatically manages a fleet of pollers that continuously make ReceiveMessage calls to the queue — starting with multiple concurrent pollers and scaling based on message volume. Even on a completely idle queue, this automated polling generates a steady stream of empty receives around the clock. Because the polling is managed by the platform rather than application code, teams often overlook it entirely.
While the cost per individual idle queue may appear modest, the waste compounds quickly across organizations with many queues spanning development, staging, and production environments. The SQS free tier can mask the issue in small deployments, but organizations with dozens or hundreds of forgotten queues — each with active consumers or Lambda triggers — can accumulate meaningful unnecessary spend.
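The empty-receive volume can be estimated from the polling mechanics. A sketch assuming long polling at the 20-second maximum wait, a hypothetical fleet of five managed pollers, and an illustrative per-million-request price:

```python
# Sketch: empty ReceiveMessage volume from Lambda's managed pollers on an
# idle queue. Poller count and pricing are illustrative assumptions.

def empty_receives_per_month(pollers: int, poll_wait_seconds: int = 20,
                             days: int = 30) -> int:
    polls_per_poller = (86_400 // poll_wait_seconds) * days
    return pollers * polls_per_poller

def monthly_request_cost(requests: int, price_per_million: float = 0.40) -> float:
    return requests / 1_000_000 * price_per_million

receives = empty_receives_per_month(pollers=5)   # 648,000 empty receives
cost = monthly_request_cost(receives)            # per idle queue per month
```

The per-queue figure is small, consistent with the point above that the waste only becomes meaningful when multiplied across dozens or hundreds of forgotten queues.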
When organizations purchase AWS Savings Plans during periods of elevated AI inference demand — such as experimentation phases, feature launches, or early adoption surges — the committed hourly spend may significantly exceed what is needed once workloads stabilize. GPU-backed inference clusters running on high-cost instance families can drive substantial compute consumption during these peaks, and if that peak usage is used as the baseline for commitment sizing, the resulting Savings Plan will be oversized relative to steady-state demand. Because Savings Plans are billed as a fixed hourly dollar commitment for the entire term, any unused portion in a given hour is forfeited — it cannot be carried over, recouped, or applied to future hours.
This pattern is especially costly for AI inference workloads because GPU-accelerated instances carry significantly higher hourly rates than general-purpose compute, amplifying the financial impact of each underutilized hour. The problem compounds when inference workloads shift between instance families, regions, or deployment architectures over time — a common occurrence as teams optimize models, adopt newer hardware generations, or consolidate serving infrastructure. EC2 Instance Savings Plans, which are scoped to a specific instance family and region, are particularly vulnerable to these shifts. Critically, Savings Plans cannot be canceled, modified, or sold on any marketplace once purchased, making the commitment irrevocable for the full term with only a narrow return window available under limited conditions.
The net result is a sustained gap between committed spend and actual covered usage. Under sustained underutilization, the effective discount can shrink to nothing or even turn negative, undermining the financial benefit that justified the commitment in the first place.
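The erosion can be made precise. A sketch with illustrative dollar figures, showing how consuming only part of the commitment can wipe out, and even invert, the nominal discount:

```python
# Sketch: effective savings rate of a Savings Plan after forfeiting
# unused commitment. All dollar figures are illustrative.

def effective_savings_rate(commitment_per_hour: float,
                           covered_usage_per_hour: float,
                           discount: float) -> float:
    """Net savings vs. on-demand, where covered_usage_per_hour is the
    commitment actually consumed and discount is the nominal SP rate
    (e.g. 0.30 for 30%)."""
    on_demand_equivalent = covered_usage_per_hour / (1 - discount)
    net_savings = on_demand_equivalent - commitment_per_hour
    return net_savings / on_demand_equivalent

# A $100/hr commitment at a nominal 30% discount, but only $60/hr consumed:
rate = effective_savings_rate(100.0, 60.0, 0.30)
# the on-demand equivalent is ~$85.71/hr, so the plan now costs ~16.7%
# MORE than on demand would have; the nominal discount is fully erased
```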
When external Delta tables are dropped from Databricks Unity Catalog or the legacy Hive metastore, only the table metadata is removed — the underlying data files in cloud object storage (such as S3, ADLS, or GCS) remain untouched and continue to incur per-GB-month storage charges. This behavior is by design: external tables decouple metadata from data lifecycle management, meaning Databricks explicitly does not delete the underlying storage when an external table is dropped. The result is orphaned storage — files that no longer have any catalog reference, are not consumed by any downstream pipeline, and deliver no business value, yet continue to accumulate charges indefinitely.
This pattern is particularly prevalent in environments using medallion architecture (bronze/silver/gold layers), where tables are frequently recreated during pipeline evolution, schema experimentation, or migration between environments. Development and test workloads compound the problem, as teams routinely create and abandon external table references without cleaning up the associated storage. Unlike managed tables in Unity Catalog — which have a retention period with recovery capability before automatic deletion — external tables offer no such safety net. The orphaned storage is structurally invisible to standard cost dashboards because it appears as generic object storage charges, not as Databricks-specific line items. Over time, this silent accumulation can represent a meaningful share of an organization's total storage spend.
Importantly, Databricks VACUUM operations do not address this pattern. VACUUM cleans up old file versions within active Delta tables, but it cannot act on storage paths that have been completely disconnected from catalog metadata through external table drops. The only way to reclaim this storage is to manually identify and delete the orphaned files in cloud storage.
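Identification amounts to a set difference between storage paths and catalog-referenced locations. A sketch with stand-in path lists; in practice the inputs would come from listing cloud object storage and querying the catalog's table locations:

```python
# Sketch: find storage prefixes with no remaining catalog reference.
# The path sets below are hypothetical examples.

def orphaned_prefixes(storage_prefixes: set[str],
                      catalog_locations: set[str]) -> set[str]:
    """Prefixes present in object storage but referenced by no table."""
    return storage_prefixes - catalog_locations

storage = {
    "s3://lake/bronze/events/",
    "s3://lake/silver/orders/",
    "s3://lake/silver/orders_v1_backup/",  # dropped table, files left behind
}
catalog = {"s3://lake/bronze/events/", "s3://lake/silver/orders/"}
orphans = orphaned_prefixes(storage, catalog)
# candidates for manual review, then deletion once confirmed unused
```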
Custom metrics published to CloudWatch can be configured at two resolutions: standard (60-second intervals) or high resolution (1-second intervals). While both resolutions are priced identically for metric storage, the critical cost difference lies in the volume of API calls required to publish the data. A metric published every second generates 60 times more API calls than one published every 60 seconds. At scale — across hundreds or thousands of custom metrics in a microservices architecture — this multiplier translates into substantial and avoidable API charges that accumulate month over month.
This inefficiency commonly arises when teams default to high-resolution publishing without evaluating whether sub-minute granularity is actually needed for their monitoring use cases. Many workloads — including capacity planning, cost analysis, and non-critical service monitoring — function perfectly well with standard or even lower resolution. Compounding the issue, high-resolution metric data is only retained at its full 1-second granularity for three hours before being automatically aggregated to coarser intervals. Teams may therefore be paying a premium in API costs for resolution they cannot even query historically. Additionally, if alarms are configured to evaluate high-resolution metrics at sub-minute intervals, those alarms carry a higher per-alarm charge compared to standard-resolution alarms.
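The multiplier is simple arithmetic. A sketch assuming one PutMetricData call per datapoint (batching would reduce the count) and an illustrative per-1,000-call price:

```python
# Sketch: API call volume for publishing one custom metric at 1-second
# vs. 60-second resolution. Pricing is an illustrative placeholder.

def calls_per_month(publish_interval_seconds: int, days: int = 30) -> int:
    return (86_400 // publish_interval_seconds) * days

def api_cost(calls: int, price_per_1k_calls: float = 0.01) -> float:
    return calls / 1000 * price_per_1k_calls

hi_res = calls_per_month(1)     # 2,592,000 calls/month per metric
std_res = calls_per_month(60)   # 43,200 calls/month per metric
assert hi_res == 60 * std_res   # the 60x multiplier described above

delta_cost = api_cost(hi_res - std_res)   # extra cost per metric, per month
# multiplied across hundreds of metrics, this dominates the bill
```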
This inefficiency occurs when Azure Load Balancers remain provisioned after the backend workloads they supported have been scaled down, stopped, or decommissioned. This is common in non-production environments where virtual machines are shut down outside business hours, but the associated load balancers are left in place. Even when no meaningful traffic is flowing, the load balancer continues to incur base charges, resulting in ongoing cost without delivering value.
This inefficiency occurs when BigQuery slot reservations are sized for peak or anticipated demand but are not adjusted as workloads evolve. When actual query concurrency or complexity is lower than expected, a portion of the reserved slots remains idle. Because slot reservations are billed independently of usage, underutilized capacity results in sustained waste, even as workloads outside the reservation may continue to accrue on-demand query charges.
This commonly happens when reservations are created during migrations, one-time analytical initiatives, or early scaling phases and are not revisited once usage stabilizes.
This inefficiency occurs when an RDS database instance is deleted but its manual snapshots or retained backups remain. Unlike automated backups tied to a live instance, these backups persist independently and continue generating storage costs despite no longer supporting any active database. This is distinct from excessive retention on active databases and typically arises from incomplete cleanup during decommissioning.
This inefficiency occurs when analysts use SELECT * (reading more columns than needed) and/or rely on LIMIT as a cost-control mechanism. In BigQuery, projecting excess columns increases the amount of data read and can materially raise query cost, particularly on wide tables and frequently-run queries. Separately, applying LIMIT to a query does not inherently reduce bytes processed for non-clustered tables; it mainly caps the result set returned. The "LIMIT saves cost" assumption is only sometimes true on clustered tables, where BigQuery may be able to stop scanning earlier once enough clustered blocks have been read.
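The projection effect can be modeled from per-column sizes, since on-demand cost scales with bytes scanned in the referenced columns. A sketch with hypothetical column sizes and an illustrative per-TiB rate:

```python
# Sketch: bytes-scanned cost for SELECT * vs. projecting only the columns
# needed, on a columnar table. Sizes and rate are illustrative.

TIB = 2 ** 40

def scan_cost(column_bytes: dict[str, int], selected: list[str],
              rate_per_tib: float = 6.25) -> float:
    scanned = sum(column_bytes[c] for c in selected)
    return scanned / TIB * rate_per_tib

# Hypothetical wide table dominated by one large column.
table = {"id": 8 * 10**9, "payload": 900 * 10**9,
         "ts": 8 * 10**9, "user": 40 * 10**9}
star = scan_cost(table, list(table))      # SELECT * reads every column
narrow = scan_cost(table, ["id", "ts"])   # projecting two columns
# Adding LIMIT 10 to the SELECT * query on a non-clustered table would
# still bill like `star`: the full columns are scanned either way.
```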
This inefficiency occurs when an App Service Plan is sized larger than required for the applications it hosts. Plans are often provisioned conservatively to handle anticipated peak demand and are not revisited after workloads stabilize. Because pricing is tied to the plan’s SKU rather than real-time usage, oversized plans continue to incur higher costs even when CPU and memory utilization remain consistently low.
This inefficiency occurs when an Azure Virtual WAN hub is provisioned with more capacity than required to support real network traffic. Because hub costs scale with the number of configured scale units, overprovisioned hubs continue to incur higher charges even when traffic levels remain consistently low. This commonly happens when hubs are sized for peak or anticipated demand that never materializes, or when traffic patterns change over time without corresponding capacity adjustments.
This inefficiency occurs when a function has steady, high-volume traffic (or predictable load) but continues running on default Lambda pricing, where costs scale with execution duration. Lambda Managed Instances runs Lambda on EC2 capacity managed by the service and supports multiple concurrent invocations within the same execution environment, which can materially improve utilization for suitable workloads (often IO-heavy services). For these steady-state patterns, shifting from duration-based billing to instance-based billing (and potentially leveraging EC2 pricing options such as Savings Plans or Reserved Instances) can reduce total cost while keeping the Lambda programming model. Savings are workload-dependent and not guaranteed.
This inefficiency occurs when Azure SQL Managed Instances continue running on legacy General Purpose or Business Critical tiers despite the availability of the next-gen General Purpose tier. The newer tier enables more granular scaling of vCPU, memory, and storage, allowing workloads to better match actual resource needs. In many cases, workloads running on Business Critical—or overprovisioned legacy General Purpose—do not require the premium performance or architecture of those tiers and could achieve equivalent outcomes at lower cost by moving to next-gen General Purpose.
This inefficiency occurs when backup data remains in a Recovery Services Vault after the original protected resource has been deleted. These orphaned backups continue to consume storage and generate cost despite no longer supporting an active workload. In addition, long-retained backups that are rarely accessed are often kept in higher-cost tiers, increasing storage spend without providing additional value.
This inefficiency occurs when Savings Plans are purchased within the final days of a calendar month, reducing or eliminating the ability to reverse the purchase if errors are discovered. Because the refund window is constrained to both a 7-day period and the same month, late-month purchases materially limit correction options. This increases the risk of locking in misaligned commitments (e.g., incorrect scope, amount, or term), which can lead to sustained underutilization and unnecessary long-term spend.
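The dual constraint can be expressed as simple date math: the effective deadline is the earlier of the purchase date plus 7 days and the end of the purchase month.

```python
# Sketch: effective refund deadline for a Savings Plan purchase under the
# dual 7-day / same-calendar-month constraint described above.
from datetime import date, timedelta

def refund_deadline(purchase: date) -> date:
    seven_days = purchase + timedelta(days=7)
    # last day of the purchase month
    next_month = date(purchase.year + purchase.month // 12,
                      purchase.month % 12 + 1, 1)
    month_end = next_month - timedelta(days=1)
    return min(seven_days, month_end)

# Mid-month purchase: the full 7 days are available to reverse a mistake.
assert refund_deadline(date(2025, 3, 10)) == date(2025, 3, 17)
# Purchase on the 29th: the window collapses to just 2 days.
assert refund_deadline(date(2025, 3, 29)) == date(2025, 3, 31)
```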
This inefficiency occurs when licensed Azure DevOps users remain assigned after individuals leave the organization or stop using the platform. These inactive users continue to generate recurring per-user charges despite providing no ongoing value, leading to unnecessary spend over time.
This inefficiency occurs when teams assume AWS Marketplace SaaS purchases will contribute toward EDP or PPA commitments, but the SaaS product is not eligible under AWS’s “Deployed on AWS” standard. As of May 1, 2025, AWS Marketplace allows SaaS products regardless of where they are hosted, while separately identifying products that qualify for commitment drawdown via a visible “Deployed on AWS” badge.
Eligibility is determined based on the invoice date, not the contract signing date. As a result, Marketplace SaaS contracts signed prior to the policy change may still generate invoices after May 1, 2025 that no longer qualify for commitment retirement. This can lead to Marketplace spend appearing on AWS invoices without reducing commitments, creating false confidence in commitment progress and increasing the risk of end-of-term shortfalls.
This inefficiency occurs when workloads are constrained to run only on Spot-based capacity with no viable path to standard nodes when Spot capacity is reclaimed or unavailable. While Spot reduces unit cost, rigid dependence can create hidden costs by requiring standby standard capacity elsewhere, delaying deployments, or increasing operational intervention to keep environments usable. GKE explicitly recommends mixing Spot and standard node pools for continuity when Spot is unavailable.
This inefficiency occurs when Kubernetes Jobs or CronJobs running on EKS Fargate leave completed or failed pod objects in the cluster indefinitely. Although the workload execution has finished, AWS keeps the underlying Fargate microVM running to allow log inspection and final status checks. As a result, vCPU, memory, and networking resources remain allocated and billable until the pod object is explicitly deleted.
Over time, large numbers of stale Job pods can generate direct compute charges as well as consume ENIs and IP addresses, leading to both unnecessary spend and capacity pressure. This pattern is common in batch-processing and scheduled workloads that lack automated cleanup.
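One common fix is Kubernetes' TTL-after-finished controller: setting `ttlSecondsAfterFinished` on a Job lets the cluster delete the finished Job and its pods automatically, which releases the backing Fargate microVM. A minimal sketch of such a manifest, built as a Python dict with illustrative names:

```python
# Sketch: a Kubernetes Job manifest that lets the TTL-after-finished
# controller delete the Job (and its pods) shortly after completion,
# so the Fargate microVM stops billing. Name, image, and TTL are illustrative.
def batch_job_manifest(name: str, image: str, ttl_seconds: int = 300) -> dict:
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Delete the finished Job object (and its pods) after this delay
            "ttlSecondsAfterFinished": ttl_seconds,
            "template": {
                "spec": {
                    "containers": [{"name": name, "image": image}],
                    "restartPolicy": "Never",
                }
            },
        },
    }

job = batch_job_manifest("nightly-report", "my-registry/report:latest")
```

The same field works for Jobs created by CronJobs, which covers the scheduled-workload case described above.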
This inefficiency occurs when ElastiCache clusters continue running engine versions that have moved into extended support. While the service remains functional, AWS charges an ongoing premium for extended support that provides no added performance or capability. These costs are typically avoidable by upgrading to a version within standard support.
This inefficiency occurs when workloads with predictable, long-running compute usage continue to run entirely on on-demand pricing instead of leveraging Committed Use Discounts. For stable environments, such as production services or continuously running batch workloads, failing to apply CUDs results in materially higher compute spend without any operational benefit. The inefficiency is driven by pricing choice, not resource overuse.
This inefficiency occurs when backup data persists longer than intended due to misaligned or outdated retention policies. It often arises when retention requirements change over time, but older recovery points are not evaluated or cleaned up accordingly. In some cases, manually configured backups or legacy policies remain in place even after operational or compliance needs have been reduced.
As a result, backup storage continues to grow and incur cost without delivering additional recovery value.
This inefficiency occurs when Amazon Aurora database clusters are intentionally stopped to avoid compute costs but are automatically restarted by the service after the maximum allowed stop period (seven days). Once restarted, the database instances begin accruing instance-hour charges even if the database is not needed.
Because Aurora does not provide native lifecycle controls to keep clusters stopped indefinitely, this behavior can result in recurring, unintended compute spend—particularly in non-production, seasonal, or infrequently accessed environments where clusters are stopped and forgotten.
This inefficiency occurs when automated Cloud SQL backups are retained longer than required by recovery objectives or governance needs. Because backups accumulate over the retention window (and can grow quickly for high-change databases), excessive retention drives ongoing backup storage charges without improving practical recoverability.
This inefficiency occurs when production and non-production applications are hosted within the same App Service Plan. Production workloads often require higher availability, performance, or scaling characteristics, driving the plan toward larger or higher-cost SKUs. When non-production workloads share that plan, they inherit the higher cost structure even though their availability and performance requirements are typically much lower, resulting in unnecessary spend.
This inefficiency occurs when pod resource requests—often inflated by sidecar containers—push total memory or CPU just over a Fargate sizing boundary. Because Fargate adds mandatory system overhead and only supports fixed resource combinations, small incremental increases can force a pod into a much larger billing tier. This results in materially higher cost for marginal additional resource needs, especially in workloads that run continuously or at scale.
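The tier-jump effect can be illustrated with a simplified rounding function. The tier table and ~256 MiB overhead below are a reduced sketch of Fargate's fixed vCPU/memory combinations, assumptions to verify against current AWS documentation rather than an exact pricing model:

```python
# Sketch: how a small request increase can push a pod into a larger Fargate
# billing tier. TIERS is a simplified subset of Fargate's fixed combinations;
# OVERHEAD_GB approximates the memory EKS Fargate reserves for its own
# components. Both are assumptions for illustration.
TIERS = [  # (vCPU, max GB available at that vCPU)
    (0.25, 2), (0.5, 4), (1, 8), (2, 16), (4, 30),
]
OVERHEAD_GB = 0.25

def billed_tier(request_vcpu: float, request_gb: float) -> tuple:
    """Return the smallest sketched tier that fits the request plus overhead."""
    need_gb = request_gb + OVERHEAD_GB
    for vcpu, max_gb in TIERS:
        if vcpu >= request_vcpu and max_gb >= need_gb:
            return (vcpu, max_gb)
    raise ValueError("request exceeds the largest tier in this sketch")

# An app container whose sidecar nudges total memory just past the boundary:
print(billed_tier(0.5, 3.5))  # still fits the 0.5 vCPU tier
print(billed_tier(0.5, 3.9))  # overhead pushes it into the 1 vCPU tier
```

A 0.4 GB increase in requests roughly doubles the billed vCPU tier here, which is the marginal-cost cliff the paragraph describes.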
This inefficiency occurs when Provisioned Concurrency is enabled for Lambda functions that do not require consistently low latency or steady traffic. In such cases, reserved capacity remains allocated and billed during idle periods, creating ongoing cost without proportional performance or business benefit. This is distinct from standard Lambda execution charges, which are purely usage-based.
This inefficiency occurs when a protected resource (such as a virtual machine, database, or file share) is decommissioned without explicitly stopping backup protection. In these cases, Azure Backup continues to retain existing recovery points in the vault until the retention policy expires. Although the source resource no longer exists, backup storage remains allocated and billable, resulting in unnecessary ongoing costs.
This pattern is common when infrastructure is deleted outside of a formal decommissioning process or when backup ownership is unclear.
This inefficiency occurs when an Azure Savings Plan is scoped too narrowly relative to where eligible compute usage actually runs. When usage is spread across multiple subscriptions or fluctuates significantly (for example, development and test workloads that are frequently stopped and started), a narrowly scoped Savings Plan may not consistently find enough eligible usage to consume the full commitment. As a result, part of the committed hourly spend goes unused while other eligible workloads outside the scope continue to incur on-demand charges.
Azure supports broader scoping options—such as Management Group or Shared scope—that allow the commitment to be applied across a larger pool of eligible compute. Selecting an overly restrictive scope can therefore directly drive underutilization, even when sufficient total usage exists across the tenant.
Teams often start custom-model deployments with large architectures, full-precision weights, or older model versions carried over from training environments. When these models transition to Bedrock’s managed inference environment, the compute footprint (especially GPU class) becomes a major cost driver. Common inefficiencies include:

* Deploying outdated custom models despite newer, more efficient variants being available,
* Running full-size models for tasks that could be served by distilled or quantized versions,
* Using accelerators overpowered for the workload’s latency requirements, or
* Relying on default model artifacts instead of optimizing for inference.

Because Bedrock Custom Models bill continuously for the backing compute, even small inefficiencies in model design or versioning translate into substantial ongoing cost.
Generative workloads that produce long outputs—such as detailed summaries, document rewrites, or multi-paragraph chat completions—require extended model runtime. Because output tokens are billed per token, and typically at a higher rate than input tokens, unnecessarily verbose responses directly inflate inference cost; on provisioned capacity they also occupy throughput for longer per request, reducing the effective capacity of each model unit.
Embedding-based retrieval enables semantic matching even when keywords differ. But many Databricks workloads—catalog lookups, metadata search, deterministic classification, or fixed-rule routing—do not require semantic understanding. When embeddings are used anyway, teams incur DBU cost for embedding generation, additional storage for vector columns or indexes, and more expensive similarity-search compute. This often stems from defaulting to a RAG approach rather than evaluating whether a simpler retrieval mechanism would perform equally well.
Embeddings enable semantic retrieval by capturing the meaning of text, while keyword search returns results based on exact or lexical matches. Many Azure workloads—FAQ search, routing, deterministic classification, or structured lookups—achieve the same or better accuracy using simple keyword or metadata filtering. When embeddings are used for these uncomplicated tasks, organizations pay for token-based embedding generation, vector storage, and compute-heavy similarity search without receiving meaningful quality improvements. This inefficiency often occurs when RAG is used automatically rather than intentionally.
Embeddings enable semantic similarity search by representing text as high-dimensional vectors. Keyword search, however, returns results based on lexical matches and is often sufficient for simple retrieval tasks such as FAQ matching, deterministic filtering, metadata lookup, or rule-based routing. When embeddings are used for these low-complexity scenarios, organizations pay for compute to generate embeddings, storage for vector columns, and compute-heavy cosine similarity searches — without improving accuracy or user experience. In Snowflake, this can also increase warehouse load and query runtime.
Embeddings enable semantic search by converting text into vectors that capture meaning. Keyword or metadata search performs exact or simple lexical matches. Many workloads—FAQ lookup, helpdesk routing, short product lookups, or rule-based filtering—do not benefit from semantic search. When embeddings are used anyway, organizations pay for embedding generation, vector storage, and similarity search without gaining accuracy or relevance improvements. This often happens when teams adopt RAG “by default” for problems that do not require semantic understanding.
Embeddings allow semantic search — they map text into vectors so the system can find content with similar meaning, even if the keywords don’t match. Keyword or metadata search, by contrast, looks for exact terms or simple filters. Many workloads (FAQ lookups, short product searches, rule-based routing) do not need semantic understanding and perform just as well with basic keyword logic. When teams use embeddings for these simple tasks, they pay for embedding generation, vector storage, and similarity search without gaining meaningful accuracy or functionality.
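For these low-complexity cases, a plain keyword or tag match is often sufficient and involves no model calls at all. A minimal sketch, with illustrative FAQ entries and fields:

```python
# Sketch: a keyword + metadata lookup covering FAQ-style retrieval without
# embedding generation, vector storage, or similarity search.
# The FAQ entries and tag sets are illustrative.
FAQS = [
    {"q": "How do I reset my password?", "tags": {"password", "reset", "login"}},
    {"q": "Where can I download my invoice?", "tags": {"invoice", "billing", "download"}},
]

def keyword_lookup(query: str, faqs=FAQS):
    words = set(query.lower().split())
    # Rank entries by tag overlap; no vectors or model invocations involved.
    scored = [(len(words & f["tags"]), f) for f in faqs]
    best_score, best = max(scored, key=lambda s: s[0])
    return best["q"] if best_score > 0 else None

print(keyword_lookup("I need to reset my password"))
```

If this kind of lookup answers the bulk of traffic, embeddings can be reserved for the genuinely semantic remainder.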
Verbose logging is useful during development, but many teams forget to disable it before deploying to production. Generative AI workloads often include long prompts, large multi-paragraph outputs, embedding vectors, and structured metadata. When these full payloads are logged on high-throughput production endpoints, Cloud Logging costs can quickly exceed the cost of the model inference itself. This inefficiency commonly arises when development-phase logging settings carry into production environments without review.
Vertex AI Prediction Endpoints support autoscaling but require customers to specify a **minimum number of replicas**. These replicas stay online at all times to serve incoming traffic. When the minimum value is set too high for real traffic levels, the system maintains idle capacity that still incurs hourly charges. This inefficiency commonly arises when teams:

* Use default replica settings during initial deployment,
* Intentionally overprovision “just in case” without revisiting the configuration, or
* Copy settings from production into lower-traffic dev or QA environments.

Over time, unused replica hours accumulate into significant, silent spend.
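The silent spend is straightforward to estimate: replicas above actual demand, multiplied by the hourly rate and hours in a month. The replica counts and rate below are illustrative assumptions, not Vertex AI list prices:

```python
# Sketch: estimating the monthly cost of an inflated min-replica setting.
# The replica counts and the $2.00/replica-hour rate are illustrative.
def idle_replica_cost(min_replicas: int, needed_replicas: int,
                      hourly_rate: float, hours: int = 730) -> float:
    """Monthly spend on replicas held above actual demand."""
    idle = max(min_replicas - needed_replicas, 0)
    return idle * hourly_rate * hours

# Three always-on replicas where traffic needs one:
print(idle_replica_cost(3, 1, 2.00))  # 2 idle replicas * $2 * 730 h
```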
A large portion of real-world AI workloads involve repetitive or deterministic inference patterns—such as classification labels, routing logic, metadata extraction, FAQ responses, keyword detection, or summarization of static content. Vertex AI does **not** provide native inference caching, so applications that repeatedly send identical prompts to the model incur avoidable cost. When no caching mechanism is implemented, workloads repeatedly invoke the model and consume tokens even though the output is predictable. Over time, especially at scale, these repetitive token charges accumulate into significant waste. This inefficiency is common in early-stage deployments where teams optimize for correctness rather than cost.
Vertex AI model families evolve rapidly. New model versions (e.g., transitions within the Gemini family) frequently introduce improvements in efficiency, quality, and capability. When workloads continue using older, legacy, or deprecated models, they may consume more tokens, produce lower-quality results, or experience higher latency than necessary. Because generative workloads often scale quickly, even small efficiency gaps between generations can materially increase token consumption and cost. Teams that do not actively track model updates, or that set model types once and never revisit them, often miss opportunities to improve performance-per-dollar by upgrading to the most current supported model.
Bedrock’s model catalog evolves quickly as providers release new versions—such as successive Claude model families or updated Amazon Titan models. These newer models frequently offer improved performance, more efficient reasoning, better context handling, and higher-quality outputs compared to older generations. When workloads continue using older or deprecated models, they may require **more tokens**, experience **slower inference**, or miss out on accuracy improvements available in successor models. Because Bedrock bills per token or per inference unit, these inefficiencies can increase cost without adding value. Ensuring workloads align with the most suitable current-generation model improves both performance and cost-effectiveness.
Vertex AI workloads often include low-complexity tasks such as classification, routing, keyword extraction, metadata parsing, document triage, or summarization of short and simple text. These operations do **not** require the advanced multimodal reasoning or long-context capabilities of larger Gemini model tiers. When organizations default to a single high-end model (such as Gemini Ultra or Pro) across all applications, they incur elevated token costs for work that could be served efficiently by **Gemini Flash** or smaller task-optimized variants. This mismatch is a common pattern in early deployments where model selection is driven by convenience rather than workload-specific requirements. Over time, this creates unnecessary spend without delivering measurable value.
Many Bedrock workloads involve low-complexity tasks such as tagging, classification, routing, entity extraction, keyword detection, document triage, or lightweight summarization. These tasks **do not require** the advanced reasoning or generative capabilities of higher-cost models such as Claude 3 Opus or comparable premium models. When organizations default to a high-end model across all applications—or fail to periodically reassess model selection—they pay elevated costs for work that could be performed effectively by smaller, lower-cost models such as Claude Haiku or other compact model families. This inefficiency becomes more pronounced in high-volume, repetitive workloads where token counts scale quickly.
Bedrock workloads commonly include repetitive inference patterns—such as classification results, prompt templates generating deterministic outputs, FAQ responses, document tagging, and other predictable or low-variability tasks. Without a caching strategy (API-layer cache, application cache, or hash-based prompt cache), these workloads repeatedly invoke the model and incur token costs for answers that do not change. Because Bedrock does not offer native inference caching, customers must implement caching externally. When no cache layer exists, cost increases linearly with repeated calls, even though responses remain constant. This issue appears most often when teams treat all workloads as dynamic or generative, rather than separating deterministic tasks from open-ended ones.
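A hash-based prompt cache of the kind mentioned above can be sketched in a few lines. Here `invoke_model` is a placeholder for the real client call (not an actual SDK function), and the stub below only demonstrates the caching behavior:

```python
# Sketch: a minimal hash-based prompt cache in front of a model call.
# `invoke_model` is a stand-in for the real Bedrock client invocation.
import hashlib

_cache: dict[str, str] = {}

def cached_invoke(prompt: str, invoke_model) -> str:
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key not in _cache:              # tokens are only spent on a cache miss
        _cache[key] = invoke_model(prompt)
    return _cache[key]

calls = 0
def fake_model(prompt: str) -> str:   # stub standing in for the model API
    global calls
    calls += 1
    return f"label-for:{prompt}"

cached_invoke("classify: invoice", fake_model)
cached_invoke("classify: invoice", fake_model)
print(calls)  # 1: the second call was served from the cache
```

A production version would add a TTL and eviction policy, and should only cache prompts known to be deterministic.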
A large share of production AI workloads include repetitive or static requests—such as classification labels, routing decisions, FAQ responses, metadata extraction, or deterministic prompt templates. Without a caching layer, every repeated request is sent to the model, incurring full token charges and increasing latency. Azure OpenAI does not provide native caching, so teams must implement caching at the application or API gateway layer. When caching is absent, workloads repeatedly spend tokens for identical outputs, creating avoidable cost. This inefficiency often arises when teams optimize only for correctness—not cost—and default to calling the model for every invocation regardless of whether the response is predictable.
Many Azure OpenAI workloads—such as reporting pipelines, marketing workflows, batch inference jobs, or time-bound customer interactions—only run during specific periods. When PTUs remain fully provisioned 24/7, organizations incur continuous fixed cost even during extended idle time. Although Azure does not offer native PTU scheduling, teams can use automation to provision and deprovision PTUs based on predictable cycles. This allows them to retain performance during peak windows while reducing cost during low-activity periods.
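Such automation usually reduces to a schedule function that a timer-triggered job reconciles against via the management API. The peak window and capacity numbers below are illustrative assumptions:

```python
# Sketch: a schedule-driven PTU target for automation to reconcile
# (e.g. a timer-triggered job calling the Azure management API).
# Peak hours and PTU counts are illustrative assumptions.
def target_ptus(hour_utc: int, weekday: int,
                peak=(8, 20), peak_ptus=100, off_ptus=0) -> int:
    """Desired PTU capacity for a given hour (0-23) and weekday (0=Mon)."""
    business_day = weekday < 5
    in_peak = peak[0] <= hour_utc < peak[1]
    return peak_ptus if (business_day and in_peak) else off_ptus

print(target_ptus(10, 2))   # midweek, mid-morning: full capacity
print(target_ptus(23, 5))   # Saturday night: deprovisioned
```

During deprovisioned windows, traffic can fall back to PAYG deployments so availability is preserved while the fixed cost is removed.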
Development, testing, QA, and sandbox environments rarely have the steady, predictable traffic patterns needed to justify PTU deployments. These workloads often run intermittently, with lower throughput and shorter usage windows. When PTUs are assigned to such environments, the fixed hourly billing generates continuous cost with little utilization. Switching non-production workloads to PAYG aligns cost with actual usage and eliminates the overhead of managing PTU quota in low-stakes environments.
When organizations size PTU capacity based on peak expectations or early traffic projections, they often end up with more throughput than regularly required. If real-world usage plateaus below provisioned levels, a portion of the PTU capacity remains idle but still generates full spend each hour. This is especially common shortly after production launch or during adoption of newer GPT-4 class models, where early conservative sizing leads to long-term over-allocation. Rightsizing PTUs based on observed usage patterns ensures that capacity matches actual demand.
AWS frequently updates Bedrock with improved foundation models, offering higher quality and better cost efficiency. When workloads remain tied to older model versions, token consumption may increase, latency may be higher, and output quality may be lower. Using outdated models leads to avoidable operational costs, particularly for applications with consistent or high-volume inference activity. Regular modernization ensures applications take advantage of new model optimizations and pricing improvements.
Many production Azure OpenAI workloads—such as chatbots, inference services, and retrieval-augmented generation (RAG) pipelines—use PTUs consistently throughout the day. When usage stabilizes after initial experimentation, continuing to rely on on-demand PTUs results in ongoing unnecessary spend. These workloads are strong candidates for reserved PTUs, which provide identical performance guarantees at a substantially reduced hourly rate. Migrating to reservations usually requires no architectural changes and delivers immediate cost savings.
Azure releases newer OpenAI models that provide better performance and cost characteristics compared to older generations. When workloads remain on outdated model versions, they may consume more tokens to produce equivalent output, run slower, or miss out on quality improvements. Because customers pay per token, using an older model can lead to unnecessary spending and reduced value. Aligning deployments to the most current, efficient model types helps reduce spend and improve application performance.
Some workloads — such as text classification, keyword extraction, intent detection, routing, or lightweight summarization — do not require the capabilities of the most advanced model families. When high-cost models are used for these simple tasks, organizations pay elevated token rates for work that could be handled effectively by more efficient, lower-cost models. This mismatch typically arises from defaulting to a single model for all tasks or not periodically reviewing model usage patterns across applications.
PTU deployments guarantee dedicated throughput and low latency, but they also require paying for reserved capacity at all times. In non-production environments—such as dev, test, QA, or experimentation—usage patterns are typically sporadic and unpredictable. Deploying PTUs in these environments leads to consistent baseline spend without corresponding value. On-demand deployments scale usage cost with actual consumption, making them more cost-efficient for variable workloads.
Serverless is attractive for variable or idle workloads, but it can become more expensive than Provisioned compute when database activity is high for long portions of the day. As active time increases, per-second compute accumulation approaches—or exceeds—the fixed monthly cost of a Provisioned tier. This inefficiency arises when teams adopt Serverless as a default without assessing workload patterns. Databases with steady demand, predictable traffic, or long active periods often operate more cost-effectively on Provisioned compute. The economic break-even point depends on workload activity, and when that threshold is consistently exceeded, Provisioned becomes the more efficient option.
Databases deployed on Provisioned compute incur continuous hourly charges even when workload demand is low. For databases that are active only briefly within an hour, or for limited hours per month, Serverless can provide significantly lower cost because it bills only for active compute time. The economic break-even point between Provisioned and Serverless depends on workload activity patterns. If monthly active time falls *below* the conceptual break-even range, Serverless is more cost-effective. If active time regularly exceeds that range, Provisioned may be more appropriate. This inefficiency typically appears when teams default to Provisioned compute without evaluating workload behavior over time.
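The break-even comparison reduces to simple arithmetic. The per-vCore-second serverless rate and provisioned monthly price below are illustrative assumptions, not current list pricing:

```python
# Sketch: locating the Serverless vs Provisioned break-even point.
# The rates used here are illustrative, not actual cloud pricing.
def serverless_cost(active_vcore_seconds: float,
                    rate_per_vcore_second: float) -> float:
    """Monthly serverless compute cost for the given active time."""
    return active_vcore_seconds * rate_per_vcore_second

def break_even_hours(provisioned_monthly: float, vcores: int,
                     rate_per_vcore_second: float) -> float:
    """Active hours per month above which Provisioned becomes cheaper."""
    per_active_hour = vcores * 3600 * rate_per_vcore_second
    return provisioned_monthly / per_active_hour

# Example: a $700/month provisioned 4-vCore tier vs $0.000134 per vCore-second
print(round(break_even_hours(700.0, 4, 0.000134)))  # active-hour threshold
```

A database active well below the printed threshold each month favors Serverless; one consistently above it favors Provisioned, matching the two entries above.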
When Integration Runtimes are configured with the default “Auto Resolve” region setting, Azure may automatically provision them in a region different from the data sources or sinks. For example, an environment deployed in West Europe may run pipelines in East US. This causes unnecessary cross-region data transfer, increasing networking costs and pipeline latency. The inefficiency often goes unnoticed because data transfer costs are billed separately from pipeline compute charges.
Newer AWS Glue versions—such as Glue 5.0—include significant performance optimizations for **Python-based** ETL jobs, often reducing runtime by 10–60%. These improvements do not require any code changes, making version upgrades a simple and impactful optimization. When jobs remain on older runtimes such as Glue 3.0 or 4.0, they execute more slowly, consume more DPUs, and incur unnecessary cost. Additionally, Glue 5.0 offers more worker types (larger standard workers and memory-optimized workers) that can provide additional performance gains for some jobs. This inefficiency does not apply to Scala-based jobs, which do not benefit from the same performance uplift.
Many organizations retain all logs in Cloud Logging’s standard storage, even when the data is rarely queried or required only for audit or compliance. Logging buckets are priced for active access and are not optimized for low-frequency retrieval, which results in unnecessary expense. Redirecting logs to BigQuery or Cloud Storage can provide better cost efficiency, particularly when coupled with lifecycle policies or table partitioning. Choosing the optimal storage destination based on access frequency and analytics needs is essential to control log retention costs.
Some GCP services and workloads generate INFO-level logs at very high frequencies — for example, load balancers logging every HTTP request or GKE nodes logging system health messages. While valuable for debugging, these logs can flood Cloud Logging with non-critical data. Without log-level tuning or exclusion filters, organizations incur continuous ingestion charges for messages that are seldom analyzed. Over time, this behavior compounds into a persistent waste driver across large-scale environments.
Non-production environments frequently generate INFO-level logs that capture expected system behavior or routine API calls. While useful for troubleshooting in development, they rarely need to be retained. Allowing all INFO logs to be ingested and stored in Logging buckets across dev or staging environments can lead to disproportionate ingestion and storage costs. This inefficiency often persists because log routing and severity filters are not differentiated between production and non-production projects.
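Exclusion filters use the standard Cloud Logging query syntax. A small helper like the one below (resource types are illustrative) builds a filter string that could be attached to a bucket or sink via `gcloud logging` or the API:

```python
# Sketch: building a Cloud Logging exclusion filter that drops INFO-and-below
# entries for noisy resource types. Uses standard Logging query syntax;
# the resource types listed are illustrative examples.
def info_exclusion_filter(resource_types) -> str:
    resources = " OR ".join(f'resource.type="{r}"' for r in resource_types)
    return f"severity<=INFO AND ({resources})"

f = info_exclusion_filter(["http_load_balancer", "k8s_node"])
print(f)
```

Excluded entries are dropped before ingestion billing, which is why severity filtering differentiated by environment is one of the most direct levers here.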
Duplicate log storage occurs when multiple sinks capture the same log data — for example, organization-wide sinks exporting all logs to Cloud Storage and project-level sinks doing the same. This redundancy results in paying twice (or more) for identical data. It often arises from decentralized logging configurations, inherited policies, or unclear ownership between teams. The problem is compounded when logs are routed both to Cloud Logging and external observability platforms, creating parallel ingestion streams and double billing.
Azure Hybrid Benefit allows organizations to apply existing SQL Server licenses with Software Assurance or qualifying subscriptions to Azure SQL Databases. When this configuration is missed or not enforced, workloads continue to incur license-inclusive costs despite license ownership. This oversight often occurs in environments where licensing governance is decentralized or when databases are provisioned manually without applying existing entitlements. Across multiple databases or elastic pools, these duplicated license costs can accumulate substantially over time.
Many organizations purchase Software Assurance or subscription-based Windows and SQL Server licenses that entitle them to use Azure Hybrid Benefit. However, if the setting is not applied on eligible resources, Azure continues charging pay-as-you-go rates that already include Microsoft licensing costs. This oversight results in paying twice—once for the on-premises license and once for the built-in Azure license. The inefficiency often goes unnoticed because licensing configurations are not centrally validated or enforced. Enabling Azure Hybrid Benefit can reduce costs by up to 40% for Windows Server VMs and up to 30% for SQL Databases.
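The duplicated spend can be estimated per resource. The monthly PAYG rates below are illustrative, and the discount factors use the upper bounds cited above (40% for Windows VMs, 30% for SQL):

```python
# Sketch: estimating monthly savings left on the table when Azure Hybrid
# Benefit is not applied. Rates are illustrative; discounts are upper bounds.
def fleet_savings(resources) -> float:
    """resources: iterable of (payg_monthly_cost, discount_fraction)."""
    return sum(monthly * discount for monthly, discount in resources)

# A $1,000/month Windows VM (40%) plus an $800/month SQL Database (30%):
print(fleet_savings([(1000.0, 0.40), (800.0, 0.30)]))
```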
When a Dataflow pipeline fails—often due to dependency issues, misconfigurations, or data format mismatches—its worker instances may remain active temporarily until the service terminates them. In some cases, misconfigured jobs, stuck retries, or delayed monitoring can cause workers to continue running for extended periods. These idle workers consume vCPU, memory, and storage resources without performing useful work. The inefficiency is compounded in large or high-frequency batch environments where repeated failures can leave many orphaned workers running concurrently.
Aurora Serverless is designed for workloads with unpredictable or intermittent usage patterns that benefit from automatic scaling. However, when used for databases with constant load, the service’s elasticity offers little advantage and adds cost overhead. Serverless instances run continuously in steady workloads, resulting in persistent ACU billing at a higher effective rate than a provisioned cluster of similar size. In addition, Serverless configurations cannot use Reserved Instances or Savings Plans, missing out on predictable cost reductions available to provisioned Aurora.
In restricted or isolated network environments, Dataflow workers often cannot reach the public internet to download runtime dependencies. To operate securely, organizations build custom worker images that bundle required libraries. However, these images must be manually updated to keep dependencies current. As upstream packages evolve, outdated internal images can cause pipeline errors, execution delays, or total job failures. Each failure wastes worker runtime, increases troubleshooting time, and leads to rebuild cycles that inflate operational and compute costs.
Customers often delay upgrading Aurora clusters due to compatibility concerns or operational overhead. However, when older versions such as MySQL 5.7 or PostgreSQL 11 move into Extended Support, AWS applies automatic surcharges to ensure continued patching. These charges affect all clusters regardless of usage, creating unnecessary cost exposure across both production and non-production environments. For large Aurora fleets, the incremental expense can become significant if upgrades are not proactively managed.
Many organizations continue to run outdated database engines, such as MySQL 5.7 or PostgreSQL 11, beyond their support windows. Beginning in 2024, AWS automatically enrolls these into Extended Support to maintain security updates, adding incremental charges that scale with vCPU count. These costs often appear suddenly, impacting both production and non-production environments. For development and test databases in particular, the charges may outweigh their value, leading to hidden inefficiencies if not addressed promptly.
Many teams publish new Lambda versions frequently (e.g., through CI/CD pipelines) but do not clean up old ones. When SnapStart is enabled, each of these versions retains an active snapshot in the cache, generating ongoing charges. Over time, accumulated unused versions can significantly increase spend without delivering any business value. This problem compounds in environments with high deployment velocity or many functions.
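A cleanup policy reduces to a pure selection rule: keep the most recent published versions and delete the rest. The sketch below (with an assumed version-list format) shows only the selection logic; actual deletion would go through the AWS SDK and should also skip versions referenced by aliases:

```python
# Sketch: choosing stale Lambda versions to delete so SnapStart stops
# retaining snapshots for them. Deletion itself (via the AWS SDK) and
# alias checks are out of scope here; the input format is an assumption.
def versions_to_delete(version_numbers, keep_latest: int = 3,
                       protected=("$LATEST",)):
    numeric = sorted((v for v in version_numbers if v not in protected),
                     key=int)
    return numeric[:-keep_latest] if keep_latest else numeric

stale = versions_to_delete(["$LATEST", "1", "2", "3", "4", "5"], keep_latest=2)
print(stale)  # everything except the two most recent published versions
```

Running such a rule as a scheduled step in the CI/CD pipeline keeps snapshot charges proportional to versions actually in use.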
SnapStart reduces cold-start latency, but when configured inefficiently, it can increase costs. High-traffic workloads can trigger frequent snapshot restorations, multiplying costs. Slow initialization code inflates the Init phase, which is now billed at the full rate. Suppressed-init conditions, where functions initialize without enhanced resources, can add further inefficiency if memory or timeout settings are misaligned. Together, these factors can cause SnapStart to deliver higher spend without proportional benefit.
When S3 versioning is enabled but no lifecycle rules are defined for non-current objects, outdated versions accumulate indefinitely. These non-current versions are rarely accessed but continue to incur storage charges. Over time, this leads to significant hidden costs, particularly in buckets with frequent object updates or automated data pipelines. Proper lifecycle management is required to limit or expire obsolete versions.
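A minimal lifecycle rule for this case can be expressed in the dict shape that boto3's `put_bucket_lifecycle_configuration` accepts. The bucket name, rule ID, and 30-day window below are illustrative:

```python
# Sketch: an S3 lifecycle configuration that expires non-current object
# versions after a set number of days, in the shape boto3 expects.
# Rule ID and day count are illustrative choices.
def noncurrent_expiry_rule(days: int = 30) -> dict:
    return {
        "Rules": [
            {
                "ID": "expire-noncurrent-versions",
                "Status": "Enabled",
                "Filter": {},  # empty filter applies bucket-wide
                "NoncurrentVersionExpiration": {"NoncurrentDays": days},
            }
        ]
    }

config = noncurrent_expiry_rule(30)
# Applied with boto3 (not executed here):
# s3.put_bucket_lifecycle_configuration(Bucket="my-bucket",
#                                       LifecycleConfiguration=config)
```

A variant of the same rule can also retain only the newest N non-current versions via `NewerNoncurrentVersions`, if a short rollback window is still wanted.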
Many organizations default to storing all EFS data in the Standard class, regardless of how frequently data is accessed. This results in inefficient spend for workloads with significant portions of data that are rarely read. EFS IA and Archive tiers offer lower-cost alternatives for data with low or near-zero access, while Intelligent Tiering can automate placement decisions. Failing to leverage these options wastes storage spend and reduces cost efficiency.
Spot Instances are designed to be short-lived, with frequent interruptions and replacements. When AWS Config continuously records every lifecycle change for these instances, it produces a large number of configuration item records (CIRs). This drives costs significantly higher without delivering meaningful compliance insight, since Spot Instances are typically stateless and non-critical. In environments with heavy Spot usage, Config costs can balloon and exceed the value of tracking these transient resources.
Athena generates a new S3 object for every query result, regardless of whether the output is needed long term. Over time, this leads to uncontrolled growth of the output bucket, especially in environments with repetitive queries such as cost and usage reporting. Many of these files are transient and provide little value once the query result is consumed. Without lifecycle rules, organizations pay for unnecessary storage and create clutter in S3.
By default, AWS Config is enabled in continuous recording mode. While this may be justified for production workloads where detailed auditability is critical, it is rarely necessary in non-production environments. Frequent changes in development or testing environments — such as redeploying Lambda functions, ECS tasks, or EC2 instances — generate large volumes of CIRs. This results in disproportionately high costs with minimal benefit to governance or compliance. Switching non-production environments to daily recording reduces CIR volume significantly while maintaining sufficient visibility for tracking changes.
Many organizations keep Datadog’s default log retention settings without evaluating business requirements. Defaults may extend retention far beyond what is useful for troubleshooting, performance monitoring, or compliance. This leads to unnecessary storage and indexing costs, particularly in non-production environments or for logs with limited value after a short period. By adjusting retention per project, environment, or service, organizations can reduce spend while still meeting compliance and operational needs.
AWS Graviton processors are designed to deliver better price-performance than comparable Intel-based instances, often reducing cost by 20–30% at equivalent workload performance. OpenSearch domains running on older Intel-based families consume more spend without providing additional capability. Since Graviton-powered instance types support the same OpenSearch features and deliver equal or better performance, continuing to run on Intel-based clusters represents unnecessary inefficiency.
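The migration is typically a like-for-like instance-type swap. As a sketch, a small mapping from common Intel-based OpenSearch instance types to their Graviton counterparts (the table below covers only a few illustrative sizes):

```python
# Illustrative Intel -> Graviton mapping for OpenSearch data nodes.
INTEL_TO_GRAVITON = {
    "m5.large.search": "m6g.large.search",
    "r5.xlarge.search": "r6g.xlarge.search",
    "c5.large.search": "c6g.large.search",
}

def graviton_equivalent(instance_type: str) -> str:
    """Return the Graviton equivalent, or the original type if no mapping exists."""
    return INTEL_TO_GRAVITON.get(instance_type, instance_type)

# The swap itself would be applied via update_domain_config (not executed here):
# import boto3
# boto3.client("opensearch").update_domain_config(
#     DomainName="my-domain",  # placeholder
#     ClusterConfig={"InstanceType": graviton_equivalent("m5.large.search")},
# )
```

A blue/green deployment is triggered for the instance-type change, so scheduling it during a low-traffic window is prudent.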
When multiple tasks within a workflow are executed on separate job clusters — despite having similar compute requirements — organizations incur unnecessary overhead. Each cluster must initialize independently, adding latency and cost. This results in inefficient resource usage, especially for workflows that could reuse the same cluster across tasks. Consolidating tasks onto a single job cluster where feasible reduces start-up time and avoids duplicative compute charges.
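In Databricks, this consolidation is done by declaring one entry under `job_clusters` and pointing each task at it via `job_cluster_key`. The sketch below builds a Jobs API payload under assumed values: the job name, notebook paths, Spark version, and node type are all placeholders.

```python
def multi_task_job(shared_key: str = "shared-etl-cluster") -> dict:
    """Jobs API payload where all tasks reuse one shared job cluster."""
    return {
        "name": "nightly-etl",  # placeholder job name
        "job_clusters": [
            {
                "job_cluster_key": shared_key,
                "new_cluster": {
                    "spark_version": "14.3.x-scala2.12",  # placeholder runtime
                    "node_type_id": "m5.xlarge",          # placeholder node type
                    "num_workers": 2,
                },
            }
        ],
        "tasks": [
            {
                "task_key": "extract",
                "job_cluster_key": shared_key,  # reuse, not a new cluster
                "notebook_task": {"notebook_path": "/etl/extract"},
            },
            {
                "task_key": "transform",
                "depends_on": [{"task_key": "extract"}],
                "job_cluster_key": shared_key,  # same cluster, no second spin-up
                "notebook_task": {"notebook_path": "/etl/transform"},
            },
        ],
    }
```

Because both tasks reference the same `job_cluster_key`, the cluster initializes once and stays warm across the dependency chain instead of paying start-up latency and cost per task.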
Changing a Google Cloud billing account can unintentionally break existing Marketplace subscriptions. If entitlements are tied to the original billing account, the subscription may fail or become invalid, prompting teams to make urgent, direct purchases of the same services, often at higher list or on-demand rates. These emergency purchases bypass previously negotiated Marketplace pricing and can result in significantly higher short-term costs. The issue is common during reorganizations, mergers, or changes to billing hierarchy and is often not discovered until after costs have spiked.
When Marketplace contracts or subscriptions expire or change without visibility, Azure may automatically continue billing at higher on-demand or list prices. These lapses often go unnoticed due to lack of proactive tracking, ownership, or renewal alerts, resulting in substantial cost increases. The issue is amplified when contract records are siloed across procurement, finance, and engineering teams, with no centralized mechanism to monitor entitlement status or reconcile expected versus actual billing.
In many organizations, AWS Marketplace purchases are lumped into a single consolidated billing line without visibility into individual vendors. This lack of transparency makes it difficult to identify which Marketplace spend is eligible to count toward the EDP cap. As a result, teams may either overspend on direct AWS services to fulfill their commitment unnecessarily or miss the opportunity to right-size new commitments based on existing Marketplace purchases. In both cases, the absence of vendor-level detail hinders optimization.
Azure Marketplace offers two types of listings: transactable and non-transactable. Only transactable purchases contribute toward a customer’s MACC commitment. However, many teams mistakenly assume that all Marketplace spend counts, leading to missed opportunities to burn down commitments and risking budget inefficiencies. Selecting a non-transactable listing, when a transactable equivalent exists, can result in identical services being acquired at higher effective cost due to lost discounts. This confusion is exacerbated when procurement and engineering teams do not coordinate or consult Microsoft's guidance.
Many organizations mistakenly believe that all AWS Marketplace spend automatically contributes to their EDP commitment. In reality, only certain Marketplace transactions (those involving EDP-eligible vendors and transactable SKUs) count toward the commitment, and often only a portion of the transaction value qualifies. This misunderstanding can lead to double counting: forecasting on the assumption that both native AWS usage and Marketplace purchases will fully draw down the commitment. If those assumptions prove incorrect, the organization risks falling short of its EDP threshold, incurring penalties or losing expected discounts.
Organizations frequently inherit continuous recording by default (e.g., through landing zones) without validating the business need for per-change granularity across all resource types and environments. In change-heavy accounts (ephemeral resources, CI/CD churn, autoscaling), continuous mode drives very high CIR volumes with limited additional operational value. Selecting periodic recording for lower-risk resource types and/or non-production environments can maintain necessary visibility while reducing CIR volume and cost. Recorder settings are account/region scoped, so you can apply continuous in production where required and periodic elsewhere.
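AWS Config expresses this split through `recordingModeOverrides`: a continuous default with daily overrides for specific high-churn resource types. A hedged sketch of the recorder payload follows; the override list is an example, and the set of resource types eligible for overrides should be checked against the Config documentation.

```python
def recorder_with_overrides(name: str = "default") -> dict:
    """ConfigurationRecorder payload: continuous by default, daily for churn-heavy types."""
    return {
        "name": name,
        "recordingGroup": {"allSupported": True},
        "recordingMode": {
            "recordingFrequency": "CONTINUOUS",
            "recordingModeOverrides": [
                {
                    "description": "High-churn networking resources recorded daily",
                    # Example override target; verify eligibility for your types.
                    "resourceTypes": ["AWS::EC2::NetworkInterface"],
                    "recordingFrequency": "DAILY",
                }
            ],
        },
    }

# Applied per account/region with put_configuration_recorder, so production can
# keep pure continuous recording while other accounts use overrides.
```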
AWS Fargate supports both x86 and Graviton2 (ARM64) CPU architectures, but by default, many workloads continue to run on x86. Graviton2 delivers significantly better price-performance, especially for stateless, scale-out container workloads. Teams that fail to configure task definitions with the `ARM64` architecture miss out on meaningful efficiency gains. Because this setting is not enabled automatically and is often overlooked, it results in higher compute costs for functionally equivalent workloads.
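Opting in is a one-line addition to the task definition's `runtimePlatform`. The sketch below builds a `register_task_definition` payload; the family name, image URI, and sizing are placeholders, and the container image must itself be built for ARM64 (e.g. a multi-arch image).

```python
def arm64_task_definition(
    family: str = "web-api",  # placeholder family
    image: str = "123456789012.dkr.ecr.us-east-1.amazonaws.com/web-api:latest",  # placeholder
) -> dict:
    """Fargate task definition payload pinned to the Graviton (ARM64) architecture."""
    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",
        "cpu": "512",
        "memory": "1024",
        "runtimePlatform": {
            "cpuArchitecture": "ARM64",          # the non-default, cheaper option
            "operatingSystemFamily": "LINUX",
        },
        "containerDefinitions": [
            {"name": "app", "image": image, "essential": True}
        ],
    }
```

Omitting `runtimePlatform` leaves the task on x86, which is why the savings are so often missed.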
S3 buckets configured with SSE-KMS but without Bucket Keys generate a separate KMS request for each object operation. This behavior results in disproportionately high KMS request costs for data-intensive workloads such as analytics, backups, or frequently accessed objects. Bucket Keys allow S3 to cache KMS data keys at the bucket level, reducing the volume of KMS calls and cutting encryption costs—often with no impact on security or performance.
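Enabling Bucket Keys is a single flag in the bucket's encryption configuration. A minimal sketch of the `put_bucket_encryption` payload, with the KMS key ARN as a placeholder:

```python
def bucket_key_encryption(kms_key_arn: str) -> dict:
    """ServerSideEncryptionConfiguration payload: SSE-KMS with Bucket Keys on."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_arn,
                },
                "BucketKeyEnabled": True,  # cache data keys at the bucket level
            }
        ]
    }

# Applied with boto3 (not executed here):
# import boto3
# boto3.client("s3").put_bucket_encryption(
#     Bucket="my-data-bucket",  # placeholder
#     ServerSideEncryptionConfiguration=bucket_key_encryption(
#         "arn:aws:kms:us-east-1:123456789012:key/example"  # placeholder ARN
#     ),
# )
```

Note that the setting applies to newly written objects; existing objects keep per-object KMS calls until rewritten.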
By default, AWS Config can be set to record changes across all supported resource types, including those that change frequently, such as security group rules, IAM role policies, route tables, or network interfaces, which are often ephemeral in containerized or auto-scaling setups. These high-churn resources can generate an outsized number of configuration items and inflate costs, especially in dynamic or large-scale environments.
This inefficiency arises when recording is enabled indiscriminately across all resources without evaluating whether the data is necessary. Without targeted scoping, teams may incur large charges for configuration data that provides minimal value, especially in non-production environments. This can also obscure meaningful compliance signals by introducing noise.
Audit logs are often retained longer than necessary, especially in environments where the logging destination is not carefully selected. Projects that initially route SQL Audit Logs or other high-volume sources to LAW or Azure Storage may forget to revisit their retention strategy. Without policies in place, logs can accumulate unchecked—particularly problematic with SQL logs, which can generate significant volume. Lifecycle Management Policies in Azure Storage are a key tool for addressing this inefficiency but are often overlooked.
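An Azure Storage lifecycle management policy can enforce the retention decision automatically. The sketch below builds the policy document in the management-policy JSON shape; the rule name, container prefix, and 90-day window are assumptions to adapt to your audit-log layout.

```python
def sql_audit_log_policy(days_until_delete: int = 90) -> dict:
    """Azure Storage lifecycle management policy: delete aged audit-log blobs."""
    return {
        "rules": [
            {
                "enabled": True,
                "name": "delete-old-sql-audit-logs",  # placeholder rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        # Placeholder prefix; match your audit-log container path.
                        "prefixMatch": ["sqldbauditlogs/"],
                    },
                    "actions": {
                        "baseBlob": {
                            "delete": {
                                "daysAfterModificationGreaterThan": days_until_delete
                            }
                        }
                    },
                },
            }
        ]
    }
```

The policy runs inside the storage service, so retention holds even if the team that set up the logging pipeline moves on.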
However, tier transitions are not always cost-saving. For example, in cases where log data consists of extremely large numbers of very small files (such as AKS audit logs across many pods), the transaction charges incurred when moving objects between storage tiers may exceed the potential savings from reduced storage rates. In these scenarios, it can be more cost-effective to retain logs in Hot tier until deletion, rather than moving them to lower-cost tiers first.
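The break-even test above can be reduced to simple arithmetic: a one-time per-blob transition (write) charge against a per-GiB monthly storage saving over the remaining retention period. The sketch below uses illustrative rates, not current Azure list prices.

```python
def tiering_net_savings(
    num_blobs: int,
    total_gib: float,
    months_retained: int,
    hot_rate: float = 0.018,          # illustrative Hot $/GiB/month
    cool_rate: float = 0.010,         # illustrative Cool $/GiB/month
    cool_write_per_10k: float = 0.10, # illustrative $ per 10k write ops
) -> float:
    """Net USD savings from moving blobs Hot -> Cool; negative means don't tier."""
    one_time_transition_cost = (num_blobs / 10_000) * cool_write_per_10k
    monthly_storage_savings = total_gib * (hot_rate - cool_rate)
    return monthly_storage_savings * months_retained - one_time_transition_cost
```

With these rates, 50 million tiny AKS audit-log blobs totaling only 100 GiB come out strongly negative (the transaction charges dwarf the storage delta), while a few thousand large blobs holding 10 TiB come out clearly positive: blob count, not data volume, drives the transition cost.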
VPC Flow Logs configured with the ALL filter and delivered to CloudWatch Logs often result in unnecessarily high log ingestion volumes — especially in high-traffic environments. This setup is rarely required for day-to-day monitoring or security use cases but is commonly enabled by default or for temporary debugging and then left in place. As a result, teams incur excessive CloudWatch charges without realizing the logging configuration is misaligned with actual needs.
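A common remediation is to recreate the flow log with a narrower filter and a cheaper destination. As a sketch, the `create_flow_logs` arguments below capture only rejected traffic and deliver to S3; the VPC ID and bucket ARN are placeholders.

```python
def reject_only_flow_logs(vpc_id: str, dest_bucket_arn: str) -> dict:
    """create_flow_logs kwargs: REJECT-only traffic, delivered to S3."""
    return {
        "ResourceIds": [vpc_id],
        "ResourceType": "VPC",
        "TrafficType": "REJECT",       # instead of ALL; keeps security signal
        "LogDestinationType": "s3",    # avoids CloudWatch Logs ingestion charges
        "LogDestination": dest_bucket_arn,
    }

# Applied with boto3 (not executed here):
# import boto3
# boto3.client("ec2").create_flow_logs(
#     **reject_only_flow_logs(
#         "vpc-0abc1234",                       # placeholder VPC
#         "arn:aws:s3:::my-flow-logs-bucket",   # placeholder bucket
#     )
# )
```

Existing flow logs cannot have their `TrafficType` changed in place, so the ALL-filter log must be deleted after the replacement is verified.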