Azure NetApp Files bills based on provisioned capacity pool size — not on the actual data stored within volumes. This means that when a capacity pool is provisioned at a size significantly larger than the sum of volume quotas allocated within it, the organization pays for stranded, unallocated capacity every hour. For example, a 10 TiB capacity pool with only 6 TiB of volume quotas allocated has 4 TiB of capacity that generates cost but serves no purpose.
This overprovisioning commonly occurs for several reasons. Capacity pools do not automatically shrink — since April 2021, pool sizing is entirely a manual customer responsibility. When volumes are deleted, the freed capacity remains in the pool unless an administrator explicitly resizes it downward. Additionally, with auto QoS pools, volume quotas directly determine throughput performance, which incentivizes teams to set larger quotas than their data requires, further inflating pool sizes. Over time, these dynamics create a growing gap between provisioned pool capacity and what is actually needed, resulting in persistent, avoidable charges that compound across multiple pools and regions.
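The waste described above is simple to quantify. The sketch below uses the 10 TiB pool from the example; the hourly per-TiB price is a placeholder, not a published Azure rate:

```python
def stranded_capacity_cost(pool_size_tib, volume_quotas_tib, price_per_tib_hour):
    """Estimate the monthly cost of unallocated capacity in an Azure
    NetApp Files capacity pool. The price is a placeholder rate."""
    allocated_tib = sum(volume_quotas_tib)
    stranded_tib = max(pool_size_tib - allocated_tib, 0)
    hours_per_month = 730  # average hours in a month
    return stranded_tib * price_per_tib_hour * hours_per_month

# The example from the text: a 10 TiB pool with 6 TiB of volume quotas
# leaves 4 TiB stranded, billed every hour.
monthly_waste = stranded_capacity_cost(10, [4, 2], price_per_tib_hour=0.20)
```

Running the same calculation across every pool and region turns the "growing gap" into a concrete monthly figure that can justify a resize.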
In November 2025, AWS introduced an Archive storage class for private ECR repositories, marketed as a way to reduce storage costs for large volumes of rarely used container images. However, Archive storage pricing is identical to Standard storage pricing for the first 150 TB per month. Below this threshold, Archive provides no storage savings yet introduces a per-gigabyte retrieval charge, a retrieval delay of up to 20 minutes, and a 90-day minimum storage duration. Adopting the Archive storage class before meeting the 150 TB threshold means paying the same storage price but taking on additional fees and operational overhead.
This inefficiency is easy to miss because the AWS announcement emphasized cost savings for "large volumes" without quantifying "large" or prominently disclosing the retrieval charge and the minimum storage duration. In other AWS services, optional storage classes typically offer a storage price reduction from the first byte in exchange for access penalties. With ECR, however, the access penalties apply as described while the storage price is unchanged for the first 150 TB, a volume of container image storage that few organizations reach.
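The trade-off below the threshold can be illustrated with a quick calculation. Both prices here are placeholders for illustration only, not published AWS rates:

```python
# Placeholder prices for illustration only; look up current ECR rates.
STORAGE_PER_GB = 0.10            # same for Standard and Archive < 150 TB
ARCHIVE_RETRIEVAL_PER_GB = 0.03  # hypothetical per-GB retrieval fee

def monthly_cost(stored_gb, retrieved_gb, use_archive):
    """Monthly ECR cost for a registry below the 150 TB threshold."""
    cost = stored_gb * STORAGE_PER_GB  # identical storage charge
    if use_archive:
        cost += retrieved_gb * ARCHIVE_RETRIEVAL_PER_GB
    return cost

# A registry with 5 TB of images and 200 GB of monthly pulls from
# Archive sits far below the 150 TB threshold:
standard = monthly_cost(5_000, 200, use_archive=False)
archive = monthly_cost(5_000, 200, use_archive=True)
# Below the threshold, Archive is strictly more expensive: same storage
# charge plus retrieval fees, plus the delay and minimum-duration risk.
```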
Organizations often use the Standard-Infrequent Access (Standard-IA) storage class based on documentation and code that predate the 2021 updates to the Intelligent Tiering storage class. Those updates made Intelligent Tiering suitable as an initial S3 storage class even for objects that are small or short-lived, and added a heavily discounted access tier. Older internal runbooks, lifecycle policies (including ones specified in infrastructure-as-code templates), scripts, programs, and public examples may still default to Standard-IA, inflating storage costs.
This inefficiency report compares Standard-IA with Intelligent Tiering; it is not intended to cover other storage classes. Note that S3 storage is billed per gibibyte (GiB, powers of 2) rather than per gigabyte (GB, powers of 10), which matters both for the minimum billable size of small objects and when estimating costs for large volumes of storage.
Relative to the Standard storage class, the Standard-IA storage class offers a moderate, constant storage price discount but imposes a minimum billable object size of 128 KiB, a minimum storage duration of 30 days, and a per-GiB retrieval charge.
In contrast, AWS updated the Intelligent Tiering storage class in September 2021, eliminating the minimum storage duration and exempting small objects from the monthly per-object monitoring and automation charge. Intelligent Tiering never had retrieval charges. In November 2021, AWS added the heavily discounted Archive Instant Access tier.
For objects stored beyond a few months, Intelligent Tiering's progressive storage price discounts surpass Standard-IA's constant discount. Storage savings accumulate each month. Objects in the Intelligent Tiering storage class automatically move through progressively cheaper access tiers unless the objects are accessed. Intelligent Tiering also avoids Standard-IA's minimum billable object size and minimum storage duration penalties.
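The crossover can be sketched for an object that is stored and never accessed. The prices below are approximate, modeled loosely on published us-east-1 rates (verify against current pricing), and the sketch ignores the per-object monitoring charge, from which small objects are exempt:

```python
# Illustrative per-GiB-month prices; check current pricing before use.
IA_PRICE = 0.0125
IT_FREQUENT, IT_INFREQUENT, IT_ARCHIVE_INSTANT = 0.023, 0.0125, 0.004

def it_price_for_month(month):
    """Price paid in a given month of storage (month 1 = first) for an
    Intelligent Tiering object that is never accessed."""
    if month <= 1:
        return IT_FREQUENT        # first ~30 days: Frequent Access tier
    if month <= 3:
        return IT_INFREQUENT      # days ~30-90: Infrequent Access tier
    return IT_ARCHIVE_INSTANT     # after ~90 days: Archive Instant Access

def cumulative_per_gib(price_fn, months):
    return sum(price_fn(m) for m in range(1, months + 1))

ia_year = IA_PRICE * 12
it_year = cumulative_per_gib(it_price_for_month, 12)
# For an unaccessed object, Intelligent Tiering becomes cheaper than
# Standard-IA within the first year, and the gap widens every month.
```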
When external Delta tables are dropped from Databricks Unity Catalog or the legacy Hive metastore, only the table metadata is removed — the underlying data files in cloud object storage (such as S3, ADLS, or GCS) remain untouched and continue to incur per-GB-month storage charges. This behavior is by design: external tables decouple metadata from data lifecycle management, meaning Databricks explicitly does not delete the underlying storage when an external table is dropped. The result is orphaned storage — files that no longer have any catalog reference, are not consumed by any downstream pipeline, and deliver no business value, yet continue to accumulate charges indefinitely.
This pattern is particularly prevalent in environments using medallion architecture (bronze/silver/gold layers), where tables are frequently recreated during pipeline evolution, schema experimentation, or migration between environments. Development and test workloads compound the problem, as teams routinely create and abandon external table references without cleaning up the associated storage. Unlike managed tables in Unity Catalog — which have a retention period with recovery capability before automatic deletion — external tables offer no such safety net. The orphaned storage is structurally invisible to standard cost dashboards because it appears as generic object storage charges, not as Databricks-specific line items. Over time, this silent accumulation can represent a meaningful share of an organization's total storage spend.
Importantly, Databricks VACUUM operations do not address this pattern. VACUUM cleans up old file versions within active Delta tables, but it cannot act on storage paths that have been completely disconnected from catalog metadata through external table drops. The only way to reclaim this storage is to manually identify and delete the orphaned files in cloud storage.
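One way to find candidates for cleanup is to diff a storage inventory against the external locations still registered in the catalog. The sketch below assumes both lists have already been collected (for example, from a cloud storage inventory report and from the catalog's table metadata); all paths and names here are hypothetical:

```python
def find_orphaned_paths(storage_paths, catalog_locations):
    """Return storage paths not covered by any external table location
    still registered in the catalog. Inputs are normalized URI prefixes."""
    def covered(path):
        return any(path == loc or path.startswith(loc.rstrip("/") + "/")
                   for loc in catalog_locations)
    return sorted(p for p in storage_paths if not covered(p))

# Hypothetical inventory: prefixes listed from object storage versus the
# LOCATION values of external tables that still exist in the catalog.
orphans = find_orphaned_paths(
    ["s3://lake/bronze/orders", "s3://lake/bronze/customers",
     "s3://lake/tmp/experiment_42"],
    ["s3://lake/bronze/orders", "s3://lake/bronze/customers"],
)
```

Any path returned should still be reviewed by the owning team before deletion, since storage may be referenced by systems outside the catalog.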
This inefficiency occurs when backup data remains in a Recovery Services Vault after the original protected resource has been deleted. These orphaned backups continue to consume storage and generate cost despite no longer supporting an active workload. In addition, long-retained backups that are rarely accessed are often kept in higher-cost tiers, increasing storage spend without providing additional value.
This inefficiency occurs when backup data persists longer than intended due to misaligned or outdated retention policies. It often arises when retention requirements change over time, but older recovery points are not evaluated or cleaned up accordingly. In some cases, manually configured backups or legacy policies remain in place even after operational or compliance needs have been reduced.
As a result, backup storage continues to grow and incur cost without delivering additional recovery value.
This inefficiency occurs when a protected resource (such as a virtual machine, database, or file share) is decommissioned without explicitly stopping backup protection. In these cases, Azure Backup continues to retain existing recovery points in the vault until the retention policy expires. Although the source resource no longer exists, backup storage remains allocated and billable, resulting in unnecessary ongoing costs.
This pattern is common when infrastructure is deleted outside of a formal decommissioning process or when backup ownership is unclear.
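Detecting these orphans amounts to comparing the source resource IDs recorded on protected items against the resource IDs that still exist (for example, from an Azure Resource Graph query). The sketch below assumes both inventories have already been exported; the item names and IDs are hypothetical:

```python
def orphaned_backup_items(protected_items, existing_resource_ids):
    """Return names of backup items whose source resource no longer
    exists. protected_items: (item_name, source_resource_id) pairs."""
    existing = {rid.lower() for rid in existing_resource_ids}
    return [name for name, rid in protected_items
            if rid.lower() not in existing]

# Hypothetical inventory: two protected VMs, one already deleted.
orphans = orphaned_backup_items(
    [("vm-app-01", "/subscriptions/s1/resourceGroups/rg1/providers/"
                   "Microsoft.Compute/virtualMachines/vm-app-01"),
     ("vm-old-db", "/subscriptions/s1/resourceGroups/rg1/providers/"
                   "Microsoft.Compute/virtualMachines/vm-old-db")],
    ["/subscriptions/s1/resourceGroups/rg1/providers/"
     "Microsoft.Compute/virtualMachines/vm-app-01"],
)
```

Items flagged this way can then have protection stopped with backup data deleted, or retained deliberately if a recovery requirement still applies.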
When S3 versioning is enabled but no lifecycle rules are defined for non-current objects, outdated versions accumulate indefinitely. These non-current versions are rarely accessed but continue to incur storage charges. Over time, this leads to significant hidden costs, particularly in buckets with frequent object updates or automated data pipelines. Proper lifecycle management is required to limit or expire obsolete versions.
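A lifecycle rule that caps version accumulation might look like the following; the rule ID, day count, and retained-version count are illustrative choices, not recommendations:

```python
# Lifecycle rule: expire noncurrent versions 30 days after they become
# noncurrent, while always retaining the 2 most recent noncurrent versions.
lifecycle = {
    "Rules": [
        {
            "ID": "expire-noncurrent-versions",
            "Status": "Enabled",
            "Filter": {},  # empty filter: apply to the whole bucket
            "NoncurrentVersionExpiration": {
                "NoncurrentDays": 30,
                "NewerNoncurrentVersions": 2,
            },
        }
    ]
}

# Applied with boto3 (call commented out; requires AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=lifecycle)
```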
Many organizations default to storing all EFS data in the Standard class, regardless of how frequently data is accessed. This results in inefficient spend for workloads with significant portions of data that are rarely read. EFS IA and Archive tiers offer lower-cost alternatives for data with low or near-zero access, while Intelligent Tiering can automate placement decisions. Failing to leverage these options wastes storage spend and reduces cost efficiency.
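An EFS lifecycle configuration that enables tiering could be sketched with boto3 as follows; the transition thresholds and file system ID are illustrative:

```python
# EFS lifecycle policies: transition files to IA after 30 days without
# access, to Archive after 90 days, and back to Standard on first access
# (the transition-back policy is what enables intelligent tiering).
lifecycle_policies = [
    {"TransitionToIA": "AFTER_30_DAYS"},
    {"TransitionToArchive": "AFTER_90_DAYS"},
    {"TransitionToPrimaryStorageClass": "AFTER_1_ACCESS"},
]

# Applied with boto3 (call commented out; requires AWS credentials):
# import boto3
# boto3.client("efs").put_lifecycle_configuration(
#     FileSystemId="fs-0123456789abcdef0",  # hypothetical file system ID
#     LifecyclePolicies=lifecycle_policies)
```

Because IA and Archive carry per-access charges, the thresholds should be tuned to actual access patterns rather than copied verbatim.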
S3 buckets configured with SSE-KMS but without Bucket Keys generate a separate KMS request for each object operation. This behavior results in disproportionately high KMS request costs for data-intensive workloads such as analytics, backups, or frequently accessed objects. Bucket Keys allow S3 to cache KMS data keys at the bucket level, reducing the volume of KMS calls and cutting encryption costs—often with no impact on security or performance.
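Enabling Bucket Keys as part of a bucket's default encryption can be sketched with boto3; the bucket name and KMS key alias are hypothetical:

```python
# Default bucket encryption: SSE-KMS with Bucket Keys enabled, so S3
# caches a bucket-level data key instead of calling KMS per object.
encryption = {
    "Rules": [
        {
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": "alias/my-app-key",  # hypothetical alias
            },
            "BucketKeyEnabled": True,
        }
    ]
}

# Applied with boto3 (call commented out; requires AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_encryption(
#     Bucket="my-bucket",  # hypothetical bucket
#     ServerSideEncryptionConfiguration=encryption)
```

Note that Bucket Keys apply to objects written after the setting is enabled; existing objects keep their per-object keys until rewritten.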