Azure NetApp Files bills based on provisioned capacity pool size — not on the actual data stored within volumes. This means that when a capacity pool is provisioned at a size significantly larger than the sum of volume quotas allocated within it, the organization pays for stranded, unallocated capacity every hour. For example, a 10 TiB capacity pool with only 6 TiB of volume quotas allocated has 4 TiB of capacity that generates cost but serves no purpose.
This overprovisioning commonly occurs for several reasons. Capacity pools do not automatically shrink — since April 2021, pool sizing is entirely a manual customer responsibility. When volumes are deleted, the freed capacity remains in the pool unless an administrator explicitly resizes it downward. Additionally, with auto QoS pools, volume quotas directly determine throughput performance, which incentivizes teams to set larger quotas than their data requires, further inflating pool sizes. Over time, these dynamics create a growing gap between provisioned pool capacity and what is actually needed, resulting in persistent, avoidable charges that compound across multiple pools and regions.
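The waste described above is simple to quantify. The sketch below uses the 10 TiB pool from the example; the hourly per-TiB price is a placeholder, not a published Azure rate:

```python
def stranded_capacity_cost(pool_size_tib, volume_quotas_tib, price_per_tib_hour):
    """Estimate the monthly cost of unallocated capacity in an Azure
    NetApp Files capacity pool. The price is a placeholder rate."""
    allocated_tib = sum(volume_quotas_tib)
    stranded_tib = max(pool_size_tib - allocated_tib, 0)
    hours_per_month = 730  # average hours in a month
    return stranded_tib * price_per_tib_hour * hours_per_month

# The example from the text: a 10 TiB pool with 6 TiB of volume quotas
# leaves 4 TiB stranded, billed every hour.
monthly_waste = stranded_capacity_cost(10, [4, 2], price_per_tib_hour=0.20)
```

Running the same calculation across every pool and region turns the "growing gap" into a concrete monthly figure that can justify a resize.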
In November 2025, AWS introduced an Archive storage class for private ECR repositories, marketed as a way to reduce storage costs for large volumes of rarely used container images. However, Archive storage pricing is identical to Standard storage pricing for the first 150 TB per month. Below this threshold, Archive provides no storage savings yet introduces a per-gigabyte retrieval charge, a retrieval delay of up to 20 minutes, and a 90-day minimum storage duration. Adopting the Archive storage class before meeting the 150 TB threshold means paying the same storage price but taking on additional fees and operational overhead.
This inefficiency is easy to miss because the AWS announcement emphasized cost savings for "large volumes" without quantifying "large" or prominently disclosing the retrieval charge and the minimum storage duration. In other AWS services, optional storage classes typically offer a storage price reduction from the first byte in exchange for access penalties. With ECR, however, the access penalties apply as described while the storage price is unchanged for the first 150 TB, a volume of container image storage that few organizations reach.
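The trade-off below the threshold can be illustrated with a quick calculation. Both prices here are placeholders for illustration only, not published AWS rates:

```python
# Placeholder prices for illustration only; look up current ECR rates.
STORAGE_PER_GB = 0.10            # same for Standard and Archive < 150 TB
ARCHIVE_RETRIEVAL_PER_GB = 0.03  # hypothetical per-GB retrieval fee

def monthly_cost(stored_gb, retrieved_gb, use_archive):
    """Monthly ECR cost for a registry below the 150 TB threshold."""
    cost = stored_gb * STORAGE_PER_GB  # identical storage charge
    if use_archive:
        cost += retrieved_gb * ARCHIVE_RETRIEVAL_PER_GB
    return cost

# A registry with 5 TB of images and 200 GB of monthly pulls from
# Archive sits far below the 150 TB threshold:
standard = monthly_cost(5_000, 200, use_archive=False)
archive = monthly_cost(5_000, 200, use_archive=True)
# Below the threshold, Archive is strictly more expensive: same storage
# charge plus retrieval fees, plus the delay and minimum-duration risk.
```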
Organizations often use the Standard-Infrequent Access (Standard-IA) storage class based on documentation and code that predate the 2021 updates to the Intelligent Tiering storage class. Those updates made Intelligent Tiering suitable as an initial S3 storage class even for objects that are small or short-lived, and added a heavily discounted access tier. Older internal runbooks, lifecycle policies (including ones specified in infrastructure-as-code templates), scripts, programs, and public examples may still default to Standard-IA, inflating storage costs.
This inefficiency report compares Standard-IA with Intelligent Tiering; it is not intended to cover other storage classes. Note that S3 storage is billed per gibibyte (GiB, powers of 2) rather than per gigabyte (GB, powers of 10), which matters both for the minimum billable size of small objects and when estimating costs for large volumes of storage.
Relative to the Standard storage class, the Standard-IA storage class offers a moderate, constant storage price discount but imposes a minimum billable object size of 128 KiB, a minimum storage duration of 30 days, and a per-GiB retrieval charge.
In contrast, AWS updated the Intelligent Tiering storage class in September 2021, eliminating the minimum storage duration and exempting small objects from the monthly per-object monitoring and automation charge. Intelligent Tiering never had retrieval charges. In November 2021, AWS added the heavily discounted Archive Instant Access tier.
For objects stored beyond a few months, Intelligent Tiering's progressive storage price discounts surpass Standard-IA's constant discount. Storage savings accumulate each month. Objects in the Intelligent Tiering storage class automatically move through progressively cheaper access tiers unless the objects are accessed. Intelligent Tiering also avoids Standard-IA's minimum billable object size and minimum storage duration penalties.
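The crossover can be sketched for an object that is stored and never accessed. The prices below are approximate, modeled loosely on published us-east-1 rates (verify against current pricing), and the sketch ignores the per-object monitoring charge, from which small objects are exempt:

```python
# Illustrative per-GiB-month prices; check current pricing before use.
IA_PRICE = 0.0125
IT_FREQUENT, IT_INFREQUENT, IT_ARCHIVE_INSTANT = 0.023, 0.0125, 0.004

def it_price_for_month(month):
    """Price paid in a given month of storage (month 1 = first) for an
    Intelligent Tiering object that is never accessed."""
    if month <= 1:
        return IT_FREQUENT        # first ~30 days: Frequent Access tier
    if month <= 3:
        return IT_INFREQUENT      # days ~30-90: Infrequent Access tier
    return IT_ARCHIVE_INSTANT     # after ~90 days: Archive Instant Access

def cumulative_per_gib(price_fn, months):
    return sum(price_fn(m) for m in range(1, months + 1))

ia_year = IA_PRICE * 12
it_year = cumulative_per_gib(it_price_for_month, 12)
# For an unaccessed object, Intelligent Tiering becomes cheaper than
# Standard-IA within the first year, and the gap widens every month.
```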
When external Delta tables are dropped from Databricks Unity Catalog or the legacy Hive metastore, only the table metadata is removed — the underlying data files in cloud object storage (such as S3, ADLS, or GCS) remain untouched and continue to incur per-GB-month storage charges. This behavior is by design: external tables decouple metadata from data lifecycle management, meaning Databricks explicitly does not delete the underlying storage when an external table is dropped. The result is orphaned storage — files that no longer have any catalog reference, are not consumed by any downstream pipeline, and deliver no business value, yet continue to accumulate charges indefinitely.
This pattern is particularly prevalent in environments using medallion architecture (bronze/silver/gold layers), where tables are frequently recreated during pipeline evolution, schema experimentation, or migration between environments. Development and test workloads compound the problem, as teams routinely create and abandon external table references without cleaning up the associated storage. Unlike managed tables in Unity Catalog — which have a retention period with recovery capability before automatic deletion — external tables offer no such safety net. The orphaned storage is structurally invisible to standard cost dashboards because it appears as generic object storage charges, not as Databricks-specific line items. Over time, this silent accumulation can represent a meaningful share of an organization's total storage spend.
Importantly, Databricks VACUUM operations do not address this pattern. VACUUM cleans up old file versions within active Delta tables, but it cannot act on storage paths that have been completely disconnected from catalog metadata through external table drops. The only way to reclaim this storage is to manually identify and delete the orphaned files in cloud storage.
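One way to find candidates for cleanup is to diff a storage inventory against the external locations still registered in the catalog. The sketch below assumes both lists have already been collected (for example, from a cloud storage inventory report and from the catalog's table metadata); all paths and names here are hypothetical:

```python
def find_orphaned_paths(storage_paths, catalog_locations):
    """Return storage paths not covered by any external table location
    still registered in the catalog. Inputs are normalized URI prefixes."""
    def covered(path):
        return any(path == loc or path.startswith(loc.rstrip("/") + "/")
                   for loc in catalog_locations)
    return sorted(p for p in storage_paths if not covered(p))

# Hypothetical inventory: prefixes listed from object storage versus the
# LOCATION values of external tables that still exist in the catalog.
orphans = find_orphaned_paths(
    ["s3://lake/bronze/orders", "s3://lake/bronze/customers",
     "s3://lake/tmp/experiment_42"],
    ["s3://lake/bronze/orders", "s3://lake/bronze/customers"],
)
```

Any path returned should still be reviewed by the owning team before deletion, since storage may be referenced by systems outside the catalog.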
This inefficiency occurs when backup data remains in a Recovery Services Vault after the original protected resource has been deleted. These orphaned backups continue to consume storage and generate cost despite no longer supporting an active workload. In addition, long-retained backups that are rarely accessed are often kept in higher-cost tiers, increasing storage spend without providing additional value.
This inefficiency occurs when backup data persists longer than intended due to misaligned or outdated retention policies. It often arises when retention requirements change over time, but older recovery points are not evaluated or cleaned up accordingly. In some cases, manually configured backups or legacy policies remain in place even after operational or compliance needs have been reduced.
As a result, backup storage continues to grow and incur cost without delivering additional recovery value.
This inefficiency occurs when a protected resource (such as a virtual machine, database, or file share) is decommissioned without explicitly stopping backup protection. In these cases, Azure Backup continues to retain existing recovery points in the vault until the retention policy expires. Although the source resource no longer exists, backup storage remains allocated and billable, resulting in unnecessary ongoing costs.
This pattern is common when infrastructure is deleted outside of a formal decommissioning process or when backup ownership is unclear.
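Detecting these orphans amounts to comparing the source resource IDs recorded on protected items against the resource IDs that still exist (for example, from an Azure Resource Graph query). The sketch below assumes both inventories have already been exported; the item names and IDs are hypothetical:

```python
def orphaned_backup_items(protected_items, existing_resource_ids):
    """Return names of backup items whose source resource no longer
    exists. protected_items: (item_name, source_resource_id) pairs."""
    existing = {rid.lower() for rid in existing_resource_ids}
    return [name for name, rid in protected_items
            if rid.lower() not in existing]

# Hypothetical inventory: two protected VMs, one already deleted.
orphans = orphaned_backup_items(
    [("vm-app-01", "/subscriptions/s1/resourceGroups/rg1/providers/"
                   "Microsoft.Compute/virtualMachines/vm-app-01"),
     ("vm-old-db", "/subscriptions/s1/resourceGroups/rg1/providers/"
                   "Microsoft.Compute/virtualMachines/vm-old-db")],
    ["/subscriptions/s1/resourceGroups/rg1/providers/"
     "Microsoft.Compute/virtualMachines/vm-app-01"],
)
```

Items flagged this way can then have protection stopped with backup data deleted, or retained deliberately if a recovery requirement still applies.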
When S3 versioning is enabled but no lifecycle rules are defined for non-current objects, outdated versions accumulate indefinitely. These non-current versions are rarely accessed but continue to incur storage charges. Over time, this leads to significant hidden costs, particularly in buckets with frequent object updates or automated data pipelines. Proper lifecycle management is required to limit or expire obsolete versions.
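A lifecycle rule that caps version accumulation might look like the following; the rule ID, day count, and retained-version count are illustrative choices, not recommendations:

```python
# Lifecycle rule: expire noncurrent versions 30 days after they become
# noncurrent, while always retaining the 2 most recent noncurrent versions.
lifecycle = {
    "Rules": [
        {
            "ID": "expire-noncurrent-versions",
            "Status": "Enabled",
            "Filter": {},  # empty filter: apply to the whole bucket
            "NoncurrentVersionExpiration": {
                "NoncurrentDays": 30,
                "NewerNoncurrentVersions": 2,
            },
        }
    ]
}

# Applied with boto3 (call commented out; requires AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=lifecycle)
```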
Many organizations default to storing all EFS data in the Standard class, regardless of how frequently data is accessed. This results in inefficient spend for workloads with significant portions of data that are rarely read. EFS IA and Archive tiers offer lower-cost alternatives for data with low or near-zero access, while Intelligent Tiering can automate placement decisions. Failing to leverage these options wastes storage spend and reduces cost efficiency.
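An EFS lifecycle configuration that enables tiering could be sketched with boto3 as follows; the transition thresholds and file system ID are illustrative:

```python
# EFS lifecycle policies: transition files to IA after 30 days without
# access, to Archive after 90 days, and back to Standard on first access
# (the transition-back policy is what enables intelligent tiering).
lifecycle_policies = [
    {"TransitionToIA": "AFTER_30_DAYS"},
    {"TransitionToArchive": "AFTER_90_DAYS"},
    {"TransitionToPrimaryStorageClass": "AFTER_1_ACCESS"},
]

# Applied with boto3 (call commented out; requires AWS credentials):
# import boto3
# boto3.client("efs").put_lifecycle_configuration(
#     FileSystemId="fs-0123456789abcdef0",  # hypothetical file system ID
#     LifecyclePolicies=lifecycle_policies)
```

Because IA and Archive carry per-access charges, the thresholds should be tuned to actual access patterns rather than copied verbatim.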
S3 buckets configured with SSE-KMS but without Bucket Keys generate a separate KMS request for each object operation. This behavior results in disproportionately high KMS request costs for data-intensive workloads such as analytics, backups, or frequently accessed objects. Bucket Keys allow S3 to cache KMS data keys at the bucket level, reducing the volume of KMS calls and cutting encryption costs—often with no impact on security or performance.
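Enabling Bucket Keys as part of a bucket's default encryption can be sketched with boto3; the bucket name and KMS key alias are hypothetical:

```python
# Default bucket encryption: SSE-KMS with Bucket Keys enabled, so S3
# caches a bucket-level data key instead of calling KMS per object.
encryption = {
    "Rules": [
        {
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": "alias/my-app-key",  # hypothetical alias
            },
            "BucketKeyEnabled": True,
        }
    ]
}

# Applied with boto3 (call commented out; requires AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_encryption(
#     Bucket="my-bucket",  # hypothetical bucket
#     ServerSideEncryptionConfiguration=encryption)
```

Note that Bucket Keys apply to objects written after the setting is enabled; existing objects keep their per-object keys until rewritten.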