Excessive S3 Request Costs from Filesystem-Oriented I/O Patterns
Ariel Lichterman
CER:

CER-0334

Service Category
Storage
Cloud Provider
AWS
Service Name
AWS S3
Inefficiency Type
Inefficient Architecture
Explanation

Amazon S3 bills for every API request — LIST, HEAD, GET, PUT, COPY, POST, and DELETE — independently of storage charges. Workloads originally designed for locally provisioned storage, where listing a directory, checking a file's existence, or writing a single record is effectively free, carry those assumptions into S3 and convert each step into a billed HTTP request. At scale, request costs can rival or even exceed storage costs, yet they are routinely overlooked by cost-optimization efforts that focus on storage class selection and data transfer.

The waste manifests on both sides of the I/O path. On the read and coordination side, applications generate LIST and HEAD storms: legacy commit protocols recursively list output directories to discover what tasks wrote, query engines re-enumerate partitions on every execution, and consumers poll a prefix on a timer to detect new data. On the write side, metrics, event, and log pipelines issue one small PUT per record instead of buffering and flushing in batches, so PUT volume scales linearly with input rate. Rename-based commit protocols compound the problem because S3 has no native rename — each rename is implemented as a COPY followed by a DELETE, doubling the request count per output file.

The root cause is an architectural mismatch: the application treats S3 as a filesystem it can list, stat, and rename cheaply, when S3 is an object store that charges per HTTP call. Fixing the problem requires shifting coordination, state tracking, and batching into the application layer so that S3 serves bytes rather than acting as a coordination mechanism.
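The shift from filesystem-style coordination to application-owned state can be sketched as a request-count comparison. In the sketch below, a fake in-memory store stands in for S3 and only tallies billed calls; the key layout, the `_SUCCESS` marker convention, and the `manifest.json` name are all illustrative, not prescriptions.

```python
# Request-count sketch: FakeStore stands in for S3 and only tallies billed
# calls. The key layout and "manifest.json" name are illustrative.

class FakeStore:
    def __init__(self, keys):
        self.keys = set(keys)
        self.requests = {"LIST": 0, "GET": 0, "HEAD": 0}

    def list_prefix(self, prefix):
        # One billed LIST request per page of up to 1,000 keys returned.
        matches = [k for k in self.keys if k.startswith(prefix)]
        self.requests["LIST"] += max(1, -(-len(matches) // 1000))
        return matches

    def head(self, key):
        # HEAD is billed whether the key exists or not.
        self.requests["HEAD"] += 1
        return key in self.keys

    def get(self, key):
        self.requests["GET"] += 1
        return key

def discover_by_listing(store, prefix, partitions):
    # Filesystem habit: enumerate the prefix, then check a success marker
    # per partition with HEAD.
    store.list_prefix(prefix)
    return [p for p in partitions if store.head(f"{prefix}{p}/_SUCCESS")]

def discover_by_manifest(store, manifest_key):
    # Object-store habit: one GET against an application-owned manifest
    # answers the same question.
    return store.get(manifest_key)

store = FakeStore([f"data/p{i}/_SUCCESS" for i in range(100)])
discover_by_listing(store, "data/", [f"p{i}" for i in range(100)])
listing_calls = sum(store.requests.values())    # 1 LIST + 100 HEADs = 101

store2 = FakeStore(["data/manifest.json"])
discover_by_manifest(store2, "data/manifest.json")
manifest_calls = sum(store2.requests.values())  # 1 GET
```

The point is the scaling behavior: listing-based discovery grows with partition count on every query, while manifest-based discovery stays constant per query regardless of how many partitions exist.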

Relevant Billing Model

S3 request costs are billed per 1,000 operations, with rates that vary by operation type; consult the S3 pricing page for the current per-operation rates.

Several characteristics of this billing model amplify the cost of filesystem-oriented workloads:

  • Request charges apply regardless of payload size — a PUT of a single-byte object costs the same as a PUT of a multi-gigabyte object
  • Failed requests (including 404 and 403 responses) are billed at the same rate as successful requests, so existence checks via HEAD against non-existent keys still incur charges
  • S3 has no native rename operation; a rename is implemented as a COPY plus a DELETE, each billed separately
  • Storage is billed per GB-month, so a bucket with many tiny objects and high API call volume can easily spend more on requests than on storage
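A back-of-envelope calculation makes the last point concrete. The sketch below compares a month of one 1 KB PUT per record at 1,000 records/second against the same bytes flushed as 64 MB objects. The rates used are the published us-east-1 S3 Standard prices at the time of writing; verify against current pricing before relying on the exact figures.

```python
# Illustrative month-long comparison: one 1 KB PUT per record versus the
# same bytes batched into 64 MB objects. Rates are the us-east-1
# S3 Standard prices at the time of writing; check current pricing.
PUT_PER_1K = 0.005            # $ per 1,000 PUT/COPY/POST/LIST requests
STORAGE_PER_GB_MONTH = 0.023  # $ per GB-month, S3 Standard

records_per_sec = 1_000
record_bytes = 1_024
records = records_per_sec * 30 * 24 * 3600          # ~2.59e9 records/month
total_bytes = records * record_bytes

per_record_put_cost = records / 1_000 * PUT_PER_1K  # one PUT per record
batch_bytes = 64 * 1024**2
batched_puts = -(-total_bytes // batch_bytes)       # ceiling division
batched_put_cost = batched_puts / 1_000 * PUT_PER_1K
storage_cost = total_bytes / 1024**3 * STORAGE_PER_GB_MONTH

print(f"per-record PUTs: ${per_record_put_cost:,.0f}/month")   # ~$12,960
print(f"batched PUTs:    ${batched_put_cost:,.2f}/month")      # ~$0.20
print(f"storage:         ${storage_cost:,.0f}/month")          # ~$57
```

At this illustrative rate, the per-record write path spends over 200 times more on PUT requests than on storing the resulting bytes, while batching collapses the request bill to pocket change for the same data.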
Detection
  • Identify buckets where monthly request-tier costs approach or exceed monthly storage costs, indicating that API call volume — not data volume — is the dominant cost driver
  • Review the ratio of monthly spend to logical bytes stored for each bucket; buckets where this ratio is materially above the headline storage rate warrant investigation
  • Assess which API operation types dominate request volume on candidate buckets — distinguish between LIST/HEAD-heavy patterns (read-side coordination overhead), high-frequency small PUT patterns (write-side per-record ingest), and COPY/DELETE pairs (rename-based commit overhead)
  • Evaluate average object size on write-heavy buckets; consistently small objects well below a block-friendly size suggest unbatched per-record writes
  • Examine whether applications discover data by enumerating prefixes at query or commit time rather than consulting an application-owned manifest, metastore, or table-format index
  • Confirm whether consumers detect new data by periodically listing a prefix rather than reacting to write-side events or notifications
  • Review commit protocols used by data processing frameworks to determine whether request count scales with the number of output files rather than the number of logical commits
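The operation-mix heuristics above can be sketched as a small classifier. It assumes request records have already been reduced to (operation class, object size) pairs, for example by an Athena query over S3 server access logs that maps `REST.GET.BUCKET` to LIST, `REST.PUT.OBJECT` to PUT, and so on; the thresholds are illustrative, not canonical.

```python
from collections import Counter

def classify_request_mix(records):
    """Flag filesystem-style patterns in (operation_class, object_size)
    pairs. Operation classes are assumed pre-normalized to LIST, HEAD,
    GET, PUT, COPY, DELETE; sizes are bytes (None when not applicable).
    Thresholds are illustrative starting points, not canonical values."""
    ops = Counter(op for op, _ in records)
    total = sum(ops.values()) or 1
    put_sizes = [size for op, size in records if op == "PUT" and size]
    findings = []
    # Read-side coordination: discovery and existence checks dominate.
    if (ops["LIST"] + ops["HEAD"]) / total > 0.5:
        findings.append("LIST/HEAD-heavy: read-side coordination overhead")
    # Write-side ingest: average PUT well below a block-friendly size.
    if put_sizes and sum(put_sizes) / len(put_sizes) < 1 << 20:
        findings.append("small PUTs: likely unbatched per-record writes")
    # Rename-based commits: paired COPY and DELETE volume.
    if min(ops["COPY"], ops["DELETE"]) / total > 0.2:
        findings.append("COPY/DELETE pairs: likely rename-based commits")
    return findings

coordination_heavy = classify_request_mix(
    [("LIST", None)] * 60 + [("HEAD", None)] * 10 + [("GET", 2_000_000)] * 30
)
unbatched_writes = classify_request_mix([("PUT", 512)] * 100)
```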
Remediation
  • Audit the application's storage-interaction model across its read, write, and commit paths — identify every component that discovers state by listing, checks existence per-object, writes per-record, or commits by rename, and rank these surfaces by request-cost contribution
  • Re-architect the write path around batching: introduce a buffering layer — in-process or via an upstream stream — sized by time and bytes so that the unit of work reaching S3 is a consolidated block rather than an individual record
  • Re-architect the read and coordination path around an application-owned manifest or index (such as a table-format catalog or metastore) so that discovery, existence checks, and partition enumeration are answered from cached state rather than from LIST and HEAD calls against S3
  • Replace rename-based commit protocols with commit mechanisms whose request count is bounded by the number of logical commits, not the number of output files — purpose-built S3 committers that leverage multipart upload avoid the copy-then-delete overhead entirely
  • Switch consumers from prefix-polling to event-driven notification so that new-data detection does not require periodic LIST calls
  • Establish compaction as a scheduled workload with explicit targets for file count per prefix, ensuring that any residual listing is inexpensive and downstream readers issue fewer GET requests
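The buffering layer from the second remediation step can be sketched as a small writer that flushes on whichever of a byte threshold or an age threshold trips first. The `upload` callable stands in for the real S3 client call (for example, boto3's `put_object` with a generated key); the default thresholds are illustrative.

```python
import time

class BatchingWriter:
    """Accumulate records in memory and flush them to the object store as
    one consolidated PUT per batch. `upload` is a callable(bytes) -> None
    standing in for the real S3 client call; thresholds are illustrative."""

    def __init__(self, upload, max_bytes=64 * 1024**2, max_age_s=60.0):
        self.upload = upload
        self.max_bytes = max_bytes
        self.max_age_s = max_age_s
        self.buf = bytearray()
        self.opened_at = None  # monotonic time of the first buffered record

    def write(self, record: bytes):
        if self.opened_at is None:
            self.opened_at = time.monotonic()
        self.buf += record
        # Flush when either the size bound or the age bound is crossed.
        if (len(self.buf) >= self.max_bytes
                or time.monotonic() - self.opened_at >= self.max_age_s):
            self.flush()

    def flush(self):
        if self.buf:
            self.upload(bytes(self.buf))  # one consolidated PUT
            self.buf.clear()
            self.opened_at = None

uploads = []
writer = BatchingWriter(uploads.append, max_bytes=1024, max_age_s=3600)
for _ in range(100):
    writer.write(b"x" * 64)   # 100 records of 64 bytes
writer.flush()                # drain the tail on shutdown
```

With these toy thresholds, 100 records produce 7 PUTs instead of 100; with production-sized batches the ratio is far larger, and PUT volume scales with flushed batches rather than with input rate. A real deployment also needs a final flush on shutdown (shown above) and a size bound chosen so batches stay well under per-object limits.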