Overprovisioned Memory Allocation in Cloud Run Services
Service Category
Compute
Cloud Provider
GCP
Service Name
GCP Cloud Run
Inefficiency Type
Overprovisioned Resource Allocation
Explanation

In Cloud Run, each revision is deployed with a fixed memory allocation (e.g., 512 MiB, 1 GiB, 2 GiB). These settings are often overestimated during initial development or copied from templates. Cloud Run scales the number of instances with traffic, but it does not resize an instance to fit the workload: billing follows the configured memory, regardless of how much memory the container actually uses during execution. If a service consistently uses significantly less memory than allocated, the result is avoidable overpayment on every request, especially for high-throughput or long-running services. Because both memory and CPU are billed on configured values rather than on consumption, this inefficiency compounds quickly at scale.

Relevant Billing Model

Billed based on:
  • Allocated vCPU and memory (GiB) per request
  • Duration of request execution (per 100 ms increment)
  • Number of requests
  • Additional charges for egress and requests beyond free tier
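
To make the billing model concrete, the sketch below estimates the memory portion of a bill from the factors listed above: allocated GiB, request duration rounded up to the 100 ms increment, and request count. It is a simplified illustration, not the official pricing formula. The price constant is a placeholder assumption (check the current Cloud Run pricing page for your region), and the calculation assumes each request occupies an instance on its own, so it ignores concurrency, the free tier, CPU, and egress charges.

```python
import math

# ASSUMPTION: illustrative placeholder, not an authoritative price.
ASSUMED_PRICE_PER_GIB_SECOND = 0.0000025


def billed_seconds(request_ms: float, increment_ms: int = 100) -> float:
    """Round a request duration up to the next billing increment, in seconds."""
    return math.ceil(request_ms / increment_ms) * increment_ms / 1000


def memory_cost(allocated_gib: float, request_ms: float, requests: int,
                price_per_gib_second: float = ASSUMED_PRICE_PER_GIB_SECOND) -> float:
    """Estimate memory cost: allocated GiB x billed seconds x request count x price.

    Note that actual memory used never appears in this formula; only the
    configured allocation does.
    """
    return allocated_gib * billed_seconds(request_ms) * requests * price_per_gib_second


if __name__ == "__main__":
    # Hypothetical workload: 1,000,000 requests of 230 ms each.
    current = memory_cost(2.0, 230, 1_000_000)
    right_sized = memory_cost(0.5, 230, 1_000_000)
    print(f"2 GiB:   {current:.4f}")
    print(f"512 MiB: {right_sized:.4f}")
```

Because the cost is linear in the allocated amount, moving from 2 GiB to 512 MiB cuts the memory component to a quarter under these assumptions, whatever the real unit price is.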

Detection
  • Review actual memory usage per request over a representative time window
  • Identify services with consistently low memory utilization relative to their configured limits
  • Evaluate whether higher memory tiers were chosen to solve startup latency or cold start issues that no longer apply
  • Cross-reference high-throughput services where per-request efficiency has significant cost impact
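The detection steps above can be sketched as a small screening script. It assumes memory utilization samples (used memory divided by configured memory, as fractions between 0 and 1) have already been exported per service, for example from Cloud Monitoring. The function names, the 95th-percentile choice, and the 0.5 threshold are illustrative assumptions rather than recommended values.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def find_overprovisioned(usage_by_service, threshold=0.5, pct=95):
    """Return services whose pct-th percentile utilization is below threshold.

    usage_by_service maps a service name to a list of utilization fractions
    collected over a representative time window.
    """
    flagged = {}
    for name, samples in usage_by_service.items():
        if not samples:
            continue  # no data: cannot judge this service
        high_water = percentile(samples, pct)
        if high_water < threshold:
            flagged[name] = high_water
    return flagged


if __name__ == "__main__":
    # Hypothetical sample data for two services.
    usage = {
        "api": [0.20, 0.25, 0.30, 0.22],
        "worker": [0.70, 0.85, 0.90],
    }
    print(find_overprovisioned(usage))
```

Using a high percentile instead of the mean keeps occasional spikes in view, so a service is flagged only when even its busier moments stay well under the configured limit.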
Remediation
  • Reconfigure services with right-sized memory allocations aligned to observed usage patterns
  • Test progressively smaller memory configurations to find a stable baseline without introducing latency or OOM errors
  • Implement monitoring for memory pressure or failures to validate new settings
  • Use performance benchmarks and load tests in lower environments before promoting configuration changes to production
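As a starting point for right-sizing, the sketch below turns an observed peak memory figure into a candidate setting by adding headroom and rounding up to a fixed step. The 30% headroom, 128 MiB step, and 128 MiB floor are assumptions for illustration only. Cloud Run also enforces its own memory limits relative to the configured CPU, so any candidate value still needs to be checked against the platform's constraints and validated under load.

```python
import math


def recommend_memory_mib(peak_used_mib: float, headroom: float = 0.3,
                         step_mib: int = 128, floor_mib: int = 128) -> int:
    """Suggest a memory setting: observed peak plus headroom, rounded up to a step.

    All defaults are illustrative assumptions, not platform rules.
    """
    target = peak_used_mib * (1 + headroom)
    rounded = math.ceil(target / step_mib) * step_mib
    return max(floor_mib, rounded)


if __name__ == "__main__":
    # Hypothetical service: configured at 2048 MiB, observed peak of 300 MiB.
    print(recommend_memory_mib(300))
```

A candidate value can then be applied to a test revision, for example with `gcloud run services update SERVICE --memory 512Mi`, and stepped down gradually while watching for latency regressions or OOM errors before promoting the change to production.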