Underutilized Kubernetes Workload

Service Category

Compute

Cloud Provider

AWS

Service Name

Amazon EKS

Inefficiency Type

Underutilization

Explanation

When Kubernetes workloads request more CPU and memory than they actually consume, the scheduler reserves node capacity that goes unused. This lowers pod density per node and forces the cluster to run more instances than the workloads need. Aligning resource requests with observed utilization improves cluster efficiency and reduces compute spend without sacrificing application performance.
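The effect of requests on node count can be illustrated with a rough sketch. The function name, replica count, and node sizes below are hypothetical, and the model is deliberately simplified: it assumes identical pods, counts only allocatable CPU and memory, and ignores per-node pod limits, DaemonSets, and system reservations.

```python
import math


def nodes_needed(pod_count, cpu_request_m, mem_request_mib, node_cpu_m, node_mem_mib):
    """Estimate the minimum node count when every pod reserves its full request.

    CPU is in millicores and memory in MiB. This is a simplified bin-packing
    estimate for identical pods, not a model of the real scheduler.
    """
    # The scarcer of the two resources limits how many pods fit on one node.
    pods_per_node = min(node_cpu_m // cpu_request_m, node_mem_mib // mem_request_mib)
    if pods_per_node == 0:
        raise ValueError("a single pod does not fit on the node")
    return math.ceil(pod_count / pods_per_node)


# Hypothetical workload: 40 replicas on nodes with 4000m CPU and 16384 MiB allocatable.
before = nodes_needed(40, cpu_request_m=1000, mem_request_mib=2048,
                      node_cpu_m=4000, node_mem_mib=16384)
after = nodes_needed(40, cpu_request_m=250, mem_request_mib=1024,
                     node_cpu_m=4000, node_mem_mib=16384)
print(before, after)  # prints "10 3"
```

In this made-up case, lowering the requests to values closer to actual usage lets the same 40 replicas fit on 3 nodes instead of 10.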

Relevant Billing Model

EKS control planes are billed per cluster per hour, while compute is billed separately: EC2 nodes by instance, and Fargate by the vCPU and memory provisioned for each pod. Kubernetes resource "requests" for CPU and memory drive scheduling decisions and determine how much capacity is reserved. Overprovisioned requests therefore lead to inefficient node usage and inflated infrastructure costs.

Detection

Identify workloads where average CPU and memory usage are consistently much lower than requested values
Analyze container-level metrics to assess request-to-usage ratios over time
Leverage Vertical Pod Autoscaler recommendations, if available, to identify right-sizing opportunities
Evaluate whether overprovisioning was intentional for specific performance or reliability requirements
Validate proposed adjustments with application owners or SRE teams to ensure they meet operational needs
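The first two detection steps can be sketched as a simple usage-to-request ratio check. This is a minimal illustration, not a prescribed method: the ContainerSample structure, the sample values, and the 0.4 threshold are all assumptions, and in practice the usage series would come from whatever container metrics source the cluster already exposes.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class ContainerSample:
    """Hypothetical record of one container's requests and observed usage."""
    name: str
    cpu_request_m: float        # requested CPU in millicores
    mem_request_mib: float      # requested memory in MiB
    cpu_usage_m: List[float]    # observed CPU samples over the review window
    mem_usage_mib: List[float]  # observed memory samples over the review window


def utilization(c):
    """Return (cpu_ratio, mem_ratio) of average usage to request."""
    return mean(c.cpu_usage_m) / c.cpu_request_m, mean(c.mem_usage_mib) / c.mem_request_mib


def find_underutilized(containers, threshold=0.4):
    """Flag containers whose average CPU and memory usage both fall below
    the threshold fraction of their requests. The threshold is an assumed
    value to tune per environment."""
    flagged = []
    for c in containers:
        cpu_ratio, mem_ratio = utilization(c)
        if cpu_ratio < threshold and mem_ratio < threshold:
            flagged.append((c.name, round(cpu_ratio, 2), round(mem_ratio, 2)))
    return flagged


samples = [
    ContainerSample("api", 1000, 2048, [120, 150, 180], [400, 420, 440]),
    ContainerSample("worker", 500, 1024, [400, 450, 350], [800, 900, 850]),
]
print(find_underutilized(samples))  # prints [('api', 0.15, 0.21)]
```

A flagged container is only a candidate. The last two steps above still apply, since averages hide bursts and some headroom may be intentional.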

Remediation

Update the CPU and memory requests for underutilized workloads to better match observed usage patterns. Apply changes through updated Kubernetes manifests or infrastructure-as-code pipelines. Monitor workloads after adjustment to confirm that performance and reliability remain within acceptable thresholds. Establish regular right-sizing reviews to keep resource allocations efficient as workloads evolve.
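One possible way to turn observed usage into a proposed request is to take a high percentile of the samples and add headroom. The sketch below assumes a nearest-rank percentile, a 95th-percentile default, and 20% headroom; none of these values come from the guidance above, and memory in particular may warrant a more conservative setting because undersized memory requests can lead to evictions or out-of-memory kills.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numeric samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend_request(samples, pct=95, headroom_pct=20, floor=0):
    """Propose a request: the pct-th percentile of usage plus headroom,
    rounded up and never below the floor. All defaults are assumptions."""
    base = percentile(samples, pct)
    return max(floor, math.ceil(base * (100 + headroom_pct) / 100))


# Hypothetical CPU usage samples in millicores, including one burst to 300m.
cpu_usage_m = [100, 120, 110, 130, 300, 125, 115, 105, 135, 140]
print(recommend_request(cpu_usage_m))          # prints 360 (the burst sets the p95)
print(recommend_request(cpu_usage_m, pct=90))  # prints 168
```

The resulting value would then go into the workload's manifest or infrastructure-as-code definition and be watched after rollout, as described above.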

Relevant Documentation

Amazon EKS Documentation
Kubernetes Resource Management
