Node pools provisioned with large or specialized VMs (e.g., high-memory, GPU-enabled, or compute-optimized) can be significantly overprovisioned relative to actual pod requirements. If workloads consistently leave a large share of node resources unrequested (e.g., a low ratio of CPU or memory requests to node capacity), the organization pays for compute it does not use. This often happens in early cluster design phases, after application demand shifts, or when teams size node pools for peak usage without enabling autoscaling.
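The request-to-capacity ratio mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not a GKE API: the `NodePool` record, the example figures, and the 0.4 threshold are all assumptions, and in practice the inputs would come from the cluster's own metrics (summed pod requests and node allocatable capacity).

```python
from dataclasses import dataclass


@dataclass
class NodePool:
    """Hypothetical summary of one node pool; not a GKE API object."""
    name: str
    allocatable_cpu: float       # total allocatable vCPUs across all nodes
    allocatable_mem_gib: float   # total allocatable memory across all nodes
    requested_cpu: float         # sum of CPU requests of pods scheduled on the pool
    requested_mem_gib: float     # sum of memory requests of pods scheduled on the pool


def request_ratios(pool: NodePool) -> tuple[float, float]:
    """Return (cpu_ratio, mem_ratio): requests divided by allocatable capacity."""
    cpu = pool.requested_cpu / pool.allocatable_cpu if pool.allocatable_cpu else 0.0
    mem = (
        pool.requested_mem_gib / pool.allocatable_mem_gib
        if pool.allocatable_mem_gib
        else 0.0
    )
    return cpu, mem


def is_overprovisioned(pool: NodePool, threshold: float = 0.4) -> bool:
    """Flag a pool only when BOTH ratios are below the (assumed) threshold.

    Requiring both avoids flagging a pool that is memory-bound but CPU-light,
    or the reverse, since such a pool may be sized correctly for its
    constrained resource.
    """
    cpu, mem = request_ratios(pool)
    return max(cpu, mem) < threshold


if __name__ == "__main__":
    # Illustrative numbers only.
    pools = [
        NodePool("gpu-pool", 64, 256, 12, 40),
        NodePool("web-pool", 16, 64, 12, 20),
    ]
    for p in pools:
        cpu, mem = request_ratios(p)
        print(f"{p.name}: cpu={cpu:.2f} mem={mem:.2f} flagged={is_overprovisioned(p)}")
```

A single snapshot can mislead, so a check like this is more useful when the ratios are averaged over a period long enough to cover normal demand peaks.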
Billed based on:
* Underlying Compute Engine VMs in the node pool (vCPU, memory, and attached storage)
* GPU usage (if applicable)
* Operating system licensing (for premium OS options)
The control plane is billed through a per-cluster management fee in both Standard and Autopilot modes; the GKE free tier credit can offset that fee for roughly one zonal or Autopilot cluster per billing account.
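Because node pool charges follow provisioned vCPU and memory rather than what pods request, the cost of unrequested capacity can be approximated with simple arithmetic. The sketch below is a rough estimate under stated assumptions: the function name is hypothetical, the hourly rates are placeholders to be replaced with the current prices for the machine family and region, and GPU, disk, and OS licensing charges are left out.

```python
HOURS_PER_MONTH = 730  # common approximation for an average month


def monthly_unused_cost(
    allocatable_vcpu: float,
    allocatable_mem_gib: float,
    requested_vcpu: float,
    requested_mem_gib: float,
    vcpu_hourly_rate: float,     # placeholder: price per vCPU-hour
    mem_gib_hourly_rate: float,  # placeholder: price per GiB-hour
    hours: float = HOURS_PER_MONTH,
) -> float:
    """Estimate spend on capacity that no pod has requested.

    Unused capacity is clamped at zero so an oversubscribed pool does not
    produce a negative cost.
    """
    unused_vcpu = max(allocatable_vcpu - requested_vcpu, 0.0)
    unused_mem = max(allocatable_mem_gib - requested_mem_gib, 0.0)
    hourly = unused_vcpu * vcpu_hourly_rate + unused_mem * mem_gib_hourly_rate
    return hourly * hours


if __name__ == "__main__":
    # Rates here are illustrative placeholders, not published prices.
    estimate = monthly_unused_cost(64, 256, 12, 40, 0.03, 0.004)
    print(f"Estimated monthly spend on unrequested capacity: {estimate:.2f}")
```

The result is an upper bound on recoverable spend: some headroom is needed for system pods, bursts, and rolling updates, so not all unrequested capacity can be removed.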
https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler
https://cloud.google.com/kubernetes-engine/docs/how-to/resizing-node-pools
https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-architecture