Oversized Worker or Driver Nodes in Databricks Clusters
Matt Weingarten
Service Category
Compute
Cloud Provider
Databricks
Service Name
Databricks Clusters
Inefficiency Type
Overprovisioned Resource
Explanation

Databricks users can select from a wide range of instance types for cluster driver and worker nodes. Without guardrails, teams may choose high-cost configurations (e.g., 16xlarge nodes) that exceed workload requirements. This results in inflated costs with little performance benefit. To reduce this risk, administrators can use compute policies to define acceptable node types and enforce size limits across the workspace.
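
As a rough illustration, the sketch below creates such a policy with the databricks-sdk Python package. The policy name and the allowlisted node types are assumptions to adapt to your own standards; the attribute names follow the cluster policy definition format.

```python
# Sketch: create a cluster policy that restricts driver and worker node types.
# Assumes the databricks-sdk package and workspace admin rights; the policy
# name and the allowlists below are illustrative placeholders.
import json

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # reads auth from the environment / .databrickscfg

policy_definition = {
    # Restrict worker nodes to a small set of approved types.
    "node_type_id": {
        "type": "allowlist",
        "values": ["i3.xlarge", "i3.2xlarge", "i3.4xlarge"],
        "defaultValue": "i3.xlarge",
    },
    # Restrict driver nodes the same way.
    "driver_node_type_id": {
        "type": "allowlist",
        "values": ["i3.xlarge", "i3.2xlarge"],
        "defaultValue": "i3.xlarge",
    },
}

policy = w.cluster_policies.create(
    name="approved-node-sizes",  # hypothetical policy name
    definition=json.dumps(policy_definition),
)
print(f"Created policy {policy.policy_id}")
```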

Relevant Billing Model

Databricks costs are driven by:

  • Databricks Units (DBUs): Consumed per hour of node uptime, at a rate determined by the node type and the compute type (e.g., All-Purpose vs. Jobs compute)
  • Cloud Infrastructure Charges: Cost of the underlying VMs, billed separately by the cloud provider, typically per second

Larger node types (e.g., high-memory or high-I/O VMs) incur significantly higher charges on both dimensions. Oversizing clusters without justification therefore leads to unnecessary DBU and infrastructure spend.
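
A quick back-of-the-envelope comparison shows how the two charges compound. Every rate below is a hypothetical placeholder, not a published Databricks or cloud price; substitute your own DBU rate, per-node DBU consumption, and VM prices.

```python
# Back-of-the-envelope comparison of an oversized vs. a right-sized worker.
# All figures are hypothetical placeholders for illustration only.

DBU_RATE_USD = 0.15  # $ per DBU (varies by compute type and plan)

def hourly_cost(vm_price_usd: float, dbus_per_hour: float) -> float:
    """Combined infrastructure + DBU cost for one node for one hour."""
    return vm_price_usd + dbus_per_hour * DBU_RATE_USD

# Hypothetical figures: a 16xlarge node vs. a 2xlarge node.
big = hourly_cost(vm_price_usd=4.00, dbus_per_hour=10.0)   # oversized
small = hourly_cost(vm_price_usd=0.50, dbus_per_hour=1.5)  # right-sized

# 8 workers running 10 hours/day, 22 days/month.
node_hours = 8 * 10 * 22
print(f"Oversized:   ${big * node_hours:,.2f}/month")    # ~$9,680
print(f"Right-sized: ${small * node_hours:,.2f}/month")  # ~$1,276
```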

Detection
  • Review all cluster configurations to identify usage of large or high-cost instance types
  • Query system tables for driver and worker node types across clusters (see the query sketch after this list)
  • Check whether compute policies are in place to limit allowable node sizes
  • Engage with workload owners to confirm whether large instances are justified based on workload characteristics
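
For the system-table check, a minimal sketch is shown below. It assumes Unity Catalog system tables are enabled and that it runs in a Databricks notebook where `spark` and `display` are available; the node-size pattern is an illustrative heuristic, not an official rule.

```python
# Sketch: surface clusters whose driver or worker node type looks oversized.
# system.compute.clusters keeps one row per configuration change, so we take
# the latest record per cluster before filtering on node type.
oversized = spark.sql("""
    WITH latest AS (
      SELECT *,
             ROW_NUMBER() OVER (PARTITION BY cluster_id
                                ORDER BY change_time DESC) AS rn
      FROM system.compute.clusters
      WHERE delete_time IS NULL          -- skip deleted clusters
    )
    SELECT cluster_id,
           cluster_name,
           owned_by,
           driver_node_type,
           worker_node_type
    FROM latest
    WHERE rn = 1
      AND (driver_node_type RLIKE '(8|16)xlarge'   -- illustrative size filter
           OR worker_node_type RLIKE '(8|16)xlarge')
""")
display(oversized)
```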
Remediation
  • Define and enforce compute policies that restrict driver and worker node types to appropriate sizes
  • Reconfigure existing clusters using oversized nodes to use smaller, cost-effective alternatives (see the resize sketch after this list)
  • Allow exceptions only for workloads that demonstrably require high-performance nodes
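
For the reconfiguration step, a minimal sketch with the databricks-sdk follows. The cluster ID and target node types are placeholders; `clusters.edit` replaces the full cluster spec, so the sketch copies key fields from the current configuration and changes only the node types (editing a running cluster restarts it).

```python
# Sketch: downsize an existing cluster's node types via the databricks-sdk.
# Cluster ID and target node types below are hypothetical placeholders.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

cluster_id = "0101-123456-abcdef12"  # hypothetical cluster ID
current = w.clusters.get(cluster_id)

kwargs = dict(
    cluster_id=cluster_id,
    cluster_name=current.cluster_name,
    spark_version=current.spark_version,
    node_type_id="i3.xlarge",         # right-sized worker type (assumption)
    driver_node_type_id="i3.xlarge",  # right-sized driver type (assumption)
    autotermination_minutes=current.autotermination_minutes,
)
if current.autoscale:                 # preserve autoscaling if configured
    kwargs["autoscale"] = current.autoscale
else:
    kwargs["num_workers"] = current.num_workers

w.clusters.edit(**kwargs)
```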