The Efficiency Hub

Inefficient Use of Photon Engine in Databricks Compute

Compute

Cloud Provider: Databricks

Service Name: Databricks Clusters

Inefficiency Type: Inefficient Configuration

Photon is enabled by default on many Databricks compute configurations. While it can accelerate certain SQL and DataFrame operations, its performance benefits are workload-specific and may not justify the increased DBU cost. Many pipelines, particularly ETL jobs or simpler Spark workloads, do not benefit materially from Photon but still incur the higher DBU multiplier. Disabling Photon by default and allowing it only where proven beneficial can reduce cost without degrading performance.
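
Whether Photon pays for itself comes down to arithmetic: the runtime reduction has to outweigh the higher DBU rate. The sketch below compares the two costs for a single job. Every input (runtimes, DBU consumption, prices, and the Photon DBU multiplier) is a hypothetical value chosen for illustration; real rates depend on the cloud, plan, and compute type.

```python
def job_cost(hours, dbu_per_hour, dbu_price, infra_per_hour, dbu_multiplier=1.0):
    """Total cost of one run: Databricks DBU charges plus cloud infrastructure."""
    dbu_cost = hours * dbu_per_hour * dbu_multiplier * dbu_price
    infra_cost = hours * infra_per_hour
    return dbu_cost + infra_cost


def photon_saves_money(standard_hours, photon_hours, dbu_per_hour,
                       dbu_price, infra_per_hour, photon_multiplier):
    """True if the Photon run is cheaper than the standard-engine run."""
    standard = job_cost(standard_hours, dbu_per_hour, dbu_price, infra_per_hour)
    photon = job_cost(photon_hours, dbu_per_hour, dbu_price, infra_per_hour,
                      dbu_multiplier=photon_multiplier)
    return photon < standard


# Hypothetical inputs: 10 DBU/hour, $0.15 per DBU, $2.00/hour of instances,
# and an assumed 2x DBU multiplier with Photon enabled.
rates = dict(dbu_per_hour=10, dbu_price=0.15, infra_per_hour=2.0,
             photon_multiplier=2.0)

# Under these assumptions a 10% speedup does not cover the multiplier,
# while a 60% speedup does.
small_speedup = photon_saves_money(standard_hours=2.0, photon_hours=1.8, **rates)
large_speedup = photon_saves_money(standard_hours=2.0, photon_hours=0.8, **rates)
```

Running the same comparison with measured runtimes from a benchmark of each pipeline gives a concrete basis for deciding where Photon stays enabled.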

Missing Auto-Termination Policy for Databricks Clusters

Compute

Cloud Provider: Databricks

Service Name: Databricks Clusters

Inefficiency Type: Missing Safeguard

In many environments, users launch Databricks clusters for development or analysis and forget to shut them down after use. When no auto-termination policy is configured, these clusters remain active indefinitely, incurring unnecessary charges for both Databricks and cloud infrastructure usage. This inefficiency is especially common in interactive clusters that are user-managed, ephemeral, or exploratory in nature. While Databricks provides built-in support for cluster auto-termination, teams often overlook it unless it's enforced through workspace policies. Without this safeguard in place, idle clusters can persist unnoticed for hours or days, leading to avoidable cost.
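
One way to surface this gap is to scan cluster definitions for a missing or disabled auto-termination setting. The sketch below assumes each cluster is a dictionary shaped like an entry returned by the Databricks Clusters API, with `cluster_name`, `cluster_source`, and `autotermination_minutes` fields, where a value of 0 or an absent field means auto-termination is off. The sample data is made up, and fetching the real list is left out.

```python
def clusters_without_autotermination(clusters):
    """Return names of clusters whose auto-termination is unset or disabled (0)."""
    flagged = []
    for cluster in clusters:
        # Assumption: job clusters are skipped because they terminate
        # when their run finishes.
        if cluster.get("cluster_source") == "JOB":
            continue
        if not cluster.get("autotermination_minutes"):
            flagged.append(cluster["cluster_name"])
    return flagged


# Hypothetical sample data for illustration only.
sample_clusters = [
    {"cluster_name": "adhoc-analysis", "cluster_source": "UI",
     "autotermination_minutes": 0},
    {"cluster_name": "team-notebooks", "cluster_source": "UI",
     "autotermination_minutes": 45},
    {"cluster_name": "dev-sandbox", "cluster_source": "API"},
    {"cluster_name": "job-1234-run-1", "cluster_source": "JOB"},
]

flagged = clusters_without_autotermination(sample_clusters)
```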

Lack of Graviton Usage in Databricks Clusters

Compute

Cloud Provider: Databricks

Service Name: Databricks Clusters

Inefficiency Type: Suboptimal Instance Selection

Databricks supports AWS Graviton-based instances for most workloads, including Spark jobs, data engineering pipelines, and interactive notebooks. These instances offer significant cost advantages over traditional x86-based instances, with comparable or better performance in many cases. When teams default to legacy instance types, they miss an easy opportunity to reduce compute spend. Unless workloads have known compatibility issues or specialized requirements, Graviton should be the default instance family used in Databricks clusters.
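
A quick audit can flag node types that are not Graviton-based. The sketch below relies on an assumed naming heuristic: AWS Graviton families carry a `g` directly after the generation number (for example `m6g`, `r7g`, `c6gd`). This is a convention, not an official API, and it does not check which of those families a given Databricks workspace actually offers.

```python
import re

# Heuristic (assumption): family letters, a generation number, then "g",
# optionally followed by further suffix letters, then the size after a dot.
_GRAVITON_PATTERN = re.compile(r"^[a-z]+\d+g[a-z]*\.")


def is_graviton(node_type_id):
    """Return True if the AWS instance type name looks Graviton-based."""
    return bool(_GRAVITON_PATTERN.match(node_type_id))


def non_graviton_node_types(node_type_ids):
    """Return the node types that do not match the Graviton naming pattern."""
    return [node for node in node_type_ids if not is_graviton(node)]


# Hypothetical node types taken from cluster configurations.
in_use = ["m5.xlarge", "m6gd.xlarge", "r7g.2xlarge", "g4dn.xlarge", "i3.2xlarge"]
candidates_for_migration = non_graviton_node_types(in_use)
```

Note that GPU families such as `g4dn` are not matched, since the `g` there is the family letter and not the processor suffix.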

Suboptimal Use of On-Demand Instances in Non-Production Clusters

Compute

Cloud Provider: Databricks

Service Name: Databricks Clusters

Inefficiency Type: Suboptimal Pricing Model

In Databricks, on-demand instances provide reliable capacity but come at a premium cost. For non-production workloads, such as development, testing, or exploratory analysis, high availability is often unnecessary. Spot instances offer the same instance types at a lower price, with the tradeoff of occasional interruptions. If teams default to on-demand usage in lower environments, they may be incurring unnecessary compute costs. Compute policies that limit on-demand usage in these environments make the cheaper pricing model the consistent default.
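
As an illustration, a compute policy definition for non-production use on AWS might pin the availability setting to spot. The sketch below builds such a definition as a Python dictionary and serializes it to JSON. The attribute paths and the `fixed` rule type follow the Databricks compute policy format as understood here; keeping one on-demand node for the driver is an assumed design choice, and the values would differ on Azure or GCP.

```python
import json

# Sketch of a policy definition for non-production clusters (AWS only).
non_prod_spot_policy = {
    # Force spot capacity, falling back to on-demand if spot is unavailable.
    "aws_attributes.availability": {
        "type": "fixed",
        "value": "SPOT_WITH_FALLBACK",
    },
    # Assumption: keep the first node (the driver) on-demand so an
    # interruption does not take down the whole cluster.
    "aws_attributes.first_on_demand": {
        "type": "fixed",
        "value": 1,
    },
}

# JSON text in the form a policy definition is normally supplied.
policy_json = json.dumps(non_prod_spot_policy, indent=2)
```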

Oversized Worker or Driver Nodes in Databricks Clusters

Compute

Cloud Provider: Databricks

Service Name: Databricks Clusters

Inefficiency Type: Overprovisioned Resource

Databricks users can select from a wide range of instance types for cluster driver and worker nodes. Without guardrails, teams may choose high-cost configurations (e.g., 16xlarge nodes) that exceed workload requirements. This results in inflated costs with little performance benefit. To reduce this risk, administrators can use compute policies to define acceptable node types and enforce size limits across the workspace.
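
A node-size guardrail of this kind can be expressed as an allowlist in a compute policy definition. The sketch below shows one possible definition plus a small local check that mimics the rule. The instance types are hypothetical choices, and the `violations` helper is illustrative only: Databricks enforces policies itself when a cluster is created or edited.

```python
# Hypothetical set of permitted instance types.
ALLOWED_NODE_TYPES = ["m6gd.large", "m6gd.xlarge", "m6gd.2xlarge"]

node_size_policy = {
    "node_type_id": {
        "type": "allowlist",
        "values": ALLOWED_NODE_TYPES,
        "defaultValue": "m6gd.xlarge",
    },
    "driver_node_type_id": {
        "type": "allowlist",
        "values": ALLOWED_NODE_TYPES,
        "defaultValue": "m6gd.xlarge",
    },
}


def violations(cluster_spec, policy):
    """Return attribute names whose requested value is outside the allowlist."""
    bad = []
    for attribute, rule in policy.items():
        if rule["type"] != "allowlist" or attribute not in cluster_spec:
            continue
        if cluster_spec[attribute] not in rule["values"]:
            bad.append(attribute)
    return bad


# A request for oversized workers is flagged; the driver choice is fine.
requested = {"node_type_id": "m6gd.16xlarge", "driver_node_type_id": "m6gd.large"}
rejected_attributes = violations(requested, node_size_policy)
```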

Inefficient Autotermination Configuration for Interactive Clusters

Compute

Cloud Provider: Databricks

Service Name: Databricks Clusters

Inefficiency Type: Misconfiguration

Interactive clusters are often left running between periods of active use. To mitigate idle charges, Databricks provides an “autotermination” setting that shuts down a cluster after a set period of inactivity. However, if the termination period is set too high, or if policies do not enforce a reasonable threshold, idle clusters can persist for long durations without performing any work, resulting in wasted compute spend. Lowering the termination window reduces exposure to idle time while preserving user flexibility.
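
The effect of the window is easy to quantify: each time a user walks away, the cluster can keep billing for up to the full window. The sketch below pairs a policy rule that caps the setting with a small worst-case estimate. The 60-minute cap, 30-minute default, and $12.00 hourly cluster cost are hypothetical values, and the `range` rule follows the Databricks compute policy format as understood here.

```python
# Sketch of a policy rule capping the auto-termination window.
autotermination_policy = {
    "autotermination_minutes": {
        "type": "range",
        "minValue": 10,      # also rules out 0, which disables auto-termination
        "maxValue": 60,      # hypothetical upper bound
        "defaultValue": 30,  # hypothetical default
    }
}


def idle_cost_per_session(window_minutes, hourly_cluster_cost):
    """Upper bound on spend between last activity and auto-termination."""
    return window_minutes / 60 * hourly_cluster_cost


# Hypothetical cluster costing $12.00 per hour (DBUs plus instances).
cost_at_120_minutes = idle_cost_per_session(120, 12.0)
cost_at_30_minutes = idle_cost_per_session(30, 12.0)
```

Under these assumptions, moving from a 120-minute to a 30-minute window cuts the worst-case idle spend per session from $24.00 to $6.00.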

Inefficient Use of Interactive Clusters

Compute

Cloud Provider: Databricks

Service Name: Databricks Clusters

Inefficiency Type: Misconfiguration

Interactive clusters are intended for development and ad-hoc analysis, and they remain active until terminated manually or by an auto-termination setting. When used to run scheduled jobs or production workflows, they often sit idle between executions, incurring unnecessary infrastructure and DBU costs. Job clusters are designed for ephemeral, single-job execution: they are created for a job run and terminate automatically when it completes, which limits billed runtime and isolates workloads. Moving production jobs off interactive clusters therefore lowers cost and gives clearer workload boundaries.
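
Jobs that run on interactive clusters can be spotted in their definitions. The sketch below assumes job settings shaped like the Databricks Jobs API, where a task references an interactive cluster through `existing_cluster_id` and a job cluster through `new_cluster` or `job_cluster_key`. The sample job and its identifiers are made up for illustration.

```python
def tasks_on_interactive_clusters(job_settings):
    """Return the keys of tasks pinned to an existing (interactive) cluster."""
    return [
        task["task_key"]
        for task in job_settings.get("tasks", [])
        if "existing_cluster_id" in task
    ]


# Hypothetical job definition for illustration only.
sample_job = {
    "name": "nightly-load",
    "tasks": [
        {"task_key": "extract", "existing_cluster_id": "0000-000000-example"},
        {"task_key": "transform", "job_cluster_key": "shared_job_cluster"},
        {"task_key": "publish", "new_cluster": {"num_workers": 2}},
    ],
}

flagged_tasks = tasks_on_interactive_clusters(sample_job)
```

Only the `extract` task is flagged here; the other two already run on job clusters.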
