Databricks cost optimization begins with visibility. Unlike traditional IaaS services, Databricks operates as an orchestration layer spanning compute, storage, and execution, but its billing data often lacks granularity by workload, job, or team. This creates a visibility gap: costs fluctuate without clear root causes, ownership is unclear, and optimization efforts stall for lack of actionable insight. When costs are not attributed by function, for example to orchestration (query and job DBUs), compute (cloud VMs), storage, or data transfer, it becomes difficult to pinpoint what is driving spend or where improvements can be made. As a result, inefficiencies persist not because of a single misconfiguration, but because the system lacks the structure to surface them.
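One way to picture functional attribution is a small roll-up that sums spend per team and per function, and reports how much spend cannot be attributed at all. The sketch below is illustrative only: the record shape (`team`, `category`, `cost_usd`), the category names, and the sample values are assumptions made for this example, not the schema of any Databricks billing export.

```python
from collections import defaultdict

# Functional categories named in the text; the identifiers are assumptions.
CATEGORIES = {"orchestration", "compute", "storage", "data_transfer"}


def attribute_costs(records):
    """Sum cost per (team, category); missing or unknown values become 'unattributed'."""
    totals = defaultdict(float)
    for rec in records:
        team = rec.get("team") or "unattributed"
        category = rec.get("category")
        if category not in CATEGORIES:
            category = "unattributed"
        totals[(team, category)] += rec["cost_usd"]
    return dict(totals)


def unattributed_share(totals):
    """Fraction of total spend that lacks an owner or a known functional category."""
    total = sum(totals.values())
    if total == 0:
        return 0.0
    gap = sum(
        cost
        for (team, category), cost in totals.items()
        if "unattributed" in (team, category)
    )
    return gap / total


# Hypothetical sample records, not real billing output.
sample = [
    {"team": "analytics", "category": "orchestration", "cost_usd": 120.0},
    {"team": "analytics", "category": "compute", "cost_usd": 80.0},
    {"team": None, "category": "compute", "cost_usd": 50.0},
    {"team": "ml", "category": "storage", "cost_usd": 30.0},
    {"team": "ml", "category": None, "cost_usd": 20.0},
]

totals = attribute_costs(sample)
share = unattributed_share(totals)
print(f"Unattributed share of spend: {share:.1%}")
```

Tracking the unattributed share over time gives a rough measure of the visibility gap itself: if it stays high, the remaining spend has no owner who could act on it.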
CloudWatch log groups often persist long after their usefulness has expired. In some cases, they are associated with applications or resources that are no longer active. In other cases, the systems may still be running, but no team reviews, analyzes, or otherwise uses the log data. Whatever the reason, retaining logs that no one monitors or uses incurs unnecessary storage costs. If log data is not needed for operational visibility, debugging, compliance, or auditing, the log group should either be deleted or given a shorter retention policy.
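The review described above can be sketched as a small triage step followed by a retention update. This is a minimal sketch under stated assumptions: `logGroupName` and `retentionInDays` follow the shape returned by the CloudWatch Logs `DescribeLogGroups` API, while `last_event_ms` is a hypothetical field the caller would have to populate (for example from log stream metadata), and the 90-day idle threshold and 30-day retention are arbitrary illustrative values. The `client` argument is expected to behave like a boto3 `logs` client, whose `put_retention_policy` call takes `logGroupName` and `retentionInDays`.

```python
DAY_MS = 24 * 60 * 60 * 1000


def review_log_groups(groups, now_ms, max_idle_days=90):
    """Return (name, reason) pairs for groups that look idle or never expire."""
    findings = []
    for group in groups:
        name = group["logGroupName"]
        last = group.get("last_event_ms")  # hypothetical, supplied by the caller
        if last is not None and (now_ms - last) // DAY_MS > max_idle_days:
            findings.append((name, "idle"))
        elif group.get("retentionInDays") is None:
            # No retention setting means the data is kept indefinitely.
            findings.append((name, "no_retention"))
    return findings


def apply_retention(client, names, days=30):
    """Set a retention policy on each named group using a boto3-style logs client."""
    for name in names:
        client.put_retention_policy(logGroupName=name, retentionInDays=days)


# Hypothetical inventory for illustration.
now_ms = 1_700_000_000_000
inventory = [
    {"logGroupName": "/app/legacy", "last_event_ms": now_ms - 200 * DAY_MS},
    {"logGroupName": "/app/api", "last_event_ms": now_ms - 1 * DAY_MS},
    {"logGroupName": "/app/web", "last_event_ms": now_ms - 2 * DAY_MS,
     "retentionInDays": 30},
]

findings = review_log_groups(inventory, now_ms)
for name, reason in findings:
    print(name, reason)
```

Deletion is deliberately left out of the sketch because it is irreversible. Groups flagged as idle are better confirmed with their owners first, whereas applying a retention policy to groups that never expire is a lower-risk first step.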