When a Dataflow pipeline fails, often due to dependency issues, misconfiguration, or data format mismatches, its worker instances may remain active until the service terminates them. Misconfigured jobs, stuck retries, or delayed monitoring can keep these workers running for extended periods. Idle workers consume vCPU, memory, and storage without performing useful work, and the waste compounds in large or high-frequency batch environments, where repeated failures can leave many orphaned workers running concurrently.
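One way to catch this pattern early is to scan for jobs that have stayed in an active state longer than expected. The sketch below lists active Dataflow jobs through the REST API and flags any older than a cutoff; it assumes the `google-api-python-client` package and Application Default Credentials, and the `PROJECT_ID`, `REGION`, and `MAX_AGE` values are hypothetical placeholders to adjust for your environment.

```python
# A minimal detection sketch, assuming google-api-python-client and
# Application Default Credentials; PROJECT_ID, REGION, and MAX_AGE are
# hypothetical values.
from datetime import datetime, timedelta, timezone

from googleapiclient.discovery import build

PROJECT_ID = "my-project"     # hypothetical project
REGION = "us-central1"        # hypothetical region
MAX_AGE = timedelta(hours=6)  # flag jobs running longer than this

dataflow = build("dataflow", "v1b3")
response = (
    dataflow.projects()
    .locations()
    .jobs()
    .list(projectId=PROJECT_ID, location=REGION, filter="ACTIVE")
    .execute()
)

now = datetime.now(timezone.utc)
for job in response.get("jobs", []):
    # createTime is an RFC 3339 timestamp; truncate sub-second precision
    # before parsing so fractional digits do not break strptime.
    created = datetime.strptime(
        job["createTime"][:19], "%Y-%m-%dT%H:%M:%S"
    ).replace(tzinfo=timezone.utc)
    age = now - created
    if age > MAX_AGE:
        print(
            f"Job {job['id']} ({job['name']}) has been in state "
            f"{job['currentState']} for {age}; check for stuck workers."
        )
```

Running this on a schedule (for example, from Cloud Scheduler) turns a silent cost leak into an alert you can act on.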
Dataflow charges for the compute time of active workers as well as for associated resources such as persistent disks and networking. If a pipeline failure prevents graceful shutdown or cleanup, these workers continue to incur charges even though no processing occurs.
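Once a stuck job is identified, cancelling it through the same API releases its workers and stops the associated charges. The sketch below works under the same assumptions as the one above, and `JOB_ID` is a hypothetical identifier for a job flagged by the scan; note that for streaming jobs with in-flight data, draining is the gentler option.

```python
# A minimal cleanup sketch under the same assumptions; JOB_ID is a
# hypothetical identifier for a job flagged by the scan above.
from googleapiclient.discovery import build

PROJECT_ID = "my-project"  # hypothetical project
REGION = "us-central1"     # hypothetical region
JOB_ID = "stuck-job-id"    # hypothetical job to terminate

dataflow = build("dataflow", "v1b3")
dataflow.projects().locations().jobs().update(
    projectId=PROJECT_ID,
    location=REGION,
    jobId=JOB_ID,
    # JOB_STATE_CANCELLED tears workers down immediately; for streaming
    # jobs that should flush in-flight data, request JOB_STATE_DRAINED.
    body={"requestedState": "JOB_STATE_CANCELLED"},
).execute()
```

For a batch job that has already failed, immediate cancellation is usually the right trade-off, since there is no useful in-flight work to preserve.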