By default, AWS Config can be set to record changes across all supported resource types, including those that change frequently, such as security group rules, IAM role policies, route tables, and network interfaces, all of which tend to be ephemeral in containerized or auto-scaling setups. These high-churn resources can generate an outsized number of configuration items and inflate costs, especially in dynamic or large-scale environments.
This inefficiency arises when recording is enabled indiscriminately across all resources without evaluating whether the data is necessary. Without targeted scoping, teams may incur large charges for configuration data that provides minimal value, especially in non-production environments. The resulting noise can also obscure meaningful compliance signals.
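As an illustration of targeted scoping, the boto3 sketch below restricts the AWS Config recorder to an explicit allowlist of resource types instead of recording everything; the recorder name, role ARN, and the specific resource types are placeholder assumptions.

```python
import boto3

config = boto3.client("config")

# Record only an explicit allowlist of resource types instead of
# everything AWS Config supports (allSupported=True).
config.put_configuration_recorder(
    ConfigurationRecorder={
        "name": "default",  # assumed recorder name
        "roleARN": "arn:aws:iam::123456789012:role/config-recorder",  # placeholder
        "recordingGroup": {
            "allSupported": False,
            "includeGlobalResourceTypes": False,
            # High-value, low-churn types; note the absence of types like
            # AWS::EC2::NetworkInterface that churn in auto-scaling setups.
            "resourceTypes": [
                "AWS::S3::Bucket",
                "AWS::RDS::DBInstance",
                "AWS::EC2::VPC",
            ],
        },
    }
)
```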
VPC Flow Logs configured with the ALL filter and delivered to CloudWatch Logs often result in unnecessarily high log ingestion volumes — especially in high-traffic environments. This setup is rarely required for day-to-day monitoring or security use cases but is commonly enabled by default or for temporary debugging and then left in place. As a result, teams incur excessive CloudWatch charges without realizing the logging configuration is misaligned with actual needs.
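A common remediation is to narrow the capture filter and deliver to S3 instead of CloudWatch Logs, which avoids the per-GB CloudWatch ingestion charge for flow log data. The boto3 sketch below does both for a single VPC; the VPC ID and bucket ARN are placeholders.

```python
import boto3

ec2 = boto3.client("ec2")

# Capture only rejected traffic (instead of ALL) and deliver to S3,
# avoiding CloudWatch Logs ingestion charges for flow log data.
ec2.create_flow_logs(
    ResourceIds=["vpc-0abc1234def567890"],  # placeholder VPC ID
    ResourceType="VPC",
    TrafficType="REJECT",                   # ALL is rarely needed day-to-day
    LogDestinationType="s3",
    LogDestination="arn:aws:s3:::my-flow-logs-bucket",  # placeholder bucket
)
```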
Teams often overuse Microsoft-hosted agents by running redundant or low-value jobs, failing to configure pipelines efficiently, or neglecting to use self-hosted agents for steady workloads. These inefficiencies result in unnecessary cost and delivery friction, especially when jobs queue behind limited agent availability.
By default, all Log Analytics tables are created under the Analytics plan, which is optimized for high-performance querying and interactive analysis. However, not all telemetry requires real-time access or frequent querying. Some tables may serve audit, archival, or compliance use cases where querying is rare or unnecessary. Leaving such tables on the Analytics plan results in unnecessary spend—especially when ingestion volumes are high or the table receives data from verbose sources (e.g., diagnostic logs, platform metrics).
Azure now allows users to assign different pricing plans at the table level, including the Basic plan, which offers significantly lower ingestion costs at the expense of reduced query functionality. This provides a valuable opportunity to align cost with access patterns by assigning less expensive plans to tables that are retained for record-keeping or compliance, rather than analysis.
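As a sketch of the table-level change, the azure-mgmt-loganalytics SDK can move a single table to the Basic plan; the subscription ID, resource group, workspace, and table names below are assumptions, and not every table type is eligible for Basic.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.loganalytics import LogAnalyticsManagementClient
from azure.mgmt.loganalytics.models import Table

client = LogAnalyticsManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id="00000000-0000-0000-0000-000000000000",  # placeholder
)

# Move a verbose, rarely queried table to the Basic plan; other tables
# in the workspace stay on the default Analytics plan.
client.tables.begin_create_or_update(
    resource_group_name="rg-observability",  # assumed
    workspace_name="law-prod",               # assumed
    table_name="ContainerLogV2",             # a common high-volume table
    parameters=Table(plan="Basic"),
).result()
```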
While high-frequency alerting is sometimes justified for production SLAs, it's often overused across non-critical alerts or replicated blindly across environments. Projects with multiple environments (e.g., dev, QA, staging, prod) often duplicate alert rules without adjusting for business impact, which can lead to alert sprawl and inflated monitoring costs.
In large-scale environments, reducing the frequency of non-critical alerts—especially in lower environments—can yield significant savings. Teams often overlook this lever because alert configuration is considered part of operational hygiene rather than cost control. Tuning alert frequencies based on SLA requirements and actual urgency is a low-friction optimization opportunity that does not compromise observability when implemented thoughtfully.
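Because the decision is mechanical, it can live in provisioning code. The tool-agnostic Python sketch below maps environment criticality to an alert evaluation cadence using ISO 8601 durations (the format Azure Monitor scheduled query rules use); the environment names and durations are illustrative assumptions.

```python
# Map environment criticality to alert evaluation frequency.
# "PT1M" = every minute, "PT1H" = hourly (ISO 8601 durations).
EVALUATION_FREQUENCY = {
    "prod":    "PT1M",   # tight SLA: evaluate every minute
    "staging": "PT15M",
    "qa":      "PT30M",
    "dev":     "PT1H",   # non-critical: hourly is usually enough
}

def frequency_for(environment: str) -> str:
    """Return the evaluation cadence for an environment, defaulting
    to the least frequent (cheapest) option for unknown environments."""
    return EVALUATION_FREQUENCY.get(environment, "PT1H")

# Alert rules templated across environments inherit the right cadence
# instead of blindly copying production's.
for env in ("prod", "dev"):
    print(env, frequency_for(env))
```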
By default, EventBridge includes retry mechanisms for delivery failures, particularly when targets like Lambda functions or Step Functions fail to process an event. However, if these retry policies are disabled or misconfigured, EventBridge may treat failed deliveries as successful, prompting upstream services to republish the same event multiple times in response to undelivered outcomes. This leads to:

* Duplicate event publishing and delivery
* Unnecessary compute triggered by repeated events
* Increased EventBridge, downstream service, and data transfer costs

This behavior is especially problematic in systems where idempotency is not strictly enforced and retries are managed externally by upstream services.
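A sketch of keeping retries enabled but explicitly bounded, with exhausted events routed to a dead-letter queue rather than republished upstream; the rule name, Lambda ARN, and SQS ARN below are placeholders.

```python
import boto3

events = boto3.client("events")

# Bound retries deliberately instead of disabling them, and send
# events that exhaust their retries to a DLQ for inspection.
events.put_targets(
    Rule="order-events",  # placeholder rule name
    Targets=[{
        "Id": "order-processor",
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:process-order",
        "RetryPolicy": {
            "MaximumRetryAttempts": 3,
            "MaximumEventAgeInSeconds": 3600,  # give up after 1 hour
        },
        "DeadLetterConfig": {
            "Arn": "arn:aws:sqs:us-east-1:123456789012:order-events-dlq",
        },
    }],
)
```

With a DLQ in place, upstream services have no reason to republish: failures surface in one queue instead of as duplicate traffic.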
By default, CloudWatch Log Groups use the Standard log class, which applies higher rates for both ingestion and storage. AWS also offers an Infrequent Access (IA) log class designed for logs that are rarely queried, such as audit trails, debugging output, or compliance records. Many teams assume storage is the dominant cost driver in CloudWatch, but in high-volume environments, ingestion can account for the majority of spend. Sending rarely accessed logs into the Standard class therefore pays a premium without any observability benefit. The IA log class offers significantly reduced rates for ingestion and storage, making it a better fit for logs used primarily for post-incident review, compliance retention, or ad hoc forensic analysis.
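The log class is chosen when the group is created and cannot be changed afterwards, so new audit or compliance streams should be pointed at an IA group up front. A minimal boto3 sketch (the group name is a placeholder):

```python
import boto3

logs = boto3.client("logs")

# Create the group in the Infrequent Access class; the class is fixed
# at creation time (the default is STANDARD).
logs.create_log_group(
    logGroupName="/audit/cloudtrail-archive",  # placeholder name
    logGroupClass="INFREQUENT_ACCESS",
)
```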
By default, Cloud Logging retains logs for 30 days. However, many organizations increase retention to 90 days, 365 days, or longer — even for non-critical logs such as debug-level messages, transient system logs, or audit logs in dev environments. This extended retention can lead to unnecessary costs, especially when:

* Logs are never queried after the first few days
* Observability tooling duplicates logs elsewhere (e.g., SIEM platforms)
* Retention settings are applied globally without considering log type or project criticality
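Retention is set per log bucket, and the google-cloud-logging client library can dial it back down. In the sketch below, the project path is a placeholder, `_Default` is the bucket most project logs route to, and the client import path is an assumption based on the v3 library layout.

```python
from google.cloud.logging_v2.services.config_service_v2 import ConfigServiceV2Client

client = ConfigServiceV2Client()

# Reset a dev project's default bucket to the 30-day baseline.
client.update_bucket(
    request={
        "name": "projects/my-dev-project/locations/global/buckets/_Default",
        "bucket": {"retention_days": 30},
        "update_mask": {"paths": ["retention_days"]},
    }
)
```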
Pub/Sub Lite is a cost-effective alternative to standard Pub/Sub, but it requires explicitly provisioning throughput capacity. When publish or subscribe throughput is overestimated, customers continue to pay for unused capacity — similar to idle virtual machines or overprovisioned IOPS. This inefficiency is often found in development environments or early-stage production workloads where traffic patterns are unpredictable or have since decreased.
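A sketch of trimming provisioned throughput with the Pub/Sub Lite admin client; the project number, region, topic, and capacity figures are assumptions, and the field names follow the published google-cloud-pubsublite samples.

```python
from google.cloud.pubsublite import AdminClient
from google.cloud.pubsublite.types import CloudRegion, TopicPath
from google.cloud.pubsublite_v1 import Topic
from google.protobuf.field_mask_pb2 import FieldMask

region = CloudRegion("us-central1")                       # assumed region
topic_path = TopicPath(123456789, region, "events-lite")  # placeholder project/topic

# Scale provisioned per-partition capacity down to match observed traffic.
topic = Topic(
    name=str(topic_path),
    partition_config=Topic.PartitionConfig(
        capacity=Topic.PartitionConfig.Capacity(
            publish_mib_per_sec=4,    # the per-partition minimum
            subscribe_mib_per_sec=8,
        )
    ),
)

client = AdminClient(region)
client.update_topic(topic, FieldMask(paths=["partition_config.capacity"]))
```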
Search Optimization can enable significant cost savings when selectively applied to workloads that heavily rely on point-lookup queries. By improving lookup efficiency, it allows smaller warehouses to satisfy performance SLAs, reducing credit consumption.
However, inefficiencies arise when:

* It is enabled on tables whose queries are not selective point lookups, so the search access path is maintained but rarely used
* The underlying table churns heavily (frequent inserts, updates, or deletes), driving up the serverless maintenance cost of the search access path
* Warehouses are never resized downward after enabling the feature, so the expected credit savings never materialize
Regular review of query patterns and warehouse sizing is essential to maximize the intended benefit of Search Optimization.
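A sketch of applying the feature narrowly and auditing what it costs, via snowflake-connector-python; the connection parameters, table, and column are placeholders. Scoping with ON EQUALITY(...) and querying INFORMATION_SCHEMA.SEARCH_OPTIMIZATION_HISTORY are standard Snowflake mechanisms.

```python
import snowflake.connector

# Placeholder connection parameters.
conn = snowflake.connector.connect(
    account="myaccount", user="me", password="...", database="ANALYTICS"
)
cur = conn.cursor()

# Scope Search Optimization to the column actually used in point
# lookups instead of enabling it for the whole table.
cur.execute("ALTER TABLE events ADD SEARCH OPTIMIZATION ON EQUALITY(user_id)")

# Audit the serverless maintenance credits the feature has consumed
# over the last 30 days.
cur.execute("""
    SELECT *
    FROM TABLE(INFORMATION_SCHEMA.SEARCH_OPTIMIZATION_HISTORY(
        DATE_RANGE_START => DATEADD('day', -30, CURRENT_DATE()),
        TABLE_NAME       => 'EVENTS'
    ))
""")
print(cur.fetchall())
```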