Unnecessary High-Resolution Custom Metrics Inflating API Call Costs
Taylor Houck
CER:
CER-0302

Service Category
Other
Cloud Provider
AWS
Service Name
AWS CloudWatch
Inefficiency Type
Inefficient Configuration
Explanation

Custom metrics published to CloudWatch can be configured at two resolutions: standard (60-second intervals) or high resolution (1-second intervals). While both resolutions are priced identically for metric storage, the critical cost difference lies in the volume of API calls required to publish the data. A metric published every second generates 60 times more API calls than one published every 60 seconds. At scale — across hundreds or thousands of custom metrics in a microservices architecture — this multiplier translates into substantial and avoidable API charges that accumulate month over month.
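
To make the configuration difference concrete, here is a minimal boto3 sketch of both publishing modes; the namespace and metric name are hypothetical, and StorageResolution is the only parameter that distinguishes the two (it defaults to 60 when omitted).

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Standard resolution (the default): CloudWatch stores one datapoint
# per 60-second interval. Omitting StorageResolution is equivalent
# to StorageResolution=60.
cloudwatch.put_metric_data(
    Namespace="MyApp/Checkout",          # hypothetical namespace
    MetricData=[{
        "MetricName": "OrderLatencyMs",  # hypothetical metric
        "Value": 182.0,
        "Unit": "Milliseconds",
        "StorageResolution": 60,
    }],
)

# High resolution: StorageResolution=1 allows 1-second datapoints.
# Publishing every second means ~60x the PutMetricData calls of a
# once-per-minute publisher unless values are batched.
cloudwatch.put_metric_data(
    Namespace="MyApp/Checkout",
    MetricData=[{
        "MetricName": "OrderLatencyMs",
        "Value": 95.0,
        "Unit": "Milliseconds",
        "StorageResolution": 1,
    }],
)
```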

This inefficiency commonly arises when teams default to high-resolution publishing without evaluating whether sub-minute granularity is actually needed for their monitoring use cases. Many workloads — including capacity planning, cost analysis, and non-critical service monitoring — function perfectly well with standard or even lower resolution. Compounding the issue, high-resolution metric data is only retained at its full 1-second granularity for three hours before being automatically aggregated to coarser intervals. Teams may therefore be paying a premium in API costs for resolution they cannot even query historically. Additionally, if alarms are configured to evaluate high-resolution metrics at sub-minute intervals, those alarms carry a higher per-alarm charge compared to standard-resolution alarms.

Relevant Billing Model

CloudWatch custom metric costs are driven by three independent billing dimensions:

  • Metric storage — Billed per unique metric per month (each unique combination of metric name, namespace, and dimensions counts as a separate metric). Storage pricing is tiered: the first 10,000 metrics at $0.30/metric/month, with volume discounts at higher tiers. High-resolution and standard-resolution metrics are priced identically for storage.
  • API calls — PutMetricData requests are charged at $0.01 per 1,000 requests after the first 1 million requests per month (free tier). Publishing at 1-second intervals generates 60 times more API calls than publishing at 60-second intervals, making publishing frequency the primary cost lever.
  • Alarm charges — Alarms configured to evaluate at high-resolution periods (every 10 or 30 seconds) are charged at $0.30 per alarm metric per month, compared to $0.10 per alarm metric per month for standard-resolution alarms. However, alarms on high-resolution metrics can still use standard 60-second evaluation periods at the lower rate.
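
As the last point notes, the alarm charge follows the evaluation period rather than the metric's storage resolution. A hedged boto3 sketch of the same high-resolution metric alarmed both ways, with hypothetical alarm names and thresholds:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Standard-resolution evaluation (Period=60) on a high-resolution
# metric: billed at the standard alarm rate ($0.10/alarm metric/month).
cloudwatch.put_metric_alarm(
    AlarmName="order-latency-p60",             # hypothetical alarm
    Namespace="MyApp/Checkout",                # hypothetical namespace
    MetricName="OrderLatencyMs",
    Statistic="Average",
    Period=60,
    EvaluationPeriods=3,
    Threshold=500.0,
    ComparisonOperator="GreaterThanThreshold",
)

# High-resolution evaluation (Period=10 or 30): billed at the higher
# rate ($0.30/alarm metric/month). Reserve for genuinely
# latency-sensitive alerting.
cloudwatch.put_metric_alarm(
    AlarmName="order-latency-p10",
    Namespace="MyApp/Checkout",
    MetricName="OrderLatencyMs",
    Statistic="Average",
    Period=10,
    EvaluationPeriods=6,
    Threshold=500.0,
    ComparisonOperator="GreaterThanThreshold",
)
```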

The free tier includes 10 custom metrics and 1 million API requests per month, but this is quickly exceeded in production environments with multiple services publishing metrics.
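
A back-of-the-envelope calculation, using the rates quoted above and a hypothetical fleet of 500 metrics each published as a single datapoint per call, illustrates how dominant publishing frequency is:

```python
# Monthly PutMetricData cost for a fleet of custom metrics, using the
# rates quoted above. All fleet numbers are hypothetical.
METRICS = 500                      # custom metrics in the fleet
SECONDS_PER_MONTH = 60 * 60 * 24 * 30
FREE_REQUESTS = 1_000_000          # free tier, per month
PRICE_PER_1K = 0.01                # USD per 1,000 PutMetricData requests

def monthly_api_cost(publish_interval_s: int) -> float:
    """Cost if each datapoint is sent as its own PutMetricData call."""
    requests = METRICS * SECONDS_PER_MONTH // publish_interval_s
    billable = max(0, requests - FREE_REQUESTS)
    return billable / 1000 * PRICE_PER_1K

print(f"high resolution (1s): ${monthly_api_cost(1):,.2f}")   # $12,950.00
print(f"standard (60s):       ${monthly_api_cost(60):,.2f}")  # $206.00
```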

Detection
  • Identify custom metrics being published at high resolution (1-second intervals) and assess whether sub-minute granularity is required for each metric's monitoring purpose (a detection sketch follows this list).
  • Review the volume of metric publishing API calls across accounts and regions to determine which namespaces or applications are generating the highest call volumes.
  • Evaluate whether any alarms or dashboards actually consume metric data at sub-minute granularity — if none do, the high-resolution publishing may be unnecessary.
  • Assess the dimension cardinality of high-resolution metrics, as each unique dimension combination creates a separate billable metric and multiplies the API call volume.
  • Confirm whether high-resolution metric data is being queried within its 3-hour full-resolution retention window, or whether teams are only viewing data after it has been aggregated to coarser intervals.
  • Examine alarm configurations to identify high-resolution alarms (evaluating at 10- or 30-second periods) that could function effectively with standard-resolution evaluation.
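
One way to implement the first and last checks above is sketched below. The alarm scan relies on the documented Period field of DescribeAlarms; the resolution probe is a heuristic, since ListMetrics does not report storage resolution, so it samples recent datapoints at a 1-second period inside the 3-hour full-resolution window instead.

```python
import datetime
import boto3

cloudwatch = boto3.client("cloudwatch")

def high_resolution_alarms():
    """Yield alarms evaluating at sub-minute periods (10s or 30s),
    which bill at the higher per-alarm rate."""
    paginator = cloudwatch.get_paginator("describe_alarms")
    for page in paginator.paginate():
        for alarm in page["MetricAlarms"]:
            # Metric-math alarms may lack Period; treat them as standard.
            if alarm.get("Period", 60) < 60:
                yield alarm["AlarmName"], alarm.get("MetricName"), alarm["Period"]

def looks_high_resolution(namespace, metric_name, dimensions):
    """Heuristic: within the 3-hour full-resolution retention window,
    a high-resolution metric can return more than one datapoint per
    minute at a 1-second query period; a standard metric cannot."""
    now = datetime.datetime.now(datetime.timezone.utc)
    stats = cloudwatch.get_metric_statistics(
        Namespace=namespace,
        MetricName=metric_name,
        Dimensions=dimensions,
        StartTime=now - datetime.timedelta(minutes=10),
        EndTime=now,
        Period=1,
        Statistics=["SampleCount"],
    )
    # More datapoints than minutes in the window implies sub-minute data.
    return len(stats["Datapoints"]) > 10

for name, metric, period in high_resolution_alarms():
    print(f"{name}: evaluates {metric} every {period}s")

# e.g. looks_high_resolution("MyApp/Checkout", "OrderLatencyMs", [])
```
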
Remediation
  • Default all new custom metrics to standard resolution (60-second intervals) and require explicit justification for high-resolution publishing, reserving it for metrics that genuinely need sub-minute alerting such as latency-sensitive SLOs.
  • Audit existing high-resolution metrics and downgrade those that are not consumed at sub-minute granularity by any alarm, dashboard, or operational workflow.
  • Batch multiple metric data points into each API call to reduce the total number of requests; each call can include up to 1,000 metric data items, so maximizing batch size reduces per-call overhead (see the batching sketch after this list).
  • Review and reduce dimension cardinality on high-frequency metrics to prevent metric count explosion — avoid using high-cardinality values like IP addresses or unique request identifiers as dimensions.
  • Where high-resolution alarms are in use, evaluate whether switching to standard-resolution alarm evaluation (60-second periods) would meet operational requirements at a lower per-alarm cost.
  • Establish governance policies and automated guardrails to prevent teams from inadvertently publishing metrics at high resolution without a documented need.
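
A minimal sketch of the batching approach from the third item above, assuming the application buffers datapoints between flushes; the buffer contents and namespace are hypothetical.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# PutMetricData accepts up to 1,000 metric data items per request
# (subject to the request payload size limit), so accumulating points
# and flushing in chunks cuts request volume dramatically versus one
# call per datapoint.
MAX_BATCH = 1000

def flush(namespace: str, buffered_points: list) -> None:
    """Publish buffered datapoints in as few API calls as possible."""
    for i in range(0, len(buffered_points), MAX_BATCH):
        cloudwatch.put_metric_data(
            Namespace=namespace,
            MetricData=buffered_points[i : i + MAX_BATCH],
        )

# Hypothetical buffer built up between flushes; standard resolution
# is the default when StorageResolution is omitted.
buffer = [
    {"MetricName": "OrderLatencyMs", "Value": float(v), "Unit": "Milliseconds"}
    for v in (120, 95, 183, 77)
]
flush("MyApp/Checkout", buffer)
```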