Unnecessary High-Resolution Custom Metrics Inflating API Call Costs
Taylor Houck
CER:
CER-0302

Service Category
Other
Cloud Provider
AWS
Service Name
AWS CloudWatch
Inefficiency Type
Inefficient Configuration
Explanation

Custom metrics published to CloudWatch can be configured at two resolutions: standard (60-second intervals) or high resolution (1-second intervals). While both resolutions are priced identically for metric storage, the critical cost difference lies in the volume of API calls required to publish the data. A metric published every second generates 60 times more API calls than one published every 60 seconds. At scale — across hundreds or thousands of custom metrics in a microservices architecture — this multiplier translates into substantial and avoidable API charges that accumulate month over month.
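
To make the configuration difference concrete, here is a minimal boto3 sketch of both publishing modes; the namespace and metric name are hypothetical, and StorageResolution is the only parameter that distinguishes the two (it defaults to 60 when omitted).

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Standard resolution (the default): CloudWatch stores one datapoint
# per 60-second interval. Omitting StorageResolution is equivalent
# to StorageResolution=60.
cloudwatch.put_metric_data(
    Namespace="MyApp/Checkout",          # hypothetical namespace
    MetricData=[{
        "MetricName": "OrderLatencyMs",  # hypothetical metric
        "Value": 182.0,
        "Unit": "Milliseconds",
        "StorageResolution": 60,
    }],
)

# High resolution: StorageResolution=1 allows 1-second datapoints.
# Publishing every second means ~60x the PutMetricData calls of a
# once-per-minute publisher unless values are batched.
cloudwatch.put_metric_data(
    Namespace="MyApp/Checkout",
    MetricData=[{
        "MetricName": "OrderLatencyMs",
        "Value": 95.0,
        "Unit": "Milliseconds",
        "StorageResolution": 1,
    }],
)
```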

This inefficiency commonly arises when teams default to high-resolution publishing without evaluating whether sub-minute granularity is actually needed for their monitoring use cases. Many workloads — including capacity planning, cost analysis, and non-critical service monitoring — function perfectly well with standard or even lower resolution. Compounding the issue, high-resolution metric data is only retained at its full 1-second granularity for three hours before being automatically aggregated to coarser intervals. Teams may therefore be paying a premium in API costs for resolution they cannot even query historically. Additionally, if alarms are configured to evaluate high-resolution metrics at sub-minute intervals, those alarms carry a higher per-alarm charge compared to standard-resolution alarms.

Relevant Billing Model

CloudWatch custom metric costs are driven by three independent billing dimensions:

  • Metric storage — Billed per unique metric per month (each unique combination of metric name, namespace, and dimensions counts as a separate metric). Storage pricing is tiered: the first 10,000 metrics at $0.30/metric/month, with volume discounts at higher tiers. High-resolution and standard-resolution metrics are priced identically for storage.
  • API calls — PutMetricData requests are charged at $0.01 per 1,000 requests after the first 1 million requests per month (free tier). Publishing at 1-second intervals generates 60 times more API calls than publishing at 60-second intervals, making publishing frequency the primary cost lever.
  • Alarm charges — Alarms configured to evaluate at high-resolution periods (every 10 or 30 seconds) are charged at $0.30 per alarm metric per month, compared to $0.10 per alarm metric per month for standard-resolution alarms. However, alarms on high-resolution metrics can still use standard 60-second evaluation periods at the lower rate.
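
As the last point notes, the alarm charge follows the evaluation period rather than the metric's storage resolution. A hedged boto3 sketch of the same high-resolution metric alarmed both ways, with hypothetical alarm names and thresholds:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Standard-resolution evaluation (Period=60) on a high-resolution
# metric: billed at the standard alarm rate ($0.10/alarm metric/month).
cloudwatch.put_metric_alarm(
    AlarmName="order-latency-p60",             # hypothetical alarm
    Namespace="MyApp/Checkout",                # hypothetical namespace
    MetricName="OrderLatencyMs",
    Statistic="Average",
    Period=60,
    EvaluationPeriods=3,
    Threshold=500.0,
    ComparisonOperator="GreaterThanThreshold",
)

# High-resolution evaluation (Period=10 or 30): billed at the higher
# rate ($0.30/alarm metric/month). Reserve for genuinely
# latency-sensitive alerting.
cloudwatch.put_metric_alarm(
    AlarmName="order-latency-p10",
    Namespace="MyApp/Checkout",
    MetricName="OrderLatencyMs",
    Statistic="Average",
    Period=10,
    EvaluationPeriods=6,
    Threshold=500.0,
    ComparisonOperator="GreaterThanThreshold",
)
```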

The free tier includes 10 custom metrics and 1 million API requests per month, but this is quickly exceeded in production environments with multiple services publishing metrics.
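
A back-of-the-envelope calculation, using the rates quoted above and a hypothetical fleet of 500 metrics each published as a single datapoint per call, illustrates how dominant publishing frequency is:

```python
# Monthly PutMetricData cost for a fleet of custom metrics, using the
# rates quoted above. All fleet numbers are hypothetical.
METRICS = 500                      # custom metrics in the fleet
SECONDS_PER_MONTH = 60 * 60 * 24 * 30
FREE_REQUESTS = 1_000_000          # free tier, per month
PRICE_PER_1K = 0.01                # USD per 1,000 PutMetricData requests

def monthly_api_cost(publish_interval_s: int) -> float:
    """Cost if each datapoint is sent as its own PutMetricData call."""
    requests = METRICS * SECONDS_PER_MONTH // publish_interval_s
    billable = max(0, requests - FREE_REQUESTS)
    return billable / 1000 * PRICE_PER_1K

print(f"high resolution (1s): ${monthly_api_cost(1):,.2f}")   # $12,950.00
print(f"standard (60s):       ${monthly_api_cost(60):,.2f}")  # $206.00
```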

Detection
  • Identify custom metrics being published at high resolution (1-second intervals) and assess whether sub-minute granularity is required for each metric's monitoring purpose (a detection sketch follows this list).
  • Review the volume of metric publishing API calls across accounts and regions to determine which namespaces or applications are generating the highest call volumes.
  • Evaluate whether any alarms or dashboards actually consume metric data at sub-minute granularity — if none do, the high-resolution publishing may be unnecessary.
  • Assess the dimension cardinality of high-resolution metrics, as each unique dimension combination creates a separate billable metric and multiplies the API call volume.
  • Confirm whether high-resolution metric data is being queried within its 3-hour full-resolution retention window, or whether teams are only viewing data after it has been aggregated to coarser intervals.
  • Examine alarm configurations to identify high-resolution alarms (evaluating at 10- or 30-second periods) that could function effectively with standard-resolution evaluation.
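
One way to implement the first and last checks above is sketched below. The alarm scan relies on the documented Period field of DescribeAlarms; the resolution probe is a heuristic, since ListMetrics does not report storage resolution, so it samples recent datapoints at a 1-second period inside the 3-hour full-resolution window instead.

```python
import datetime
import boto3

cloudwatch = boto3.client("cloudwatch")

def high_resolution_alarms():
    """Yield alarms evaluating at sub-minute periods (10s or 30s),
    which bill at the higher per-alarm rate."""
    paginator = cloudwatch.get_paginator("describe_alarms")
    for page in paginator.paginate():
        for alarm in page["MetricAlarms"]:
            # Metric-math alarms may lack Period; treat them as standard.
            if alarm.get("Period", 60) < 60:
                yield alarm["AlarmName"], alarm.get("MetricName"), alarm["Period"]

def looks_high_resolution(namespace, metric_name, dimensions):
    """Heuristic: within the 3-hour full-resolution retention window,
    a high-resolution metric can return more than one datapoint per
    minute at a 1-second query period; a standard metric cannot."""
    now = datetime.datetime.now(datetime.timezone.utc)
    stats = cloudwatch.get_metric_statistics(
        Namespace=namespace,
        MetricName=metric_name,
        Dimensions=dimensions,
        StartTime=now - datetime.timedelta(minutes=10),
        EndTime=now,
        Period=1,
        Statistics=["SampleCount"],
    )
    # More datapoints than minutes in the window implies sub-minute data.
    return len(stats["Datapoints"]) > 10

for name, metric, period in high_resolution_alarms():
    print(f"{name}: evaluates {metric} every {period}s")

# e.g. looks_high_resolution("MyApp/Checkout", "OrderLatencyMs", [])
```
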
Remediation
  • Default all new custom metrics to standard resolution (60-second intervals) and require explicit justification for high-resolution publishing, reserving it for metrics that genuinely need sub-minute alerting such as latency-sensitive SLOs.
  • Audit existing high-resolution metrics and downgrade those that are not consumed at sub-minute granularity by any alarm, dashboard, or operational workflow.
  • Batch multiple metric data points into each API call to reduce the total number of requests; each call can include up to 1,000 metric data items, so maximizing batch size reduces per-call overhead (see the batching sketch after this list).
  • Review and reduce dimension cardinality on high-frequency metrics to prevent metric count explosion — avoid using high-cardinality values like IP addresses or unique request identifiers as dimensions.
  • Where high-resolution alarms are in use, evaluate whether switching to standard-resolution alarm evaluation (60-second periods) would meet operational requirements at a lower per-alarm cost.
  • Establish governance policies and automated guardrails to prevent teams from inadvertently publishing metrics at high resolution without a documented need.
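
A minimal sketch of the batching approach from the third item above, assuming the application buffers datapoints between flushes; the buffer contents and namespace are hypothetical.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# PutMetricData accepts up to 1,000 metric data items per request
# (subject to the request payload size limit), so accumulating points
# and flushing in chunks cuts request volume dramatically versus one
# call per datapoint.
MAX_BATCH = 1000

def flush(namespace: str, buffered_points: list) -> None:
    """Publish buffered datapoints in as few API calls as possible."""
    for i in range(0, len(buffered_points), MAX_BATCH):
        cloudwatch.put_metric_data(
            Namespace=namespace,
            MetricData=buffered_points[i : i + MAX_BATCH],
        )

# Hypothetical buffer built up between flushes; standard resolution
# is the default when StorageResolution is omitted.
buffer = [
    {"MetricName": "OrderLatencyMs", "Value": float(v), "Unit": "Milliseconds"}
    for v in (120, 95, 183, 77)
]
flush("MyApp/Checkout", buffer)
```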