Recommended vs Opt-In CPU Metrics
The Instrument Naming section defines the *.usage, *.limit, *.utilization, and
*.time metrics, but it does not specify their requirement levels (required,
recommended, opt-in). Because these metrics convey overlapping
information in different forms, implementations may become inconsistent without
explicit guidance.
This document provides guidance regarding the requirement level of the CPU metrics across the different areas of the Semantic Conventions.
Policy
- recommended: *.cpu.time
- opt-in (optional): *.cpu.utilization, *.cpu.usage, *.cpu.limit_utilization, *.cpu.request_utilization
Rationale
*.cpu.time metrics are unambiguous, as they are measured directly from the
operating system or runtime. They aggregate cleanly across CPUs and resources,
support spatial aggregation, and form a consistent base for deriving usage and
utilization, either in backends or, where convenient, at collection time.
By contrast, *.cpu.usage and *.cpu.utilization are derived or
presentation-focused metrics. Their definitions may vary across implementations,
especially in containerized and Kubernetes environments where CPU limits are
defined per container or Pod. This leads to ambiguity and inconsistencies in how
these metrics should be calculated and reported. While they can be convenient
for dashboards and alerts, they should remain optional and be implemented only
when specific environments explicitly provide them. For example, the Kubelet's
stats endpoint provides an opinionated *.cpu.usage metric that can be used
directly, yet it should remain optional because it is derived from the
*.cpu.time metrics and is not implemented uniformly in other systems such as
the Docker stats API.
Implementation Guidance
- SHOULD emit *.cpu.time by default for system, process, container, and k8s resources.
- SHOULD gate *.cpu.*utilization and *.cpu.usage metrics behind explicit configuration.
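The two SHOULD items above can be sketched as follows. This is only an illustration: CpuMetricsConfig and enabled_cpu_metrics are hypothetical names, not part of any OpenTelemetry SDK or of the Semantic Conventions. The sketch always emits *.cpu.time and emits a derived metric only when configuration lists it explicitly.

```python
from dataclasses import dataclass, field

# Metric suffixes taken from the Policy section of this document.
RECOMMENDED = ("cpu.time",)
OPT_IN = (
    "cpu.usage",
    "cpu.utilization",
    "cpu.limit_utilization",
    "cpu.request_utilization",
)


@dataclass
class CpuMetricsConfig:
    """Hypothetical configuration: the opt-in suffixes a user has enabled."""
    opt_in: set[str] = field(default_factory=set)


def enabled_cpu_metrics(resource: str, config: CpuMetricsConfig) -> list[str]:
    """Return the metric names to emit for a resource prefix such as 'system'."""
    # Recommended metrics are on by default.
    names = [f"{resource}.{suffix}" for suffix in RECOMMENDED]
    # Opt-in metrics are gated behind explicit configuration.
    names += [f"{resource}.{suffix}" for suffix in OPT_IN if suffix in config.opt_in]
    return names
```

With an empty configuration only the time metric is returned; enabling "cpu.usage" for the "k8s.pod" prefix adds k8s.pod.cpu.usage to the result.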
Backend Guidance
- SHOULD provide transforms or views to derive utilization/usage from *.cpu.time when helpful.
- SHOULD treat *.cpu.time as the canonical source of truth across system, container, and k8s resources.
Using CPU Time
The cumulative CPU time values can be used to derive the usage and utilization metrics.
The usage metric can be computed by taking the increase in CPU time over a given window and
dividing it by the window size in seconds. CPU usage is usually expressed in cores, that is, core-seconds of CPU time per second of elapsed time.
Utilization can be computed by dividing that result by the given CPU limit and is usually in the range [0, 1].
Examples of how CPU time can be used to derive the usage or utilization metrics can be found below:
CPU Time to Usage
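rate(system.cpu.time[5m])/(5*60) measured in cores (core-seconds per second).
This is written in PromQL-style notation. Here rate() stands for the difference between the current and the previous cumulative value over the window, while the denominator is the elapsed time in seconds. PromQL's built-in rate() already returns a per-second value, so in actual PromQL the same result is obtained with rate(system.cpu.time[5m]) alone or with increase(system.cpu.time[5m])/(5*60).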
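Outside of a query language, the same calculation can be sketched from two cumulative samples. The function name cpu_usage and its parameters are illustrative assumptions, not names defined by the Semantic Conventions.

```python
def cpu_usage(prev_time_s: float, curr_time_s: float, elapsed_s: float) -> float:
    """Derive CPU usage from two cumulative *.cpu.time samples.

    prev_time_s and curr_time_s are cumulative CPU seconds; elapsed_s is the
    wall-clock time between the two samples, in seconds.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    delta = curr_time_s - prev_time_s
    if delta < 0:
        # A cumulative counter that went backwards was most likely reset;
        # this sketch refuses to guess rather than report a negative usage.
        raise ValueError("cumulative CPU time decreased (counter reset?)")
    return delta / elapsed_s
```

For example, 150 CPU seconds consumed during a 300-second window corresponds to a usage of 0.5 cores.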
CPU Time to Utilization
rate(system.cpu.time[5m])/(5*60) measured in [0, 1] per core (the limit is
1 core)
rate(k8s.pod.cpu.time[5m])/(5*60)/k8s.pod.cpu.limit
The above will give the k8s.pod.cpu.limit_utilization derived metric.
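As a hedged sketch of the same derivation in code, the function below divides the usage over a window by a CPU limit expressed in cores. The name limit_utilization and its parameters are hypothetical and chosen only for this example.

```python
def limit_utilization(
    prev_time_s: float,
    curr_time_s: float,
    elapsed_s: float,
    cpu_limit_cores: float,
) -> float:
    """Derive limit utilization from cumulative CPU time and a CPU limit."""
    if elapsed_s <= 0 or cpu_limit_cores <= 0:
        raise ValueError("elapsed_s and cpu_limit_cores must be positive")
    # Usage in cores: CPU seconds consumed per second of wall-clock time.
    usage_cores = (curr_time_s - prev_time_s) / elapsed_s
    # Fraction of the configured limit that was consumed.
    return usage_cores / cpu_limit_cores
```

A Pod that consumed 150 CPU seconds in 300 seconds against a limit of 2 cores yields 0.25.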
Utilization excluding idle states
To represent the utilization as an expression of the percentage of time the
system spent in non-idle states, the following can be used:
sum(rate(system.cpu.time{cpu.mode!="idle"}[5m])) without (cpu.mode)/(5*60)
measured in [0, 1] per core.
Total utilization of the whole system
To get the utilization of the whole system, an average across all cores can be used:
avg(sum(rate(system.cpu.time{cpu.mode!="idle"}[5m])) by (cpu.logical_number))/(5*60)
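The per-core and whole-system expressions above can be sketched in code as well. This is a minimal illustration, assuming samples are held in a dictionary keyed by (cpu.logical_number, cpu.mode); the Samples alias and both function names are hypothetical.

```python
from collections import defaultdict

# (cpu.logical_number, cpu.mode) -> cumulative CPU seconds
Samples = dict[tuple[int, str], float]


def per_core_utilization(prev: Samples, curr: Samples, elapsed_s: float) -> dict[int, float]:
    """Non-idle utilization in [0, 1] for each logical CPU."""
    busy: dict[int, float] = defaultdict(float)
    for (core, mode), value in curr.items():
        delta = value - prev[(core, mode)]
        # Idle time is excluded, but the core is still registered so that a
        # fully idle core counts as 0.0 rather than being dropped.
        busy[core] += 0.0 if mode == "idle" else delta
    return {core: seconds / elapsed_s for core, seconds in busy.items()}


def system_utilization(prev: Samples, curr: Samples, elapsed_s: float) -> float:
    """Average of the per-core non-idle utilization across all cores."""
    per_core = per_core_utilization(prev, curr, elapsed_s)
    return sum(per_core.values()) / len(per_core)
```

Averaging across cores keeps the result in [0, 1] regardless of how many logical CPUs the system has.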
Note that the above formulas can be ambiguous and hence they are not standardized as part of the Semantic Conventions project. They are only provided as examples.
Projects like Prometheus Node Exporter come with their own formula for calculating system utilization.
The standardization of k8s.*.cpu.usage is an exception since it is collected
directly from the Kubelet’s Stats API and is K8s specific.
References
- System CPU Utilization gist by Braydon Kains (@braydonk)
- Attempt to introduce an optional normalized total CPU utilization metric