Semantic Conventions for System Metrics

Status: Experimental

This document describes instruments and attributes for common system level metrics in OpenTelemetry. Consider the general metric semantic conventions when creating instruments not explicitly defined in the specification.

The system.* namespace SHOULD be used exclusively to report host metrics. The system.* namespace SHOULD only be used when the metrics are collected from within the target system (physical servers, virtual machines, etc.). Metrics collected from technology-specific, well-defined APIs (e.g. Kubelet’s API or container runtimes) should be reported under their respective namespace (e.g. k8s.*, container.*). Resource attributes related to a host SHOULD be reported under the host.* namespace.

Warning: Existing instrumentations and collectors that are using v1.21.0 of this document (or prior):

  • SHOULD NOT adopt any breaking changes from this document until the system semantic conventions are marked stable. Conventions include, but are not limited to, attributes, metric names, and units of measure.
  • SHOULD introduce a control mechanism to allow users to opt-in to the new conventions once the migration plan is finalized.

Processor Metrics

Description: System level processor metrics captured under the namespace system.cpu.

Metric: system.cpu.time

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.cpu.time | Counter | s | Seconds each logical CPU spent on each mode | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| system.cpu.logical_number | int | The logical CPU number [0..n-1] | 1 | Recommended | Experimental |
| system.cpu.state | string | The CPU state for this data point. A system’s CPU SHOULD be characterized either by data points with no state labels, or only data points with state labels. | idle; interrupt | Recommended | Experimental |

system.cpu.state has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| user | user | Experimental |
| system | system | Experimental |
| nice | nice | Experimental |
| idle | idle | Experimental |
| iowait | iowait | Experimental |
| interrupt | interrupt | Experimental |
| steal | steal | Experimental |
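
As an illustration of how system.cpu.time data points might be derived on Linux, the sketch below turns per-CPU lines of /proc/stat into (attributes, seconds) pairs. The sample text, the function name cpu_time_points, the column-to-state mapping, and the tick rate of 100 per second are all assumptions made for this example; the softirq column is skipped because it has no well-known state value in the list above.

```python
# Hypothetical sample of /proc/stat content; a real collector would read the file.
SAMPLE_PROC_STAT = """\
cpu  400 20 200 3000 40 10 6 4 0 0
cpu0 200 10 100 1500 20 5 3 2 0 0
cpu1 200 10 100 1500 20 5 3 2 0 0
"""

# Assumed mapping of the first eight /proc/stat columns to system.cpu.state values.
# None marks the softirq column, which this sketch does not report.
STATE_COLUMNS = ["user", "nice", "system", "idle", "iowait", "interrupt", None, "steal"]

# Assumed clock ticks per second; os.sysconf("SC_CLK_TCK") reports the actual value.
USER_HZ = 100


def cpu_time_points(stat_text, user_hz=USER_HZ):
    """Return a list of (attributes, seconds) pairs, one per logical CPU and state."""
    points = []
    for line in stat_text.splitlines():
        fields = line.split()
        name = fields[0] if fields else ""
        # Keep only per-CPU lines such as "cpu0"; skip the aggregate "cpu" line.
        if not (name.startswith("cpu") and name[3:].isdigit()):
            continue
        logical_number = int(name[3:])
        for state, ticks in zip(STATE_COLUMNS, fields[1:]):
            if state is None:
                continue
            attributes = {
                "system.cpu.logical_number": logical_number,
                "system.cpu.state": state,
            }
            points.append((attributes, int(ticks) / user_hz))
    return points


if __name__ == "__main__":
    for attributes, seconds in cpu_time_points(SAMPLE_PROC_STAT):
        print(attributes, seconds)
```

Each pair corresponds to one cumulative Counter value; how the values are handed to a metrics SDK is left out of the sketch.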

Metric: system.cpu.utilization

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.cpu.utilization | Gauge | 1 | Difference in system.cpu.time since the last measurement, divided by the elapsed time and number of logical CPUs | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| system.cpu.logical_number | int | The logical CPU number [0..n-1] | 1 | Recommended | Experimental |
| system.cpu.state | string | The CPU state for this data point. A system’s CPU SHOULD be characterized either by data points with no state labels, or only data points with state labels. | idle; interrupt | Recommended | Experimental |

system.cpu.state has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| user | user | Experimental |
| system | system | Experimental |
| nice | nice | Experimental |
| idle | idle | Experimental |
| iowait | iowait | Experimental |
| interrupt | interrupt | Experimental |
| steal | steal | Experimental |
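
The description of system.cpu.utilization can be read as a small formula: the change in system.cpu.time divided by the product of elapsed time and logical CPU count. The sketch below is one literal reading of that sentence; the function name cpu_utilization and the sample numbers are hypothetical.

```python
def cpu_utilization(previous_seconds, current_seconds, elapsed_seconds, logical_cpus):
    """Fraction of available CPU time consumed between two system.cpu.time readings."""
    if elapsed_seconds <= 0 or logical_cpus <= 0:
        raise ValueError("elapsed_seconds and logical_cpus must be positive")
    delta = current_seconds - previous_seconds
    return delta / (elapsed_seconds * logical_cpus)


if __name__ == "__main__":
    # Hypothetical readings: 6 s of CPU time accumulated over 2 s of wall clock on 4 CPUs.
    print(cpu_utilization(100.0, 106.0, elapsed_seconds=2.0, logical_cpus=4))
```

For a series that already carries system.cpu.logical_number, passing logical_cpus=1 is one possible interpretation; the text above does not spell this case out, so treat it as an assumption.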

Metric: system.cpu.physical.count

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.cpu.physical.count | UpDownCounter | {cpu} | Reports the number of actual physical processor cores on the hardware | Experimental |

Metric: system.cpu.logical.count

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.cpu.logical.count | UpDownCounter | {cpu} | Reports the number of logical (virtual) processor cores created by the operating system to manage multitasking | Experimental |

Metric: system.cpu.frequency

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.cpu.frequency | Gauge | {Hz} | Reports the current frequency of the CPU in Hz | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| system.cpu.logical_number | int | The logical CPU number [0..n-1] | 1 | Recommended | Experimental |

Memory Metrics

Description: System level memory metrics captured under the namespace system.memory. This does not include paging/swap memory.

Metric: system.memory.usage

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.memory.usage | UpDownCounter | By | Reports memory in use by state. [1] | Experimental |

[1]: The sum over all system.memory.state values SHOULD equal the total memory available on the system, that is system.memory.limit.

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| system.memory.state | string | The memory state | free; cached | Recommended | Experimental |

system.memory.state has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| used | used | Experimental |
| free | free | Experimental |
| shared | shared | Experimental |
| buffers | buffers | Experimental |
| cached | cached | Experimental |

Metric: system.memory.limit

This metric is opt-in.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.memory.limit | UpDownCounter | By | Total memory available in the system. [1] | Experimental |

[1]: Its value SHOULD equal the sum of system.memory.usage over all system.memory.state values.
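
The relationship between per-state usage and the limit can be checked with simple arithmetic. The sketch below uses made-up byte counts and a hypothetical helper, memory_limit, to sum the per-state values. It also derives a per-state fraction by dividing usage by the limit; that division is an assumption for illustration only, not a definition taken from this document.

```python
GIB = 2**30

# Hypothetical system.memory.usage values in bytes, keyed by system.memory.state.
usage_by_state = {
    "used": 5 * GIB,
    "free": 1 * GIB,
    "buffers": GIB // 2,
    "cached": 3 * GIB // 2,
}


def memory_limit(usage):
    """Total memory as the sum of usage over all states."""
    return sum(usage.values())


def fraction_by_state(usage):
    """Assumed per-state fraction: usage of the state divided by the limit."""
    limit = memory_limit(usage)
    return {state: value / limit for state, value in usage.items()}


if __name__ == "__main__":
    print(memory_limit(usage_by_state))
    print(fraction_by_state(usage_by_state))
```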

Metric: system.memory.utilization

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.memory.utilization | Gauge | 1 | | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| system.memory.state | string | The memory state | free; cached | Recommended | Experimental |

system.memory.state has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| used | used | Experimental |
| free | free | Experimental |
| shared | shared | Experimental |
| buffers | buffers | Experimental |
| cached | cached | Experimental |

Paging/Swap Metrics

Description: System level paging/swap memory metrics captured under the namespace system.paging.

Metric: system.paging.usage

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.paging.usage | UpDownCounter | By | Unix swap or Windows pagefile usage | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| system.paging.state | string | The memory paging state | free | Recommended | Experimental |

system.paging.state MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| used | used | Experimental |
| free | free | Experimental |

Metric: system.paging.utilization

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.paging.utilization | Gauge | 1 | | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| system.paging.state | string | The memory paging state | free | Recommended | Experimental |

system.paging.state MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| used | used | Experimental |
| free | free | Experimental |

Metric: system.paging.faults

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.paging.faults | Counter | {fault} | | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| system.paging.type | string | The memory paging type | minor | Recommended | Experimental |

system.paging.type MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| major | major | Experimental |
| minor | minor | Experimental |

Metric: system.paging.operations

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.paging.operations | Counter | {operation} | | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| system.paging.direction | string | The paging access direction | in | Recommended | Experimental |
| system.paging.type | string | The memory paging type | minor | Recommended | Experimental |

system.paging.direction MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| in | in | Experimental |
| out | out | Experimental |

system.paging.type MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| major | major | Experimental |
| minor | minor | Experimental |

Disk Controller Metrics

Description: System level disk performance metrics captured under the namespace system.disk.

Metric: system.disk.io

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.disk.io | Counter | By | | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| disk.io.direction | string | The disk IO operation direction. | read | Recommended | Experimental |
| system.device | string | The device identifier | (identifier) | Recommended | Experimental |

disk.io.direction MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| read | read | Experimental |
| write | write | Experimental |

Metric: system.disk.operations

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.disk.operations | Counter | {operation} | | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| disk.io.direction | string | The disk IO operation direction. | read | Recommended | Experimental |
| system.device | string | The device identifier | (identifier) | Recommended | Experimental |

disk.io.direction MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| read | read | Experimental |
| write | write | Experimental |

Metric: system.disk.io_time

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.disk.io_time | Counter | s | Time disk spent activated [1] | Experimental |

[1]: The real elapsed time (“wall clock”) used in the I/O path (time from operations running in parallel are not counted). Measured as:

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| system.device | string | The device identifier | (identifier) | Recommended | Experimental |

Metric: system.disk.operation_time

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.disk.operation_time | Counter | s | Sum of the time each operation took to complete [1] | Experimental |

[1]: Because it is the sum of time each request took, parallel-issued requests each contribute to make the count grow. Measured as:

  • Linux: Fields 7 & 11 from procfs-diskstats
  • Windows: “Avg. Disk sec/Read” perf counter multiplied by “Disk Reads/sec” perf counter (similar for Writes)

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| disk.io.direction | string | The disk IO operation direction. | read | Recommended | Experimental |
| system.device | string | The device identifier | (identifier) | Recommended | Experimental |

disk.io.direction MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| read | read | Experimental |
| write | write | Experimental |
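
The Linux measurement for system.disk.operation_time mentioned above can be sketched by reading two columns of a /proc/diskstats line. This example assumes that "fields 7 & 11" are counted from 1 across the whole line, including the major number, minor number, and device name, and that both columns hold milliseconds. The sample line and the function name operation_time_points are hypothetical.

```python
# Hypothetical /proc/diskstats line; a real collector would read the file itself.
SAMPLE_DISKSTATS_LINE = "   8       0 sda 1200 30 96000 450 800 25 64000 900 0 700 1350"


def operation_time_points(line):
    """Return (attributes, seconds) pairs for read and write operation time."""
    columns = line.split()
    device = columns[2]
    read_ms = int(columns[6])    # field 7 (1-based): time spent reading, assumed ms
    write_ms = int(columns[10])  # field 11 (1-based): time spent writing, assumed ms
    return [
        ({"system.device": device, "disk.io.direction": "read"}, read_ms / 1000.0),
        ({"system.device": device, "disk.io.direction": "write"}, write_ms / 1000.0),
    ]


if __name__ == "__main__":
    for attributes, seconds in operation_time_points(SAMPLE_DISKSTATS_LINE):
        print(attributes, seconds)
```

Dividing by 1000 converts the assumed millisecond columns into the seconds unit listed for the metric.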

Metric: system.disk.merged

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.disk.merged | Counter | {operation} | | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| disk.io.direction | string | The disk IO operation direction. | read | Recommended | Experimental |
| system.device | string | The device identifier | (identifier) | Recommended | Experimental |

disk.io.direction MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| read | read | Experimental |
| write | write | Experimental |

Filesystem Metrics

Description: System level filesystem metrics captured under the namespace system.filesystem.

Metric: system.filesystem.usage

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.filesystem.usage | UpDownCounter | By | | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| system.device | string | The device identifier | (identifier) | Recommended | Experimental |
| system.filesystem.mode | string | The filesystem mode | rw, ro | Recommended | Experimental |
| system.filesystem.mountpoint | string | The filesystem mount path | /mnt/data | Recommended | Experimental |
| system.filesystem.state | string | The filesystem state | used | Recommended | Experimental |
| system.filesystem.type | string | The filesystem type | ext4 | Recommended | Experimental |

system.filesystem.state MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| used | used | Experimental |
| free | free | Experimental |
| reserved | reserved | Experimental |

system.filesystem.type has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| fat32 | fat32 | Experimental |
| exfat | exfat | Experimental |
| ntfs | ntfs | Experimental |
| refs | refs | Experimental |
| hfsplus | hfsplus | Experimental |
| ext4 | ext4 | Experimental |
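
To make the three system.filesystem.state values concrete, the sketch below builds one data point per state from Python's shutil.disk_usage. Treating total minus used minus free as the reserved space is an assumption of this example, based on shutil.disk_usage reporting as free only the space available to unprivileged users on Unix. The function name filesystem_usage_points is hypothetical, and the other attributes (device, mode, type) are omitted for brevity.

```python
import shutil


def filesystem_usage_points(path):
    """Return (attributes, bytes) pairs for the used, free, and reserved states."""
    usage = shutil.disk_usage(path)
    # Assumption: space that is neither used nor available to the caller is reserved.
    reserved = usage.total - usage.used - usage.free
    values = {"used": usage.used, "free": usage.free, "reserved": reserved}
    return [
        (
            {
                "system.filesystem.mountpoint": path,
                "system.filesystem.state": state,
            },
            value,
        )
        for state, value in values.items()
    ]


if __name__ == "__main__":
    # "/" is only an example path; any mounted filesystem path could be used.
    for attributes, value in filesystem_usage_points("/"):
        print(attributes, value)
```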

Metric: system.filesystem.utilization

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.filesystem.utilization | Gauge | 1 | | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| system.device | string | The device identifier | (identifier) | Recommended | Experimental |
| system.filesystem.mode | string | The filesystem mode | rw, ro | Recommended | Experimental |
| system.filesystem.mountpoint | string | The filesystem mount path | /mnt/data | Recommended | Experimental |
| system.filesystem.state | string | The filesystem state | used | Recommended | Experimental |
| system.filesystem.type | string | The filesystem type | ext4 | Recommended | Experimental |

system.filesystem.state MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| used | used | Experimental |
| free | free | Experimental |
| reserved | reserved | Experimental |

system.filesystem.type has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| fat32 | fat32 | Experimental |
| exfat | exfat | Experimental |
| ntfs | ntfs | Experimental |
| refs | refs | Experimental |
| hfsplus | hfsplus | Experimental |
| ext4 | ext4 | Experimental |

Network Metrics

Description: System level network metrics captured under the namespace system.network.

Metric: system.network.dropped

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.network.dropped | Counter | {packet} | Count of packets that are dropped or discarded even though there was no error [1] | Experimental |

[1]: Measured as:

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| network.io.direction | string | The network IO operation direction. | transmit | Recommended | Experimental |
| system.device | string | The device identifier | (identifier) | Recommended | Experimental |

network.io.direction MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| transmit | transmit | Experimental |
| receive | receive | Experimental |

Metric: system.network.packets

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.network.packets | Counter | {packet} | | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| network.io.direction | string | The network IO operation direction. | transmit | Recommended | Experimental |
| system.device | string | The device identifier | (identifier) | Recommended | Experimental |

network.io.direction MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| transmit | transmit | Experimental |
| receive | receive | Experimental |

Metric: system.network.errors

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.network.errors | Counter | {error} | Count of network errors detected [1] | Experimental |

[1]: Measured as:

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| network.io.direction | string | The network IO operation direction. | transmit | Recommended | Experimental |
| system.device | string | The device identifier | (identifier) | Recommended | Experimental |

network.io.direction MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| transmit | transmit | Experimental |
| receive | receive | Experimental |

Metric: system.network.io

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.network.io | Counter | By | | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| network.io.direction | string | The network IO operation direction. | transmit | Recommended | Experimental |
| system.device | string | The device identifier | (identifier) | Recommended | Experimental |

network.io.direction MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| transmit | transmit | Experimental |
| receive | receive | Experimental |

Metric: system.network.connections

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.network.connections | UpDownCounter | {connection} | | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| network.transport | string | OSI transport layer or inter-process communication method. [1] | tcp; udp | Recommended | Stable |
| system.device | string | The device identifier | (identifier) | Recommended | Experimental |
| system.network.state | string | A stateless protocol MUST NOT set this attribute | close_wait | Recommended | Experimental |

[1]: The value SHOULD be normalized to lowercase.

Consider always setting the transport when setting a port number, since a port number is ambiguous without knowing the transport. For example different processes could be listening on TCP port 12345 and UDP port 12345.

network.transport has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| tcp | TCP | Stable |
| udp | UDP | Stable |
| pipe | Named or anonymous pipe. | Stable |
| unix | Unix domain socket | Stable |

system.network.state MUST be one of the following:

| Value | Description | Stability |
|---|---|---|
| close | close | Experimental |
| close_wait | close_wait | Experimental |
| closing | closing | Experimental |
| delete | delete | Experimental |
| established | established | Experimental |
| fin_wait_1 | fin_wait_1 | Experimental |
| fin_wait_2 | fin_wait_2 | Experimental |
| last_ack | last_ack | Experimental |
| listen | listen | Experimental |
| syn_recv | syn_recv | Experimental |
| syn_sent | syn_sent | Experimental |
| time_wait | time_wait | Experimental |

Aggregate System Process Metrics

Description: System level aggregate process metrics captured under the namespace system.process. For metrics at the individual process level, see process metrics.

Metric: system.process.count

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.process.count | UpDownCounter | {process} | Total number of processes in each state | Experimental |

| Attribute | Type | Description | Examples | Requirement Level | Stability |
|---|---|---|---|---|---|
| system.process.status | string | The process state, e.g., Linux Process State Codes | running | Recommended | Experimental |

system.process.status has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

| Value | Description | Stability |
|---|---|---|
| running | running | Experimental |
| sleeping | sleeping | Experimental |
| stopped | stopped | Experimental |
| defunct | defunct | Experimental |

Metric: system.process.created

This metric is recommended.

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.process.created | Counter | {process} | Total number of processes created over uptime of the host | Experimental |

system.{os}. - OS Specific System Metrics

Instrument names for system level metrics that have different and conflicting meanings across multiple OSes should be prefixed with system.{os}. and follow the hierarchies listed above for different entities like CPU, memory, and network.

For example, the UNIX load average over a given interval is not well standardized, and its value across different UNIX-like OSes may vary despite being under similar load:

Without getting into the vagaries of every Unix-like operating system in existence, the load average more or less represents the average number of processes that are in the running (using the CPU) or runnable (waiting for the CPU) states. One notable exception exists: Linux includes processes in uninterruptible sleep states, typically waiting for some I/O activity to complete. This can markedly increase the load average on Linux systems.

(source of quote, linux source code)

An instrument for load average over 1 minute on Linux could be named system.linux.cpu.load_1m, reusing the cpu name proposed above and having an {os} prefix to split this metric across OSes.

Metric: system.linux.memory.available

| Name | Instrument Type | Unit (UCUM) | Description | Stability |
|---|---|---|---|---|
| system.linux.memory.available | UpDownCounter | By | An estimate of how much memory is available for starting new applications, without causing swapping [1] | Experimental |

[1]: This is an alternative to system.memory.usage metric with state=free. Linux starting from 3.14 exports “available” memory. It takes “free” memory as a baseline, and then factors in kernel-specific values. This is supposed to be more accurate than just “free” memory. For reference, see the calculations here. See also MemAvailable in /proc/meminfo.
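
The note above points to MemAvailable in /proc/meminfo as the source for this metric. A minimal sketch of reading that value and converting it to bytes follows; the sample text and the function name memory_available_bytes are hypothetical, and the sketch assumes the "kB" unit in /proc/meminfo denotes multiples of 1024 bytes.

```python
# Hypothetical excerpt of /proc/meminfo; a real collector would read the file.
SAMPLE_MEMINFO = """\
MemTotal:       16384000 kB
MemFree:         1024000 kB
MemAvailable:    8192000 kB
"""


def memory_available_bytes(meminfo_text):
    """Return MemAvailable in bytes, or None if the field is absent."""
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key.strip() == "MemAvailable":
            # The value is followed by its unit, e.g. "8192000 kB".
            return int(rest.split()[0]) * 1024
    return None


if __name__ == "__main__":
    print(memory_available_bytes(SAMPLE_MEMINFO))
```

Returning None when the field is missing lets a caller skip the data point on kernels that do not export MemAvailable.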