Semantic Conventions for Hardware Metrics

Status: Experimental

This document describes instruments and attributes for common hardware-level metrics in OpenTelemetry. Consider the general metric semantic conventions when creating instruments that are not explicitly defined in the specification.

Warning: Existing instrumentations and collectors that use v1.21.0 of this document (or prior):

  • SHOULD NOT adopt any breaking changes from this document until the system semantic conventions are marked stable. Conventions include, but are not limited to, attributes, metric names, and units of measure.
  • SHOULD introduce a control mechanism that allows users to opt in to the new conventions once the migration plan is finalized.

Common hardware attributes

All metrics in hw. instruments should be attached to a Host Resource and therefore inherit its attributes, like host.id and host.name.

Additionally, all metrics in hw. instruments have the following attributes:

| Attribute Key | Description | Example | Requirement Level |
| --- | --- | --- | --- |
| id | An identifier for the hardware component, unique within the monitored host | win32battery_battery_testsysa33_1 | Required |
| name | An easily-recognizable name for the hardware component | eth0 | Recommended |
| parent | Unique identifier of the parent component (typically the id attribute of the enclosure, or disk controller) | dellStorage_perc_0 | Recommended |
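
The table above can be read as a small contract: id is always present, while name and parent are added when known. The following is a minimal sketch of that contract in plain Python. It does not use the OpenTelemetry API; the helper name `common_hw_attributes` and the ids `nic_0` and `disk_3` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional


def common_hw_attributes(
    component_id: str,
    name: Optional[str] = None,
    parent: Optional[str] = None,
) -> Dict[str, str]:
    """Build the common attribute set shared by all hw. metrics."""
    if not component_id:
        # "id" is Required, so refuse to build an attribute set without it.
        raise ValueError("the 'id' attribute is required")
    attributes = {"id": component_id}
    # "name" and "parent" are Recommended: include them only when known.
    if name is not None:
        attributes["name"] = name
    if parent is not None:
        attributes["parent"] = parent
    return attributes


# Hypothetical components, for illustration only.
nic = common_hw_attributes("nic_0", name="eth0")
disk = common_hw_attributes("disk_3", parent="dellStorage_perc_0")
```

Attributes inherited from the Host Resource, such as host.id and host.name, are deliberately left out of this helper because they belong to the resource rather than to individual data points.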

Metric Instruments

hw. - Common hardware metrics

The below metrics apply to any type of hardware component.

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key(s) | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.energy | Energy consumed by the component, in joules | J | Counter | Int64 | | |
| hw.errors | Number of errors encountered by the component | {error} | Counter | Int64 | hw.error.type (Recommended) | |
| hw.power | Instantaneous power consumed by the component, in Watts (hw.energy is preferred) | W | Gauge | Double | | |
| hw.status | Operational status: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed |

These common hw. metrics must include the below attributes to describe the monitored component:

| Attribute Key | Description | Example | Requirement Level |
| --- | --- | --- | --- |
| hw.type | Type of the component | battery, cpu, disk_controller, enclosure, fan, gpu, logical_disk, memory, network, physical_disk, power_supply, tape_drive, temperature, voltage | Required |

Warning

hw.status is currently specified as an UpDownCounter but would ideally be represented using a StateSet as defined in OpenMetrics. This semantic convention will be updated once StateSet is specified in OpenTelemetry. This planned change is not expected to have any consequence on the way users query their timeseries backend to retrieve the values of hw.status over time.
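
To illustrate the "1 (true) or 0 (false) for each of the possible states" encoding of hw.status, here is a minimal sketch in plain Python. It models the resulting values per state rather than calling the OpenTelemetry API, so it does not show how an UpDownCounter would be incremented or decremented. The function name `status_data_points`, the tuple layout, and the id `fan_1` are assumptions made for this example.

```python
from typing import Dict, List, Sequence, Tuple

# (metric name, value, attributes): a simplified stand-in for a data point.
DataPoint = Tuple[str, int, Dict[str, str]]


def status_data_points(
    attributes: Dict[str, str],
    current_state: str,
    possible_states: Sequence[str] = ("ok", "degraded", "failed"),
) -> List[DataPoint]:
    """Return one hw.status point per possible state: 1 for the current state, 0 otherwise."""
    if current_state not in possible_states:
        raise ValueError(f"unknown state: {current_state!r}")
    points: List[DataPoint] = []
    for state in possible_states:
        point_attributes = dict(attributes)
        point_attributes["state"] = state  # "state" is a Required attribute
        value = 1 if state == current_state else 0
        points.append(("hw.status", value, point_attributes))
    return points


# Hypothetical fan that is currently degraded.
points = status_data_points({"id": "fan_1", "hw.type": "fan"}, "degraded")
```

Exactly one of the returned points carries the value 1, which is the property a StateSet would also guarantee.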

hw.host. - Physical host metrics

Description: Physical system as opposed to a virtual system or a container. Examples: physical server, switch or disk array.

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key(s) | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.host.ambient_temperature | Ambient (external) temperature of the physical host | Cel | Gauge | Double | | |
| hw.host.energy | Total energy consumed by the entire physical host, in joules | J | Counter | Int64 | | |
| hw.host.heating_margin | By how many degrees Celsius the temperature of the physical host can be increased, before reaching a warning threshold on one of the internal sensors | Cel | Gauge | Double | | |
| hw.host.power | Instantaneous power consumed by the entire physical host in Watts (hw.host.energy is preferred) | W | Gauge | Double | | |

Note: The overall energy usage of a host MUST be reported using only the specific hw.host.energy and hw.host.power metrics, instead of the generic hw.energy and hw.power described in the previous section, to prevent summing up overlapping values.
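
Because hw.host.energy is preferred over hw.host.power, an instrumentation that can only read instantaneous power may want to derive a cumulative energy value from it. The sketch below shows one way to do this, using the fact that 1 W equals 1 J/s and applying the trapezoidal rule between consecutive readings. The class name `EnergyAccumulator` and the choice of the trapezoidal rule are assumptions of this example, not part of the convention.

```python
from typing import Optional


class EnergyAccumulator:
    """Accumulates energy in joules from periodic power readings in Watts."""

    def __init__(self) -> None:
        self.total_joules = 0.0
        self._last_time: Optional[float] = None
        self._last_watts = 0.0

    def observe(self, timestamp_s: float, watts: float) -> int:
        """Record a power reading and return the cumulative energy as an integer."""
        if self._last_time is not None:
            elapsed = timestamp_s - self._last_time
            if elapsed <= 0:
                # Ignore duplicate or out-of-order samples so the total never decreases.
                return int(self.total_joules)
            # Trapezoidal rule: average power over the interval times its duration.
            self.total_joules += (self._last_watts + watts) / 2.0 * elapsed
        self._last_time = timestamp_s
        self._last_watts = watts
        return int(self.total_joules)


host = EnergyAccumulator()
first = host.observe(0.0, 200.0)    # no interval yet, so no energy is accumulated
second = host.observe(10.0, 220.0)  # (200 + 220) / 2 * 10 = 2100 J
third = host.observe(20.0, 180.0)   # 2100 + (220 + 180) / 2 * 10 = 4100 J
```

The returned total only ever grows, which matches the Counter instrument type and Int64 value type listed for hw.host.energy.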

hw.battery. - Battery metrics

Description: A battery in a computer system or a UPS.

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key(s) | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.battery.charge | Remaining fraction of battery charge | 1 | Gauge | Double | | |
| hw.battery.charge.limit | Lower limit of battery charge fraction to ensure proper operation | 1 | Gauge | Double | limit_type (Recommended) | critical, throttled, degraded |
| hw.battery.time_left | Time left before battery is completely charged or discharged | s | Gauge | Int | state (Conditionally Required, if the battery is charging or discharging) | charging, discharging |
| hw.status | Operational status: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed, charging, discharging |
| | | | | | hw.type | battery |

All hw.battery. metrics may include the below Recommended attributes to describe the characteristics of the monitored battery:

| Attribute Key | Description | Example |
| --- | --- | --- |
| chemistry | Chemistry of the battery | Nickel-Cadmium, Lithium-ion |
| capacity | Design capacity in Watt-hours or Ampere-hours | 9.3Ah |
| model | Descriptive model name | |
| vendor | Vendor name | |

hw.cpu. - Physical processor metrics

Description: Physical processor (as opposed to the logical processor seen by the operating system for multi-core systems). A physical processor may include many individual cores.

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.errors | Total number of errors encountered and corrected by the CPU | {error} | Counter | Int64 | hw.type (Required) | cpu |
| hw.cpu.speed | CPU current frequency | Hz | Gauge | Int64 | | |
| hw.cpu.speed.limit | CPU maximum frequency | Hz | Gauge | Int64 | limit_type (Recommended) | throttled, max, turbo |
| hw.status | Operational status: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed, predicted_failure |
| | | | | | hw.type (Required) | cpu |

Additional Recommended attributes:

| Attribute Key | Description | Example |
| --- | --- | --- |
| model | Descriptive model name | |
| vendor | Vendor name | |

hw.disk_controller. - Disk controller metrics

Description: Controller that controls the physical disks and organizes them in RAID sets and logical disks that are exposed to the operating system.

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.status | Operational status: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed |
| | | | | | hw.type (Required) | disk_controller |

Additional Recommended attributes:

| Attribute Key | Description | Example |
| --- | --- | --- |
| bios_version | BIOS version | |
| driver_version | Driver for the controller | |
| firmware_version | Firmware version | |
| model | Descriptive model name | |
| serial_number | Serial number | |
| vendor | Vendor name | |

hw.enclosure. - Enclosure metrics

Description: Computer chassis (can be an expansion enclosure)

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.status | Operational status: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed, open |
| | | | | | hw.type (Required) | enclosure |

Additional Recommended attributes:

| Attribute Key | Description | Example |
| --- | --- | --- |
| bios_version | BIOS version | |
| model | Descriptive model name | |
| serial_number | Serial number | |
| type | Type of the enclosure (useful for modular systems) | Computer, Storage, Switch |
| vendor | Vendor name | |

hw.fan. - Fan metrics

Description: Fan that keeps the air flowing to maintain the internal temperature of a computer

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.fan.speed | Fan speed in revolutions per minute | rpm | Gauge | Int | | |
| hw.fan.speed.limit | Speed limit in rpm | rpm | Gauge | Int | limit_type (Recommended) | low.critical, low.degraded, max |
| hw.fan.speed_ratio | Fan speed expressed as a fraction of its maximum speed | 1 | Gauge | Double | | |
| hw.status | Operational status: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed |
| | | | | | hw.type (Required) | fan |

Additional Recommended attributes:

| Attribute Key | Description | Example |
| --- | --- | --- |
| sensor_location | Location of the fan in the computer enclosure | cpu0, ps1, INLET |
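
As a small worked example of hw.fan.speed_ratio, the fraction can be derived from hw.fan.speed and the hw.fan.speed.limit value reported with limit_type set to max. The function name `fan_speed_ratio` and the sample readings are hypothetical; this is a sketch under those assumptions.

```python
def fan_speed_ratio(speed_rpm: int, max_speed_rpm: int) -> float:
    """Return the fan speed as a fraction of its maximum speed."""
    if max_speed_rpm <= 0:
        raise ValueError("maximum speed must be positive")
    return speed_rpm / max_speed_rpm


# Hypothetical readings: 4200 rpm measured, 6000 rpm maximum.
ratio = fan_speed_ratio(4200, 6000)
```

The result is a unitless fraction, which is why the metric uses the unit 1 and the value type Double.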

hw.gpu. - GPU metrics

Description: Graphics Processing Unit (discrete)

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.errors | Number of errors encountered by the GPU | {error} | Counter | Int64 | hw.error.type (Recommended) | corrected, uncorrected |
| | | | | | hw.type (Required) | gpu |
| hw.gpu.io | Received and transmitted bytes by the GPU | By | Counter | Int64 | direction (Required) | receive, transmit |
| hw.gpu.memory.limit | Size of the GPU memory | By | UpDownCounter | Int64 | | |
| hw.gpu.memory.utilization | Fraction of GPU memory used | 1 | Gauge | Double | | |
| hw.gpu.memory.usage | GPU memory used | By | UpDownCounter | Int64 | | |
| hw.gpu.power | GPU instantaneous power consumption in Watts | W | Gauge | Double | | |
| hw.gpu.utilization | Fraction of time spent in a specific task | 1 | Gauge | Double | task (Recommended) | decoder, encoder, general |
| hw.status | Operational status: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed, predicted_failure |
| | | | | | hw.type (Required) | gpu |

Additional Recommended attributes:

| Attribute Key | Description | Example |
| --- | --- | --- |
| driver_version | Driver for the controller | |
| firmware_version | Firmware version | |
| model | Descriptive model name | |
| serial_number | Serial number | |
| vendor | Vendor name | |

hw.logical_disk. - Logical disk metrics

Description: Storage extent presented as a physical disk by a disk controller to the operating system (e.g. a RAID 1 set made of 2 disks, and exposed as /dev/hdd0 by the controller).

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.errors | Number of errors encountered on this logical disk | {error} | Counter | Int64 | hw.type (Required) | logical_disk |
| hw.logical_disk.limit | Size of the logical disk | By | UpDownCounter | Int64 | | |
| hw.logical_disk.usage | Logical disk space usage | By | UpDownCounter | Int64 | state (Required) | used, free |
| hw.logical_disk.utilization | Logical disk space utilization as a fraction | 1 | Gauge | Double | state (Required) | used, free |
| hw.status | Operational status: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed |
| | | | | | hw.type (Required) | logical_disk |

Additional Recommended attributes:

| Attribute Key | Description | Example |
| --- | --- | --- |
| raid_level | RAID Level | RAID0+1 |
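
The relationship between hw.logical_disk.limit, hw.logical_disk.usage, and hw.logical_disk.utilization can be sketched as follows: usage is reported once per state (used, free), and utilization is the same value divided by the limit. The function name `logical_disk_points`, the tuple layout, and the id `ld_0` are assumptions of this example, which uses plain Python rather than the OpenTelemetry API.

```python
from typing import Dict, List, Tuple, Union

# (metric name, value, attributes): a simplified stand-in for a data point.
DataPoint = Tuple[str, Union[int, float], Dict[str, str]]


def logical_disk_points(
    attributes: Dict[str, str], used_bytes: int, limit_bytes: int
) -> List[DataPoint]:
    """Derive limit, usage, and utilization points for one logical disk."""
    if limit_bytes <= 0 or not 0 <= used_bytes <= limit_bytes:
        raise ValueError("expected 0 <= used_bytes <= limit_bytes and limit_bytes > 0")
    free_bytes = limit_bytes - used_bytes
    base = {**attributes, "hw.type": "logical_disk"}
    points: List[DataPoint] = [("hw.logical_disk.limit", limit_bytes, base)]
    for state, value in (("used", used_bytes), ("free", free_bytes)):
        state_attributes = {**base, "state": state}  # "state" is Required here
        points.append(("hw.logical_disk.usage", value, state_attributes))
        points.append(("hw.logical_disk.utilization", value / limit_bytes, state_attributes))
    return points


# Hypothetical logical disk: 250 bytes used out of 1000.
disk_points = logical_disk_points({"id": "ld_0"}, used_bytes=250, limit_bytes=1000)
```

With this layout, the used and free utilization values for a given disk always sum to 1.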

hw.memory. - Memory module metrics

Description: A memory module in a computer system.

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.errors | Number of errors encountered on this memory module | {error} | Counter | Int64 | hw.type (Required) | memory |
| hw.memory.size | Size of the memory module | By | UpDownCounter | Int64 | | |
| hw.status | Operational status: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed, predicted_failure |
| | | | | | hw.type (Required) | memory |

Additional Recommended attributes:

| Attribute Key | Description | Example |
| --- | --- | --- |
| model | Descriptive model name | |
| serial_number | Serial number | |
| type | Type of the memory module | DDR5 |
| vendor | Vendor name | |

hw.network. - Network adapter metrics

Description: A physical network interface, or a network interface controller (NIC), excluding software-based virtual adapters and loopbacks. For example, a physical network interface on a server, switch, router or firewall, an HBA, a fiber channel port or a Wi-Fi adapter.

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.errors | Number of errors encountered by the network adapter | {error} | Counter | Int64 | hw.error.type (Recommended) | zero_buffer_credit, crc, etc. |
| | | | | | hw.type (Required) | network |
| | | | | | direction (Recommended) | receive, transmit |
| hw.network.bandwidth.limit | Link speed | By/s | UpDownCounter | Int64 | | |
| hw.network.bandwidth.utilization | Utilization of the network bandwidth as a fraction | 1 | Gauge | Double | | |
| hw.network.io | Received and transmitted network traffic in bytes | By | Counter | Int64 | direction (Required) | receive, transmit |
| hw.network.packets | Received and transmitted network traffic in packets (or frames) | {packet} | Counter | Int64 | direction (Required) | receive, transmit |
| hw.network.up | Link status: 1 (up) or 0 (down) | | UpDownCounter | Int | | |
| hw.status | Operational status, regardless of the link status: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed |
| | | | | | hw.type (Required) | network |

Additional Recommended attributes:

| Attribute Key | Description | Example |
| --- | --- | --- |
| model | Descriptive model name | |
| logical_addresses | Logical addresses of the adapter (e.g. IP address, or WWPN) | 172.16.8.21, 57.11.193.42 |
| physical_address | Physical address of the adapter (e.g. MAC address, or WWNN) | 00-90-F5-E9-7B-36 |
| serial_number | Serial number | |
| vendor | Vendor name | |

hw.physical_disk. - Physical disk metrics

Description: Physical hard drive (HDD or SSD)

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.errors | Number of errors encountered on this disk | {error} | Counter | Int64 | hw.error.type (Recommended) | bad_sector, write, etc. |
| | | | | | hw.type (Required) | physical_disk |
| hw.physical_disk.endurance_utilization | Endurance remaining for this SSD disk | 1 | Gauge | Double | state (Required) | remaining |
| hw.physical_disk.size | Size of the disk | By | UpDownCounter | Int64 | | |
| hw.physical_disk.smart | Value of the corresponding S.M.A.R.T. attribute | 1 | Gauge | Int | smart_attribute (Recommended) | Seek Error Rate, Spin Retry Count, etc. |
| hw.status | Operational status: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed, predicted_failure |
| | | | | | hw.type (Required) | physical_disk |

Additional Recommended attributes:

| Attribute Key | Description | Example |
| --- | --- | --- |
| firmware_version | Firmware version | |
| model | Descriptive model name | |
| serial_number | Serial number | |
| type | Type of the disk | HDD, SSD, 10K |
| vendor | Vendor name | |

hw.power_supply. - Power supply metrics

Description: Power supply converting AC current to DC used by the motherboard and the GPUs

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.power_supply.limit | Maximum power output of the power supply | W | UpDownCounter | Int64 | limit_type (Recommended) | max, critical, throttled |
| hw.power_supply.utilization | Utilization of the power supply as a fraction of its maximum output | 1 | Gauge | Double | | |
| hw.status | Operational status: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed |
| | | | | | hw.type (Required) | power_supply |

Additional Recommended attributes:

| Attribute Key | Description | Example |
| --- | --- | --- |
| model | Descriptive model name | |
| serial_number | Serial number | |
| vendor | Vendor name | |

hw.tape_drive. - Tape drive metrics

Description: A tape drive in a computer or in a tape library (excluding virtual tape libraries)

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.errors | Number of errors encountered by the tape drive | {error} | Counter | Int64 | hw.error.type | read, write, mount, etc. |
| | | | | | hw.type (Required) | tape_drive |
| hw.tape_drive.operations | Operations performed by the tape drive | {operation} | Counter | Int64 | type (Recommended) | mount, unmount, clean |
| hw.status | Operational status: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed, needs_cleaning |
| | | | | | hw.type (Required) | tape_drive |

Additional Recommended attributes:

| Attribute Key | Description | Example |
| --- | --- | --- |
| model | Descriptive model name | |
| serial_number | Serial number | |
| vendor | Vendor name | |

hw.temperature. - Temperature sensor metrics

Description: A temperature sensor, either numeric or discrete

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.temperature | Temperature in degrees Celsius | Cel | Gauge | Double | | |
| hw.temperature.limit | Temperature limit in degrees Celsius | Cel | Gauge | Double | limit_type (Recommended) | low.critical, low.degraded, high.degraded, high.critical |
| hw.status | Whether the temperature is within normal range: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed |
| | | | | | hw.type (Required) | temperature |

Additional Recommended attributes:

| Attribute Key | Description | Example |
| --- | --- | --- |
| sensor_location | Location of the sensor | CPU0_DIE |
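
Temperature sensor readings can also feed the host-level hw.host.heating_margin metric, which is described as the number of degrees the host temperature can rise before a warning threshold is reached on one of the internal sensors. The sketch below takes the smallest gap between each sensor's reading and its warning threshold. Treating the hw.temperature.limit value with limit_type high.degraded as that warning threshold is an assumption of this example, as are the function name `heating_margin` and the sample readings.

```python
from typing import Iterable, Optional, Tuple


def heating_margin(sensors: Iterable[Tuple[float, float]]) -> Optional[float]:
    """Return the smallest (warning threshold - temperature) gap, in degrees Celsius.

    Each item is a (temperature, warning_threshold) pair for one sensor.
    Returns None when no sensor reports both values.
    """
    margins = [threshold - temperature for temperature, threshold in sensors]
    return min(margins) if margins else None


# Hypothetical sensors: (current temperature, assumed high.degraded limit).
margin = heating_margin([(45.0, 80.0), (62.0, 75.0), (30.0, 60.0)])
```

Here the second sensor is closest to its threshold, so it alone determines the margin of 13 degrees.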

hw.voltage. - Voltage sensor metrics

Description: A voltage sensor, either numeric or discrete

| Name | Description | Units | Instrument Type (*) | Value Type | Attribute Key | Attribute Values |
| --- | --- | --- | --- | --- | --- | --- |
| hw.voltage.limit | Voltage limit in Volts | V | Gauge | Double | limit_type (Recommended) | low.critical, low.degraded, high.degraded, high.critical |
| hw.voltage.nominal | Nominal (expected) voltage | V | Gauge | Double | | |
| hw.voltage | Voltage measured by the sensor | V | Gauge | Double | | |
| hw.status | Whether the voltage is within normal range: 1 (true) or 0 (false) for each of the possible states | | UpDownCounter | Int | state (Required) | ok, degraded, failed |
| | | | | | hw.type (Required) | voltage |

Additional Recommended attributes:

| Attribute Key | Description | Example |
| --- | --- | --- |
| sensor_location | Location of the sensor | PS0 V3_3 |