Semantic conventions for database client spans
Status: Stable, unless otherwise specified.
Warning
Existing database instrumentations that are using v1.24.0 of this document (or prior):
- SHOULD NOT change the version of the database conventions that they emit by default in their existing major version. Conventions include (but are not limited to) attributes, metric and span names, and unit of measure.
- SHOULD introduce an environment variable OTEL_SEMCONV_STABILITY_OPT_IN in their existing major version as a comma-separated list of category-specific values (e.g., http, databases, messaging). The list of values includes:
  - database: emit the stable database conventions, and stop emitting the experimental database conventions that the instrumentation emitted previously.
  - database/dup: emit both the experimental and stable database conventions, allowing for a phased rollout of the stable semantic conventions.
  - The default behavior (in the absence of one of these values) is to continue emitting whatever version of the old experimental database conventions the instrumentation was emitting previously.
  - Note: database/dup has higher precedence than database in case both values are present.
- SHOULD maintain (security patching at a minimum) their existing major version for at least six months after it starts emitting both sets of conventions.
- MAY drop the environment variable in their next major version and emit only the stable database conventions.
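As an illustration of the opt-in rules above, the following is a minimal sketch of how an instrumentation might interpret OTEL_SEMCONV_STABILITY_OPT_IN. The function name database_semconv_mode and the return labels "dup", "stable", and "legacy" are hypothetical and not part of this specification.

```python
import os


def database_semconv_mode(environ=None):
    """Decide which database conventions to emit.

    Returns one of three hypothetical labels:
    "dup" (emit both), "stable" (stable only), "legacy" (previous behavior).
    """
    env = os.environ if environ is None else environ
    raw = env.get("OTEL_SEMCONV_STABILITY_OPT_IN", "")
    # The variable is a comma-separated list of category-specific values.
    values = {item.strip() for item in raw.split(",") if item.strip()}
    # database/dup has higher precedence than database when both are present.
    if "database/dup" in values:
        return "dup"
    if "database" in values:
        return "stable"
    # Default: keep emitting the previously emitted experimental conventions.
    return "legacy"
```

Passing a plain dictionary instead of the process environment keeps the sketch easy to exercise in isolation.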
Name
Database spans MUST follow the overall guidelines for span names.
The span name SHOULD be {db.query.summary} if a summary is available.
If no summary is available, the span name SHOULD be {db.operation.name} {target} provided that a (low-cardinality) db.operation.name is available (see below for the exact definition of the {target} placeholder).
If a (low-cardinality) db.operation.name is not available, database span names SHOULD default to the {target}.
If neither {db.operation.name} nor {target} is available, the span name SHOULD be {db.system.name}.
Semantic conventions for individual database systems MAY specify a different span name format.
The {target} SHOULD describe the entity that the operation is performed against and SHOULD adhere to one of the following values, provided they are accessible:
- db.collection.name SHOULD be used for operations on a specific database collection.
- db.stored_procedure.name SHOULD be used for operations on a specific stored procedure.
- db.namespace SHOULD be used for operations on a specific database namespace.
- server.address:server.port SHOULD be used for other operations not targeting any specific collection(s), stored procedure(s), or namespace(s).
If a corresponding {target} value is not available for a specific operation, the instrumentation SHOULD omit the {target}.
For example, for an operation describing a SQL query on an anonymous table like SELECT * FROM (SELECT * FROM table) t, the span name should be SELECT.
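The fallback order for span names described in this section can be sketched as follows. This is an illustrative helper, not a normative algorithm: the function name db_span_name and its parameters are hypothetical, and the caller is assumed to have already resolved {target} according to the rules above.

```python
def db_span_name(system_name, summary=None, operation_name=None, target=None):
    """Pick a span name following the fallback order of the Name section."""
    # 1. Prefer the query summary when one is available.
    if summary:
        return summary
    # 2. Otherwise use "{db.operation.name} {target}", omitting the target
    #    when no corresponding value is available for the operation.
    if operation_name:
        return f"{operation_name} {target}" if target else operation_name
    # 3. Without an operation name, default to the target alone.
    if target:
        return target
    # 4. Last resort: the database system name.
    return system_name
```

For the anonymous-table example above, db_span_name("postgresql", operation_name="SELECT") yields SELECT because no target is available.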
Span definition
This span describes a database client call.
Instrumentations SHOULD, when possible, record database spans that cover the duration of the corresponding API call as if it was observed by the caller (such as client application). For example, if a transient issue happened and was retried within this database call, the corresponding span should cover the duration of the logical operation with all retries.
When a database client provides higher-level convenience APIs for specific operations
(e.g., calling a stored procedure), which internally generate and execute a generic query,
it is RECOMMENDED to instrument the higher-level convenience APIs.
These often allow setting db.operation.* attributes, which usually are not readily available at the generic query level.
Span name is covered in the Name section.
Span kind SHOULD be CLIENT. It MAY be set to INTERNAL on spans representing in-memory database calls.
It’s RECOMMENDED to use CLIENT kind when the database system being instrumented usually runs in a different process than its client or when database calls happen over an instrumented protocol such as HTTP.
Span status: refer to the Recording Errors document for details on how to record span status. Semantic conventions for individual systems SHOULD specify which values of db.response.status_code classify as errors.
Attribute | Type | Description | Examples | Requirement Level | Stability |
---|---|---|---|---|---|
db.system.name | string | The database management system (DBMS) product as identified by the client instrumentation. [1] | other_sql ; softwareag.adabas ; actian.ingres | Required | |
db.collection.name | string | The name of a collection (table, container) within the database. [2] | public.users ; customers | Conditionally Required [3] | |
db.namespace | string | The name of the database, fully qualified within the server address and port. [4] | customers ; test.users | Conditionally Required If available. | |
db.operation.name | string | The name of the operation or command being executed. [5] | findAndModify ; HMSET ; SELECT | Conditionally Required [6] | |
db.response.status_code | string | Database response status code. [7] | 102 ; ORA-17002 ; 08P01 ; 404 | Conditionally Required [8] | |
error.type | string | Describes a class of error the operation ended with. [9] | timeout ; java.net.UnknownHostException ; server_certificate_invalid ; 500 | Conditionally Required If and only if the operation failed. | |
server.port | int | Server port number. [10] | 80 ; 8080 ; 443 | Conditionally Required [11] | |
db.operation.batch.size | int | The number of queries included in a batch operation. [12] | 2 ; 3 ; 4 | Recommended | |
db.query.summary | string | Low cardinality summary of a database query. [13] | SELECT wuser_table ; INSERT shipping_details SELECT orders ; get user by id | Recommended [14] | |
db.query.text | string | The database query being executed. [15] | SELECT * FROM wuser_table where username = ? ; SET mykey ? | Recommended [16] | |
db.stored_procedure.name | string | The name of a stored procedure within the database. [17] | GetCustomer | Recommended [18] | |
network.peer.address | string | Peer address of the database node where the operation was performed. [19] | 10.1.2.80 ; /tmp/my.sock | Recommended If applicable for this database system. | |
network.peer.port | int | Peer port number of the network connection. | 65123 | Recommended if and only if network.peer.address is set. | |
server.address | string | Name of the database host. [20] | example.com ; 10.1.2.80 ; /tmp/my.sock | Recommended | |
db.query.parameter.<key> | string | A database query parameter, with <key> being the parameter name, and the attribute value being a string representation of the parameter value. [21] | someval ; 55 | Opt-In | |
db.response.returned_rows | int | Number of rows returned by the operation. | 10 ; 30 ; 1000 | Opt-In | |
[1] db.system.name: The actual DBMS may differ from the one identified by the client. For example, when using PostgreSQL client libraries to connect to a CockroachDB, the db.system.name is set to postgresql based on the instrumentation’s best knowledge.
[2] db.collection.name: It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization.
The collection name SHOULD NOT be extracted from db.query.text, when the database system supports query text with multiple collections in non-batch operations.
For batch operations, if the individual operations are known to have the same collection name then that collection name SHOULD be used.
[3] db.collection.name: If readily available and if a database call is performed on a single collection.
[4] db.namespace: If a database system has multiple namespace components, they SHOULD be concatenated from the most general to the most specific namespace component, using | as a separator between the components. Any missing components (and their associated separators) SHOULD be omitted.
Semantic conventions for individual database systems SHOULD document what db.namespace means in the context of that system.
It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization.
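The concatenation rule for db.namespace can be illustrated with a short sketch. The helper name db_namespace is hypothetical; components are assumed to be passed from most general to most specific, with None or an empty string standing for a missing component.

```python
def db_namespace(*components):
    """Join namespace components with "|" from most general to most specific."""
    # Missing components and their separators are omitted; values are kept
    # exactly as provided, without any case normalization.
    present = [component for component in components if component]
    return "|".join(present) if present else None
```

For example, db_namespace("instance", "mydb") gives instance|mydb, while db_namespace(None, "mydb") gives mydb.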
[5] db.operation.name: It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization.
The operation name SHOULD NOT be extracted from db.query.text, when the database system supports query text with multiple operations in non-batch operations.
If spaces can occur in the operation name, multiple consecutive spaces SHOULD be normalized to a single space.
For batch operations, if the individual operations are known to have the same operation name then that operation name SHOULD be used prepended by BATCH, otherwise db.operation.name SHOULD be BATCH or some other database system specific term if more applicable.
[6] db.operation.name: If readily available and if there is a single operation name that describes the database call.
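The batch naming rule for db.operation.name can be sketched as below. Both function names are hypothetical, a None entry is assumed to mean "operation name not known", and a single space is assumed as the separator between BATCH and the shared operation name.

```python
import re


def normalize_operation_name(name):
    """Collapse runs of consecutive spaces into a single space."""
    return re.sub(r" {2,}", " ", name)


def batch_operation_name(operation_names):
    """Derive db.operation.name for a batch from its individual operations."""
    names = list(operation_names)
    # Only use a shared name when every individual operation name is known.
    if names and all(name is not None for name in names):
        distinct = {normalize_operation_name(name) for name in names}
        if len(distinct) == 1:
            # Assumption: one space separates BATCH from the operation name.
            return "BATCH " + distinct.pop()
    # Mixed or unknown operations fall back to the generic term.
    return "BATCH"
```

The sketch deliberately performs no case normalization, so insert and INSERT are treated as different operation names.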
[7] db.response.status_code: The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes.
Semantic conventions for individual database systems SHOULD document what db.response.status_code means in the context of that system.
[8] db.response.status_code: If the operation failed and status code is available.
[9] error.type: The error.type SHOULD match the db.response.status_code returned by the database or the client library, or the canonical name of the exception that occurred.
When using the canonical exception type name, instrumentation SHOULD make a best effort to report the most relevant type. For example, if the original exception is wrapped into a generic one, the original exception SHOULD be preferred.
Instrumentations SHOULD document how error.type is populated.
[10] server.port: When observed from the client side, and when communicating through an intermediary, server.port SHOULD represent the server port behind any intermediaries, for example proxies, if it’s available.
[11] server.port: If using a port other than the default port for this DBMS and if server.address is set.
[12] db.operation.batch.size: Operations are only considered batches when they contain two or more operations, and so db.operation.batch.size SHOULD never be 1.
[13] db.query.summary: The query summary describes a class of database queries and is useful as a grouping key, especially when analyzing telemetry for database calls involving complex queries.
Summary may be available to the instrumentation through instrumentation hooks or other means. If it is not available, instrumentations that support query parsing SHOULD generate a summary following the Generating a summary of the query section.
[14] db.query.summary: If available through instrumentation hooks or if the instrumentation supports generating a query summary.
[15] db.query.text: For sanitization see Sanitization of db.query.text.
For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator ; or some other database system specific separator if more applicable.
Parameterized query text SHOULD NOT be sanitized. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk.
[16] db.query.text: Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text. See Sanitization of db.query.text.
Parameterized query text SHOULD be collected by default (the query parameter values themselves are opt-in, see db.query.parameter.<key>).
[17] db.stored_procedure.name: It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization.
For batch operations, if the individual operations are known to have the same stored procedure name then that stored procedure name SHOULD be used.
[18] db.stored_procedure.name: If operation applies to a specific stored procedure.
[19] network.peer.address: Semantic conventions for individual database systems SHOULD document whether network.peer.* attributes are applicable. Network peer address and port are useful when the application interacts with individual database nodes directly.
If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used.
[20] server.address: When observed from the client side, and when communicating through an intermediary, server.address SHOULD represent the server address behind any intermediaries, for example proxies, if it’s available.
[21] db.query.parameter.<key>: If a query parameter has no name and instead is referenced only by index, then <key> SHOULD be the 0-based index.
db.query.parameter.<key> SHOULD match up with the parameterized placeholders present in db.query.text.
db.query.parameter.<key> SHOULD NOT be captured on batch operations.
Examples:
- For a query SELECT * FROM users where username = %s with the parameter "jdoe", the attribute db.query.parameter.0 SHOULD be set to "jdoe".
- For a query SELECT * FROM users WHERE username = %(username)s; with parameter username = "jdoe", the attribute db.query.parameter.username SHOULD be set to "jdoe".
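The two examples above can be expressed as a small sketch that builds the opt-in attributes from either positional or named parameters. The helper name query_parameter_attributes is hypothetical, and str() is assumed to be an acceptable string representation of the parameter value.

```python
from collections.abc import Mapping


def query_parameter_attributes(params):
    """Build db.query.parameter.<key> attributes from query parameters."""
    if isinstance(params, Mapping):
        # Named parameters: <key> is the parameter name.
        items = params.items()
    else:
        # Positional parameters: <key> is the 0-based index.
        items = enumerate(params)
    # The attribute value is a string representation of the parameter value.
    return {f"db.query.parameter.{key}": str(value) for key, value in items}
```

An instrumentation would skip this step entirely for batch operations, where these attributes SHOULD NOT be captured.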
The following attributes can be important for making sampling decisions and SHOULD be provided at span creation time (if provided at all):
- db.collection.name
- db.namespace
- db.operation.name
- db.query.summary
- db.query.text
- db.system.name
- server.address
- server.port
db.system.name has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
Value | Description | Stability |
---|---|---|
actian.ingres | Actian Ingres | |
aws.dynamodb | Amazon DynamoDB | |
aws.redshift | Amazon Redshift | |
azure.cosmosdb | Azure Cosmos DB | |
cassandra | Apache Cassandra | |
clickhouse | ClickHouse | |
cockroachdb | CockroachDB | |
couchbase | Couchbase | |
couchdb | Apache CouchDB | |
derby | Apache Derby | |
elasticsearch | Elasticsearch | |
firebirdsql | Firebird | |
gcp.spanner | Google Cloud Spanner | |
geode | Apache Geode | |
h2database | H2 Database | |
hbase | Apache HBase | |
hive | Apache Hive | |
hsqldb | HyperSQL Database | |
ibm.db2 | IBM Db2 | |
ibm.informix | IBM Informix | |
ibm.netezza | IBM Netezza | |
influxdb | InfluxDB | |
instantdb | Instant | |
intersystems.cache | InterSystems Caché | |
mariadb | MariaDB | |
memcached | Memcached | |
microsoft.sql_server | Microsoft SQL Server | |
mongodb | MongoDB | |
mysql | MySQL | |
neo4j | Neo4j | |
opensearch | OpenSearch | |
oracle.db | Oracle Database | |
other_sql | Some other SQL database. Fallback only. | |
postgresql | PostgreSQL | |
redis | Redis | |
sap.hana | SAP HANA | |
sap.maxdb | SAP MaxDB | |
softwareag.adabas | Adabas (Adaptable Database System) | |
sqlite | SQLite | |
teradata | Teradata | |
trino | Trino | |
error.type has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
Value | Description | Stability |
---|---|---|
_OTHER | A fallback error value to be used when the instrumentation doesn’t define a custom value. | |
Notes and well-known identifiers for db.system.name
The list above is a non-exhaustive list of well-known identifiers to be specified for db.system.name.
If a value defined in this list applies to the DBMS to which the request is sent, this value MUST be used. If no value defined in this list is suitable, a custom value MUST be provided. This custom value MUST be the name of the DBMS in lowercase and without a version number to stay consistent with existing identifiers.
It is encouraged to open a PR towards this specification to add missing values to the list, especially when instrumentations for those missing databases are written. This allows multiple instrumentations for the same database to be aligned and eases analyzing for backends.
The value other_sql is intended as a fallback and MUST only be used if the DBMS is known to be SQL-compliant but the concrete product is not known to the instrumentation.
If the concrete DBMS is known to the instrumentation, its specific identifier MUST be used.
Back ends could, for example, use the provided identifier to determine the appropriate SQL dialect for parsing the db.query.text.
When additional attributes are added that only apply to a specific DBMS, its identifier SHOULD be used as a namespace in the attribute key as for the attributes in the sections below.
Sanitization of db.query.text
The db.query.text SHOULD be collected by default only if there is sanitization that excludes sensitive information.
Sanitization SHOULD replace all literals with a placeholder value.
Such literals include, but are not limited to, String, Numeric, Date and Time, Boolean, Interval, Binary, and Hexadecimal literals.
The placeholder value SHOULD be ?, unless it already has a defined meaning in the given database system, in which case the instrumentation MAY choose a different placeholder.
Parameterized query text SHOULD NOT be sanitized. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk.
IN-clauses MAY be collapsed during sanitization, e.g. from IN (?, ?, ?, ?) to IN (?), as this can help with extremely long IN-clauses, and can help control cardinality for users who choose to (optionally) add db.query.text to their metric attributes.
When performing sanitization, instrumentation MAY truncate the sanitized value for performance considerations (since sanitizing has a performance cost).
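The following is a deliberately naive, regex-based sketch of the sanitization described in this section. It is not a SQL parser and covers only single-quoted strings, hexadecimal, numeric, and boolean literals; the function name sanitize_query_text and the collapse_in flag are hypothetical. It is meant for non-parameterized query text only, since parameterized query text SHOULD NOT be sanitized (and numbered placeholders such as $1 would be altered by this pattern).

```python
import re

# Literal forms recognized by this sketch (an illustrative subset only).
_LITERAL = re.compile(
    r"""
    '(?:[^']|'')*'                        # single-quoted string, '' escapes a quote
    | \b0[xX][0-9a-fA-F]+\b               # hexadecimal literal
    | \b\d+(?:\.\d+)?(?:[eE][+-]?\d+)?\b  # integer, decimal, or exponent literal
    | \b(?:TRUE|FALSE)\b                  # boolean literal
    """,
    re.VERBOSE | re.IGNORECASE,
)

# An IN-clause whose elements have already been replaced by placeholders.
_IN_LIST = re.compile(r"\bIN\s*\(\s*\?(?:\s*,\s*\?)*\s*\)", re.IGNORECASE)


def sanitize_query_text(query, collapse_in=True):
    """Replace literals with ? and optionally collapse IN (?, ?, ...) to IN (?)."""
    sanitized = _LITERAL.sub("?", query)
    if collapse_in:
        sanitized = _IN_LIST.sub("IN (?)", sanitized)
    return sanitized
```

Digits that are part of an identifier, such as t1, are left alone because the numeric pattern requires a word boundary before the first digit. A production implementation would more likely rely on a dialect-aware tokenizer, since comments, quoted identifiers, and date or interval literals are not handled here.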
Generating a summary of the query
The db.query.summary attribute can be used to capture a shortened representation of the query. It SHOULD have low cardinality and SHOULD NOT contain any dynamic or sensitive data.
[!NOTE]
The db.query.text attribute is intended to identify individual queries. Even though it is sanitized if captured by default, it could still have high cardinality and might reach hundreds of lines.
The db.query.summary is intended to provide a less granular grouping key that can be used as a span name or a metric attribute in common cases. It SHOULD only contain information that has a significant impact on the query, database, or application performance.
Instrumentation SHOULD set the query summary if it is readily available through instrumentation hooks or other sources.
Otherwise:
- When instrumenting higher-level APIs that build queries internally - for example, those that create a table or execute a stored procedure - instrumentations SHOULD generate a db.query.summary from available operation(s) and target(s) using the format described in this section.
- When instrumenting APIs that operate at the query level, instrumentations that support query parsing SHOULD generate a query summary based on the db.query.text.
The summary SHOULD preserve the following parts of the query in the order they were provided:
- operations such as SQL SELECT, INSERT, UPDATE, DELETE, and other commands
- operation targets such as collections, stored procedures, database names, etc.
Instrumentations that support query parsing SHOULD parse the query and extract a list of operations and targets from the query. They SHOULD set the db.query.summary attribute to the value formatted in the following way:
{operation1} {target1} {operation2} {target2} {target3} ...
Instrumentations SHOULD capture the values of operations and targets as provided by the application without attempting to do any case normalization. If the operation and target value is populated on db.operation.name, db.collection.name, or other attributes, it SHOULD match the value used in the db.query.summary.
Instrumentations that parse the query to set db.query.summary SHOULD truncate the summary to 255 characters (ensuring truncation does not occur within an operation name or target).
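The formatting and truncation rules can be sketched as follows, assuming the query has already been parsed into an ordered list of operations and targets. The function name format_query_summary and its parameters are hypothetical, and the parsing step itself is out of scope here.

```python
def format_query_summary(parts, max_length=255):
    """Join operations and targets with spaces, truncating on part boundaries.

    parts: operations and targets in the order they appear in the query,
    for example ["SELECT", "songs", "artists"].
    """
    summary = ""
    for part in parts:
        candidate = part if not summary else f"{summary} {part}"
        # Stop before adding a part that would exceed the limit, so that
        # truncation never occurs within an operation name or target.
        if len(candidate) > max_length:
            break
        summary = candidate
    return summary
```

One consequence of this design is that a single part longer than the limit produces an empty summary; an instrumentation may prefer a different fallback in that case.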
Examples:
- Query that consists of a single operation:
  SELECT * FROM wuser_table WHERE username = ?
  the corresponding db.query.summary is SELECT wuser_table.
- Query that performs multiple operations:
  INSERT INTO shipping_details (order_id, address) SELECT order_id, address FROM orders WHERE order_id = ?
  the corresponding db.query.summary is INSERT shipping_details SELECT orders.
- Query that performs an operation that’s applied to multiple collections:
  SELECT * FROM songs, artists WHERE songs.artist_id == artists.id
  the corresponding db.query.summary is SELECT songs artists.
- Query that performs an operation on an anonymous table:
  SELECT order_date FROM (SELECT * FROM orders o JOIN customers c ON o.customer_id = c.customer_id)
  the corresponding db.query.summary is SELECT SELECT orders customers.
- Query that performs an operation on multiple collections with double-quotes or other punctuation:
  SELECT * FROM "song list", 'artists'
  the corresponding db.query.summary is SELECT "song list" 'artists'.
- Stored procedure is executed using a convenience API such as one available in JDBC:
  connection.prepareCall("{call some_stored_procedure}");
  the corresponding db.query.summary is call some_stored_procedure, db.query.text is not populated. Note that CALL is the SQL standard keyword to invoke a stored procedure.
- Stored procedure is executed using Microsoft SQL Server driver’s convenience API Microsoft.Data.SqlClient:
  var command = new SqlCommand();
  command.CommandType = CommandType.StoredProcedure;
  command.CommandText = "some_stored_procedure";
  the corresponding db.query.summary is EXECUTE some_stored_procedure, db.query.text is not populated. Note that Microsoft SQL Server does not support the SQL Standard CALL keyword, but uses instead EXECUTE to invoke a stored procedure.
Semantic conventions for individual database systems or specialized instrumentations MAY specify a different db.query.summary format as long as the produced summary remains relatively short and its cardinality remains low compared to the db.query.text.
Semantic conventions for specific database technologies
More specific Semantic Conventions are defined for the following database technologies:
- AWS DynamoDB: Semantic Conventions for AWS DynamoDB.
- Cassandra: Semantic Conventions for Cassandra.
- Azure Cosmos DB: Semantic Conventions for Azure Cosmos DB.
- CouchDB: Semantic Conventions for CouchDB.
- Elasticsearch: Semantic Conventions for Elasticsearch.
- HBase: Semantic Conventions for HBase.
- MongoDB: Semantic Conventions for MongoDB.
- Microsoft SQL Server: Semantic Conventions for Microsoft SQL Server.
- Redis: Semantic Conventions for Redis.
- SQL: Semantic Conventions for SQL databases.