Troubleshooting
On this page, you can learn how to diagnose and resolve common OBI errors and issues.
Troubleshooting tools
OBI provides a variety of tools and configuration options to help diagnose and troubleshoot issues.
Detailed logging
You can increase the logging verbosity of OBI by setting the log_level
configuration or the OTEL_EBPF_LOG_LEVEL environment variable to debug. This
provides more detailed logs that may help in diagnosing issues.
To enable logging from the BPF programs, set the ebpf.bpf_debug configuration
or the OTEL_EBPF_BPF_DEBUG environment variable to true. Use this only for
debugging, as it can generate a significant number of logs.
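To enable both in a configuration file, a minimal fragment could look like this (note that bpf_debug is nested under the ebpf section, as named above):

```yaml
# Equivalent to OTEL_EBPF_LOG_LEVEL=debug.
log_level: debug

ebpf:
  # Log from the BPF programs themselves; very verbose, use for debugging only.
  # Equivalent to OTEL_EBPF_BPF_DEBUG=true.
  bpf_debug: true
```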
Configuration logging
By default, OBI merges its configuration from three sources, listed here from lowest to highest priority:
- Built-in default configuration
- Configuration file, provided using the --config flag or the OTEL_EBPF_CONFIG_PATH environment variable
- Environment variables, usually starting with OTEL_EBPF_
It is often helpful to view the final merged configuration. Using the
log_config configuration value (or OTEL_EBPF_LOG_CONFIG environment
variable), you can instruct OBI to log the final configuration at startup.
log_config supports the following values:
- yaml — logs the final configuration in YAML format; best for human readability since it matches the config file structure
- json — logs the final configuration in JSON format; best for log shippers since it is a single structured line
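For example, to print the merged configuration in YAML form at startup:

```yaml
# Equivalent to OTEL_EBPF_LOG_CONFIG=yaml.
log_config: yaml
```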
Internal metrics
You can configure and use OBI internal metrics to monitor performance and internal state.
To turn on internal metrics, configure internal_metrics.exporter with one of
the following values:
- none (default): disables internal metrics
- prometheus: exports internal metrics in Prometheus format via an HTTP server
- otlp: exports internal metrics via an OTLP exporter
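As a sketch, enabling the Prometheus exporter could look like the following. The port and path keys are assumptions modeled on similar exporters, not confirmed by this page; check the configuration reference for the exact names:

```yaml
internal_metrics:
  exporter: prometheus
  prometheus:
    # Illustrative values only; verify the key names in the configuration reference.
    port: 6060
    path: /internal/metrics
```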
Debug traces exporter
To debug the raw trace spans generated by OBI, you can set the
otel_traces_exporter.protocol configuration value or the
OTEL_EXPORTER_OTLP_TRACES_PROTOCOL environment variable to debug. This logs
the raw trace spans to the console in a human-readable format, matching the OTel
Collector debug exporter with verbosity: detailed. The spans logged to the
console look like this:
Traces {"resource spans": 1, "spans": 1}
ResourceSpans #0
Resource SchemaURL:
Resource attributes:
-> service.name: Str(flagd)
-> telemetry.sdk.language: Str(go)
-> telemetry.sdk.name: Str(opentelemetry-ebpf-instrumentation)
-> telemetry.sdk.version: Str(main)
-> host.name: Str(flagd-5cccb4c4f5-sfkcm)
-> os.type: Str(linux)
-> service.namespace: Str(opentelemetry-demo)
-> k8s.owner.name: Str(flagd)
-> k8s.kind: Str(Deployment)
-> k8s.replicaset.name: Str(flagd-5cccb4c4f5)
-> k8s.pod.name: Str(flagd-5cccb4c4f5-sfkcm)
-> k8s.container.name: Str(flagd)
-> k8s.deployment.name: Str(flagd)
-> service.version: Str(2.0.2)
-> k8s.namespace.name: Str(default)
-> otel.library.name: Str(go.opentelemetry.io/obi)
ScopeSpans #0
ScopeSpans SchemaURL:
InstrumentationScope
Span #0
Trace ID : 63a2723a58e0033170e58b1ff27ef03d
Parent ID :
ID : fab47609b60cc4e0
Name : /opentelemetry.proto.collector.metrics.v1.MetricsService/Export
Kind : Client
Start time : 2025-11-28 16:10:35.4241749 +0000 UTC
End time : 2025-11-28 16:10:35.42555658 +0000 UTC
Status code : Unset
Status message :
Attributes:
-> rpc.method: Str(/opentelemetry.proto.collector.metrics.v1.MetricsService/Export)
-> rpc.system: Str(grpc)
-> rpc.grpc.status_code: Int(0)
-> server.address: Str(otel-collector.default)
-> peer.service: Str(otel-collector.default)
-> server.port: Int(4317)
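The output above corresponds to a configuration fragment like:

```yaml
otel_traces_exporter:
  # Print raw spans to the console instead of exporting them.
  # Equivalent to OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=debug.
  protocol: debug
```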
Performance profiler (pprof)
OBI can expose a pprof port to allow performance profiling. To enable it, set
the profile_port configuration value or the OTEL_EBPF_PROFILE_PORT
environment variable to the desired port.
This is an advanced use case and typically not required.
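For example (the port number is arbitrary):

```yaml
# Expose a pprof HTTP server on this port.
# Equivalent to OTEL_EBPF_PROFILE_PORT=6061.
profile_port: 6061
```

Assuming the standard Go net/http/pprof endpoints are served on that port, you can then capture a profile with, for example, go tool pprof http://localhost:6061/debug/pprof/profile.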
Common OBI issues
This section covers how to resolve common OBI issues.
Node.js services crash or become unresponsive when OBI is running
To enable better context propagation in Node.js applications, OBI injects custom
code to track the current execution context. It does so using the Node.js
inspector protocol and sends the SIGUSR1 signal to the Node process to open
the inspector.
However, if the application defines its own SIGUSR1 signal handler, that
handler intercepts OBI's signal, which may cause the targeted application to
crash or become unresponsive. For example:
process.on('SIGUSR1', () => {
  process.exit(0);
});
Or by using Node.js flags that register their own signal handling, such as:
node --heapsnapshot-signal=SIGUSR1
Solutions:
- Use the discovery configuration to exclude specific Node.js applications from OBI tracking, preventing OBI from sending SIGUSR1.
- Disable Node.js context propagation entirely by setting nodejs.enabled: false in the configuration file, or the environment variable OTEL_EBPF_NODEJS_ENABLED=false.
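The following sketch shows both solutions in a configuration file. The exclude_services and exe_path keys under discovery are assumptions modeled on similar discovery configurations (check the configuration reference for the exact names), while nodejs.enabled is named above:

```yaml
# Option 1 (key names are illustrative): exclude a Node.js binary from discovery.
discovery:
  exclude_services:
    - exe_path: "*/node"

# Option 2: disable Node.js context propagation entirely.
# Equivalent to OTEL_EBPF_NODEJS_ENABLED=false.
nodejs:
  enabled: false
```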
ClickHouse instances crash when OBI is running
If you're running ClickHouse on the same node as OBI, you might see ClickHouse crash with logs such as:
Application: Code: 246. DB::Exception: Calculated checksum of the executable (...) does not correspond to the reference checksum ...
The issue is likely caused by OBI attaching eBPF uprobes to the ClickHouse binary. A relevant GitHub issue explains this behavior:
When attaching a uprobe, the kernel will modify the target process memory to insert a trap instruction at the attachment address. This causes the ClickHouse binary checksum validation to fail during startup.
Solution:
Start ClickHouse with the skip_binary_checksum_checks flag enabled.
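If ClickHouse reads its server settings from a YAML configuration file, the setting might be applied as follows. Where exactly this flag belongs is an assumption, so consult the ClickHouse documentation for your deployment:

```yaml
# Disable the executable checksum validation that eBPF uprobes invalidate.
skip_binary_checksum_checks: 1
```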
Missing telemetry data for Go applications or TLS requests
If you are missing telemetry coming from Go applications or TLS requests (like
HTTPS communication), it might be due to insufficient privileges for attaching
uprobes. Due to recent kernel security changes, which were backported to
many older kernel versions, uprobes now require the CAP_SYS_ADMIN capability. OBI
uses uprobes to instrument Go applications and TLS requests, along with
other runtime- and language-specific instrumentations. If your OBI deployment
security configuration doesn't run OBI as privileged (for example,
privileged: true in Docker or Kubernetes) and doesn't provide
CAP_SYS_ADMIN as a security capability, you might not see some or all of your
telemetry.
To troubleshoot this issue, enable detailed OBI logging with
OTEL_EBPF_LOG_LEVEL=debug. If you see all the uprobe injections failing with
the error “setting uprobe (offset)…” then you are likely experiencing this
issue.
Solutions:
- Run OBI as privileged.
- Add CAP_SYS_ADMIN to the list of capabilities in your deployment security configuration.
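In a Kubernetes deployment, for example, the capability can be granted through the container securityContext (Kubernetes capability names drop the CAP_ prefix):

```yaml
securityContext:
  capabilities:
    add:
      - SYS_ADMIN
```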