# Troubleshooting

> Troubleshooting OBI common issues and errors

On this page, you can learn how to diagnose and resolve common OBI errors and
issues.

## Troubleshooting tools

OBI provides a variety of tools and configuration options to help diagnose and
troubleshoot issues.

### Detailed logging

You can increase the logging verbosity of OBI by setting the `log_level`
configuration or the `OTEL_EBPF_LOG_LEVEL` environment variable to `debug`. This
provides more detailed logs that may help in diagnosing issues.

To enable logging from the BPF programs, set the `ebpf.bpf_debug` configuration
or the `OTEL_EBPF_BPF_DEBUG` environment variable to `true`. **Use this only for
debugging**, as it can generate a significant number of logs.
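Both options can be set together in the configuration file. A minimal sketch,
using the configuration keys named above:

```yaml
# Verbose userspace logs plus BPF program logs (debugging only)
log_level: debug
ebpf:
  bpf_debug: true
```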

### Configuration logging

By default, OBI merges its configuration from three sources, listed from lowest
to highest priority:

- Built-in default configuration
- Configuration file, provided using the `--config` flag or
  `OTEL_EBPF_CONFIG_PATH`
- Environment variables, usually starting with `OTEL_EBPF_`

It is often helpful to view the final merged configuration. Using the
`log_config` configuration value (or `OTEL_EBPF_LOG_CONFIG` environment
variable), you can instruct OBI to log the final configuration at startup.

`log_config` supports the following values:

- `yaml` — logs the final configuration in YAML format; best for human
  readability since it matches the config file structure
- `json` — logs the final configuration in JSON format; best for log shippers
  since it is a single structured line
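For example, to log the merged configuration as a single structured JSON line
at startup:

```yaml
log_config: json
```

Equivalently, set `OTEL_EBPF_LOG_CONFIG=json` in the environment.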

### Internal metrics

You can configure and use [OBI internal metrics](../metrics/#internal-metrics)
to monitor performance and internal state.

To turn on internal metrics, configure `internal_metrics.exporter` with one of
the following values:

- `none` (default): disables internal metrics
- `prometheus`: exports internal metrics in Prometheus format via an HTTP server
- `otlp`: exports internal metrics via an OTLP exporter
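For example, to expose internal metrics in Prometheus format:

```yaml
internal_metrics:
  exporter: prometheus
```

Further options under `internal_metrics`, such as the listen port, depend on
your OBI version; check the internal metrics documentation linked above.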

### Debug traces exporter

To debug the raw trace spans generated by OBI, you can set the
`otel_traces_exporter.protocol` configuration value or the
`OTEL_EXPORTER_OTLP_TRACES_PROTOCOL` environment variable to `debug`. This logs
the raw trace spans to the console in a human-readable format, matching the OTel
Collector debug exporter with `verbosity: detailed`. Example console output
looks like this:

```text
Traces	{"resource spans": 1, "spans": 1}
ResourceSpans #0
Resource SchemaURL:
Resource attributes:
     -> service.name: Str(flagd)
     -> telemetry.sdk.language: Str(go)
     -> telemetry.sdk.name: Str(opentelemetry)
     -> telemetry.distro.name: Str(opentelemetry-ebpf-instrumentation)
     -> telemetry.sdk.version: Str(main)
     -> host.name: Str(flagd-5cccb4c4f5-sfkcm)
     -> os.type: Str(linux)
     -> service.namespace: Str(opentelemetry-demo)
     -> k8s.owner.name: Str(flagd)
     -> k8s.kind: Str(Deployment)
     -> k8s.replicaset.name: Str(flagd-5cccb4c4f5)
     -> k8s.pod.name: Str(flagd-5cccb4c4f5-sfkcm)
     -> k8s.container.name: Str(flagd)
     -> k8s.deployment.name: Str(flagd)
     -> service.version: Str(2.0.2)
     -> k8s.namespace.name: Str(default)
     -> otel.library.name: Str(go.opentelemetry.io/obi)
ScopeSpans #0
ScopeSpans SchemaURL:
InstrumentationScope
Span #0
    Trace ID       : 63a2723a58e0033170e58b1ff27ef03d
    Parent ID      :
    ID             : fab47609b60cc4e0
    Name           : /opentelemetry.proto.collector.metrics.v1.MetricsService/Export
    Kind           : Client
    Start time     : 2025-11-28 16:10:35.4241749 +0000 UTC
    End time       : 2025-11-28 16:10:35.42555658 +0000 UTC
    Status code    : Unset
    Status message :
Attributes:
     -> rpc.method: Str(/opentelemetry.proto.collector.metrics.v1.MetricsService/Export)
     -> rpc.system: Str(grpc)
     -> rpc.grpc.status_code: Int(0)
     -> server.address: Str(otel-collector.default)
     -> peer.service: Str(otel-collector.default)
     -> server.port: Int(4317)
```
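A sketch of enabling the debug exporter via the configuration file, assuming
the nested YAML layout follows the dotted path `otel_traces_exporter.protocol`
named above:

```yaml
otel_traces_exporter:
  protocol: debug
```

Equivalently, set `OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=debug` in the
environment.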

Starting with OBI v0.6.0, `telemetry.sdk.name` reflects the underlying SDK when
available, and OBI identifies itself using `telemetry.distro.name`.

### Performance profiler (pprof)

OBI can expose a `pprof` port to allow performance profiling. To enable it, set
the `profile_port` configuration value or the `OTEL_EBPF_PROFILE_PORT`
environment variable to the desired port.

This is an advanced use case and typically not required.
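A sketch, assuming port 6060:

```yaml
profile_port: 6060
```

With OBI running, you can then collect a profile with standard Go tooling, for
example `go tool pprof http://localhost:6060/debug/pprof/profile`, assuming the
port serves Go's standard `net/http/pprof` endpoints.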

## Common OBI issues

This section covers how to resolve common OBI issues.

### Node.js services crash or become unresponsive when OBI is running

To enable better context propagation in Node.js applications, OBI injects custom
code to track the current execution context. It does so using the Node.js
inspector protocol and sends the `SIGUSR1` signal to the Node process to open
the inspector.

However, if the application defines its own `SIGUSR1` signal handler, OBI's
signal triggers that custom handler instead, which may crash or hang the
targeted application. For example:

```javascript
// Application-defined handler: exits instead of letting OBI open the inspector
process.on('SIGUSR1', () => {
  process.exit(0);
});
```

The same issue can occur with Node.js flags that register their own `SIGUSR1`
handling, such as:

```sh
node --heapsnapshot-signal=SIGUSR1
```

**Solutions:**

- Use the `discovery` configuration to exclude specific Node.js applications
  from OBI tracking, preventing OBI from sending `SIGUSR1`.
- Disable Node.js context propagation entirely by setting
  `nodejs.enabled: false` in the configuration file, or by setting the
  environment variable `OTEL_EBPF_NODEJS_ENABLED=false`.
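The second option, as a configuration file fragment:

```yaml
nodejs:
  enabled: false
```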

### ClickHouse instances crash when OBI is running

If you're running [ClickHouse](https://github.com/ClickHouse/ClickHouse) on the
same node as OBI, you might see ClickHouse crashing with logs such as:

```text
Application: Code: 246. DB::Exception: Calculated checksum of the executable (...) does not correspond to the reference checksum ...
```

The issue is likely caused by OBI attaching eBPF uprobes to the ClickHouse
binary.
[A relevant GitHub issue](https://github.com/ClickHouse/ClickHouse/issues/83637)
explains this behavior:

> When attaching a uprobe, the kernel will modify the target process memory to
> insert a trap instruction at the attachment address. This causes the
> ClickHouse binary checksum validation to fail during startup.

**Solution:**

Start ClickHouse with the
[skip_binary_checksum_checks](https://clickhouse.com/docs/operations/server-configuration-parameters/settings#skip_binary_checksum_checks)
setting enabled.

### Missing telemetry data for Go applications or TLS requests

If you are missing telemetry from Go applications or TLS requests (such as
HTTPS communication), OBI may lack the privileges needed to attach uprobes.
Recent kernel security changes, which were backported to many older kernel
versions, now require the `CAP_SYS_ADMIN` capability to attach uprobes. OBI
uses uprobes to instrument Go applications and TLS requests, along with other
runtime- and language-specific instrumentations. If your OBI deployment doesn't
run privileged (for example, `privileged: true` in Docker or Kubernetes) and
doesn't grant the `CAP_SYS_ADMIN` capability, some or all of your telemetry may
be missing.

To troubleshoot this issue, enable detailed OBI logging with
`OTEL_EBPF_LOG_LEVEL=debug`. If you see all uprobe injections failing with
errors like "setting uprobe (offset)...", you are likely experiencing this
issue.

**Solutions:**

You can either:

- Run OBI as privileged.
- Add `CAP_SYS_ADMIN` to the list of capabilities in your deployment security
  configuration.
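For example, in a Kubernetes deployment, the capability can be granted through
the container's `securityContext` (a sketch; adapt to your own manifest):

```yaml
securityContext:
  capabilities:
    add:
      - SYS_ADMIN # Kubernetes capability names drop the CAP_ prefix
```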

## Migration to v0.7.0: Network port guessing changes

OBI v0.7.0 introduces a breaking change: **network port guessing is now disabled
by default**. This change improves network metrics accuracy by not making
assumptions about unknown initiators in network flows.

### What changed

In v0.6.0 and earlier, OBI would attempt to guess which endpoint was the client
and which was the server in network flows where the initiator couldn't be
determined. This guessing was based on ordinal heuristics (typically assuming
the lower port number was the server and the higher port number was the client).

In v0.7.0, this guessing is disabled by default, which means:

- `client.port` and `server.port` attributes may be empty for flows where OBI
  cannot determine the initiator
- Network metrics will be more accurate but may lose information for unknown
  flows

### How to migrate

If you depend on the old behavior and want `client.port` and `server.port` to be
inferred even when the initiator is unknown, re-enable port guessing with
ordinal heuristics:

**YAML configuration:**

```yaml
network:
  guess_ports: ordinal
```

**Environment variable:**

```sh
OTEL_EBPF_NETWORK_GUESS_PORTS=ordinal
```

For more details, see the
[network configuration documentation](../network/config/).

### Recommendation

We recommend leaving port guessing disabled unless you have a specific use case
that requires it. The default behavior provides cleaner, more accurate network
metrics that are less prone to misclassification.
