Prometheus Alerts Runbooks

You are viewing the English version of this page because it has not yet been translated. Interested in helping out? See Contributing.

Manager Rules

ReconcileErrors

MeaningThe OpenTelemetry Operator cannot succeed in the reconciliation step, probably because of a misconfigured OpenTelemetryCollector.
ImpactNo impact on already running deployments or new correct ones.
DiagnosisCheck manager logs for reasons why this might happen.
MitigationFind out which OpenTelemetryCollector is causing the errors and fix the config.

WorkqueueDepth

MeaningThe working queue for the operator is larger than 0.
ImpactNo impact if the queue depth reverts to 0 quickly. More investigation is needed if the problem persists.
DiagnosisCheck manager logs for reasons why this might happen.
MitigationThis could be caused by many errors. Act based on what the logs are showing.