Prometheus Alerts Runbooks

Ви переглядаєте англійську версію сторінки, тому що її ще не було повністю перекладеною українською. Бажаєте допомогти? Дивіться як взяти Участь.

Manager Rules

ReconcileErrors

MeaningThe OpenTelemetry Operator cannot succeed in the reconciliation step, probably because of a misconfigured OpenTelemetryCollector.
ImpactNo impact on already running deployments or new correct ones.
DiagnosisCheck manager logs for reasons why this might happen.
MitigationFind out which OpenTelemetryCollector is causing the errors and fix the config.

WorkqueueDepth

MeaningThe working queue for the operator is larger than 0.
ImpactNo impact if the queue depth reverts to 0 quickly. More investigation is needed if the problem persists.
DiagnosisCheck manager logs for reasons why this might happen.
MitigationThis could be caused by many errors. Act based on what the logs are showing.