Tail-Based Sampling with service.criticality
Vous consultez la version anglaise de cette page car elle n’a pas encore été entièrement traduite. Vous souhaitez contribuer ? Voir Contribuer.
This example demonstrates how to use the
service.criticality resource
attribute for intelligent tail-based sampling decisions in the OpenTelemetry
Collector.
The demo application assigns a service.criticality value to each service,
classifying them by operational importance:
| Criticality | Sampling Rate | Services |
|---|---|---|
critical | 100% | payment, checkout, frontend, frontend-proxy |
high | 50% | cart, product-catalog, currency, shipping |
medium | 10% | recommendation, ad, product-reviews, email |
low | 1% | accounting, fraud-detection, image-provider, load-generator, quote, flagd, flagd-ui, Kafka |
Collector Configuration
To enable tail-based sampling, add the following to your
otelcol-config-extras.yml:
processors:
tail_sampling:
decision_wait: 10s
num_traces: 100000
expected_new_traces_per_sec: 1000
policies:
# Policy 1: Always sample critical services (100%)
- name: critical-services-always-sample
type: string_attribute
string_attribute:
key: service.criticality
values:
- critical
enabled_regex_matching: false
invert_match: false
# Policy 2: Sample 50% of high-criticality services
- name: high-criticality-probabilistic
type: and
and:
and_sub_policy:
- name: is-high-criticality
type: string_attribute
string_attribute:
key: service.criticality
values:
- high
- name: probabilistic-50
type: probabilistic
probabilistic:
sampling_percentage: 50
# Policy 3: Sample 10% of medium-criticality services
- name: medium-criticality-probabilistic
type: and
and:
and_sub_policy:
- name: is-medium-criticality
type: string_attribute
string_attribute:
key: service.criticality
values:
- medium
- name: probabilistic-10
type: probabilistic
probabilistic:
sampling_percentage: 10
# Policy 4: Sample 1% of low-criticality services
- name: low-criticality-probabilistic
type: and
and:
and_sub_policy:
- name: is-low-criticality
type: string_attribute
string_attribute:
key: service.criticality
values:
- low
- name: probabilistic-1
type: probabilistic
probabilistic:
sampling_percentage: 1
# Policy 5: Always sample error traces regardless of criticality
- name: errors-always-sample
type: status_code
status_code:
status_codes:
- ERROR
# Policy 6: Always sample slow traces from critical/high services
- name: slow-critical-traces
type: and
and:
and_sub_policy:
- name: is-critical-or-high
type: string_attribute
string_attribute:
key: service.criticality
values:
- critical
- high
- name: is-slow
type: latency
latency:
threshold_ms: 5000
service:
pipelines:
traces:
receivers: [otlp]
processors: [resourcedetection, memory_limiter, transform, tail_sampling]
exporters: [otlp, debug, spanmetrics]
How It Works
The tail-sampling processor evaluates completed traces against the configured policies. A trace is sampled if any policy matches:
- Critical services are always sampled to ensure full visibility into payment flows, checkout, and user-facing services.
- High-criticality services are sampled at 50%, balancing observability with data volume.
- Medium and low-criticality services are progressively sampled at lower rates to reduce noise from less critical paths.
- Errors are always captured regardless of service criticality, ensuring no issues go unnoticed.
- Slow traces (>5s) from critical and high-criticality services are always sampled to help identify performance bottlenecks.
Feedback
Cette page est-elle utile?
Thank you. Your feedback is appreciated!
Please let us know how we can improve this page. Your feedback is appreciated!