Tail sampling in the OpenTelemetry Collector holds the spans of each trace in memory and defers the keep-or-drop decision until the policies can evaluate the trace as a whole. It reduces trace volume while retaining traces that carry an error status, exceed a latency threshold, or match attributes that matter for investigation.
The tail_sampling processor is declared under processors and runs only when the traces pipeline lists it. decision_wait controls how long the processor waits after the first span of a trace arrives before it decides, and num_traces controls how many traces it can keep in memory while those decisions are pending.
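As a rough sizing sketch (an illustrative assumption, not a tuning rule taken from the processor documentation), the number of traces waiting on a decision is approximately the new-trace rate multiplied by decision_wait, so num_traces needs headroom above that product. The traffic figure below is hypothetical.

```yaml
processors:
  tail_sampling:
    # Hypothetical load: about 1000 new traces per second.
    expected_new_traces_per_sec: 1000
    # Each trace is held for 10 seconds before its decision.
    decision_wait: 10s
    # Pending traces ~ 1000 traces/s * 10 s = 10000.
    # 50000 leaves roughly 5x headroom for bursts.
    num_traces: 50000
    # policies omitted in this fragment; the processor needs at least one.
```

If the pending count is likely to exceed num_traces, either raise num_traces (more memory) or shorten decision_wait (less time for late spans to arrive).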
Run tail sampling where all spans for the same trace arrive at the same Collector instance. Put context-dependent processors such as k8sattributes before tail_sampling, keep batch after it, and use a local debug exporter only long enough to prove the policy before switching back to the normal trace exporter.
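The ordering described above could look like the following pipeline fragment. It is a sketch that assumes k8sattributes, tail_sampling, and batch are each declared under processors, and that an exporter named otlp exists; neither assumption comes from a specific deployment.

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      # k8sattributes runs first, while the original request context
      # is still attached to the incoming spans.
      # tail_sampling then decides which whole traces to keep.
      # batch runs last so it only groups spans that survived sampling.
      processors: [k8sattributes, tail_sampling, batch]
      exporters: [otlp]
```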
Steps to configure tail sampling in the OpenTelemetry Collector:
- Open the active Collector configuration file.
$ sudoedit /etc/otelcol-contrib/config.yaml
The default path can be /etc/otelcol/config.yaml, /etc/otelcol-contrib/config.yaml, or a file passed with --config depending on the Collector distribution and service wrapper.
- Add the tail_sampling processor and activate it in the traces pipeline.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  tail_sampling:
    decision_wait: 10s
    num_traces: 50000
    expected_new_traces_per_sec: 1000
    policies:
      - name: keep-error-traces
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: keep-slow-traces
        type: latency
        latency:
          threshold_ms: 500
  batch:
    timeout: 1s
    send_batch_size: 10

exporters:
  debug:
    verbosity: detailed

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling, batch]
      exporters: [debug]
The debug exporter and small batch size keep the smoke test local and quick. Replace debug and restore production batch values after the policies are verified.
verbosity: detailed can print span names, attributes, and service names into Collector logs. Use it only in a controlled test environment.
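For the attribute-matching case mentioned in the introduction, a third policy could be added alongside the two in this step. This is a minimal sketch using the string_attribute policy type; the attribute key tenant.id and the value tenant-a are hypothetical placeholders, not values from this guide.

```yaml
processors:
  tail_sampling:
    policies:
      # Hypothetical example: keep every trace that carries
      # tenant.id=tenant-a on any of its spans.
      - name: keep-tenant-a-traces
        type: string_attribute
        string_attribute:
          key: tenant.id        # assumed attribute name
          values: [tenant-a]    # assumed attribute value
```

Listed next to keep-error-traces and keep-slow-traces, this policy widens the set of retained traces: a trace is kept when any one of the listed policies matches it.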
- Validate the Collector configuration.
$ otelcol-contrib validate --config /etc/otelcol-contrib/config.yaml
No output with a zero exit status means the file parsed and the configured components exist in that Collector distribution.
- Start the Collector with the updated file.
$ otelcol-contrib --config /etc/otelcol-contrib/config.yaml
2026-06-18T06:33:28.159Z info Starting GRPC server endpoint="[::]:4317"
2026-06-18T06:33:28.159Z info Everything is ready. Begin running and processing data.
For a packaged service, restart the unit that owns the process, such as otelcol or otelcol-contrib. For a container, recreate the container with the updated mounted configuration file.
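For the container case, a Docker Compose fragment along these lines would mount the edited file into the container. It is only a sketch: the service name otelcol, the local file ./config.yaml, and the latest tag are assumptions for illustration, and a real deployment would normally pin an image version.

```yaml
services:
  otelcol:                      # hypothetical service name
    image: otel/opentelemetry-collector-contrib:latest
    command: ["--config=/etc/otelcol-contrib/config.yaml"]
    volumes:
      # Mount the edited configuration read-only into the container.
      - ./config.yaml:/etc/otelcol-contrib/config.yaml:ro
    ports:
      - "4317:4317"             # OTLP gRPC receiver from the config above
```

The Collector reads the file at startup, so the container has to be recreated or restarted after each edit.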
- Send a fast successful trace that should not match either policy.
$ telemetrygen traces --otlp-endpoint 127.0.0.1:4317 --otlp-insecure --traces 1 --child-spans 0 --service checkout-ok --status-code Ok --span-duration 100ms
traces generated traces=1
Use any OTLP client that can set span status and duration if telemetrygen is not available.
- Send an error trace that should match the status_code policy.
$ telemetrygen traces --otlp-endpoint 127.0.0.1:4317 --otlp-insecure --traces 1 --child-spans 0 --service checkout-error --status-code Error --span-duration 100ms
traces generated traces=1
- Send a slow successful trace that should match the latency policy.
$ telemetrygen traces --otlp-endpoint 127.0.0.1:4317 --otlp-insecure --traces 1 --child-spans 0 --service checkout-slow --status-code Ok --span-duration 750ms
traces generated traces=1
The latency policy samples traces whose duration is greater than threshold_ms. Set the threshold from real service latency targets, not from a test-only value.
- Check the Collector output after decision_wait and the batch timeout have passed.
2026-06-18T06:34:00.180Z info Traces exporter=debug resource_spans=1 spans=2
Resource service.name=checkout-error
Status code : Error
##### snipped #####
2026-06-18T06:34:02.185Z info Traces exporter=debug resource_spans=1 spans=2
Resource service.name=checkout-slow
Status code : Ok
##### snipped #####
checkout-ok is absent because the trace was neither slow nor marked as an error. With a production exporter, check the destination backend for the same sampled service names or trace IDs instead of leaving debug enabled.
- Replace the debug exporter with the normal trace exporter after the smoke test.
exporters:
  otlp:
    endpoint: otel-gateway.example.com:4317

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling, batch]
      exporters: [otlp]
When multiple tail-sampling Collector instances sit behind a gateway or load balancer, route all spans for the same trace ID to the same instance before tail_sampling runs.
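One common way to get that trace-ID affinity is a first tier of Collectors running the loadbalancing exporter from the contrib distribution, which forwards spans to the tail-sampling tier keyed by trace ID. The fragment below is a sketch of such a first-tier configuration; the DNS name otelcol-sampling.example.internal is hypothetical, and plaintext transport between tiers is assumed only to keep the example short.

```yaml
exporters:
  loadbalancing:
    # Send every span of a given trace to the same backend.
    routing_key: traceID
    protocol:
      otlp:
        tls:
          insecure: true        # assumption: plaintext inside the cluster
    resolver:
      dns:
        # Hypothetical name that resolves to all tail-sampling Collectors.
        hostname: otelcol-sampling.example.internal
        port: 4317

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [loadbalancing]
```

In this layout tail_sampling is configured only on the second tier, so each instance there sees complete traces.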
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.