Tail sampling in the OpenTelemetry Collector defers the export decision for each trace until its spans have been buffered and can be evaluated against the configured policies. It reduces trace volume while retaining traces that carry an error status, exceed a latency threshold, or match attributes that matter for investigation.

The tail_sampling processor is declared under processors and only runs when the traces pipeline lists it. decision_wait controls how long the processor waits before deciding, and num_traces controls how many traces it can keep in memory while those decisions are pending.
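As a rough sizing sketch, the number of traces waiting for a decision is approximately the new-trace rate multiplied by decision_wait. The arrival rate below is an assumed figure for illustration, not a measurement, and the fragment omits the policies list for brevity.

```yaml
processors:
  tail_sampling:
    # Assumed arrival rate for illustration: about 1000 new traces per second.
    expected_new_traces_per_sec: 1000
    # Each trace is held for 10s before its policies are evaluated.
    decision_wait: 10s
    # In-flight estimate: 1000 traces/s * 10s = 10000 traces awaiting a decision.
    # 50000 leaves roughly 5x headroom for bursts. If the buffer fills, older
    # traces can be dropped before their decision is made.
    num_traces: 50000
```

If the measured rate or decision_wait grows, num_traces and the Collector's memory budget likely need to grow with them.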

Run tail sampling where all spans for the same trace arrive at the same Collector instance. Put context-dependent processors such as k8sattributes before tail_sampling, because tail_sampling reassembles spans into new batches and the original request context is lost. Keep batch after it, and use a local debug exporter only long enough to prove the policy before switching back to the normal trace exporter.
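The processor ordering described above could look like the following fragment. It is a sketch only: k8sattributes and batch are shown with default settings as an assumption, and the otlp receiver and debug exporter are assumed to be declared elsewhere in the same file.

```yaml
processors:
  # Default settings assumed; real deployments usually also need RBAC and filters.
  k8sattributes: {}
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: keep-error-traces
        type: status_code
        status_code:
          status_codes: [ERROR]
  batch: {}

service:
  pipelines:
    traces:
      receivers: [otlp]
      # Order matters: enrichment first, sampling second, batching last.
      processors: [k8sattributes, tail_sampling, batch]
      exporters: [debug]
```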

Steps to configure tail sampling in the OpenTelemetry Collector:

  1. Open the active Collector configuration file.
    $ sudoedit /etc/otelcol-contrib/config.yaml

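    The default path can be /etc/otelcol/config.yaml, /etc/otelcol-contrib/config.yaml, or a file passed with --config depending on the Collector distribution and service wrapper. The tail_sampling processor ships in the contrib distribution, so the core otelcol distribution needs a custom build that includes it.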

  2. Add the tail_sampling processor and activate it in the traces pipeline.
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
    
    processors:
      tail_sampling:
        decision_wait: 10s
        num_traces: 50000
        expected_new_traces_per_sec: 1000
        policies:
          - name: keep-error-traces
            type: status_code
            status_code:
              status_codes: [ERROR]
          - name: keep-slow-traces
            type: latency
            latency:
              threshold_ms: 500
      batch:
        timeout: 1s
        send_batch_size: 10
    
    exporters:
      debug:
        verbosity: detailed
    
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [tail_sampling, batch]
          exporters: [debug]

    The debug exporter and small batch size keep the smoke test local and quick. Replace debug and restore production batch values after the policies are verified.

    verbosity: detailed can print span names, attributes, and service names into Collector logs. Use it only in a controlled test environment.

  3. Validate the Collector configuration.
    $ otelcol-contrib validate --config /etc/otelcol-contrib/config.yaml

    No output with a zero exit status means the file parsed and the configured components exist in that Collector distribution.

  4. Start the Collector with the updated file.
    $ otelcol-contrib --config /etc/otelcol-contrib/config.yaml
    2026-06-18T06:33:28.159Z info Starting GRPC server endpoint="[::]:4317"
    2026-06-18T06:33:28.159Z info Everything is ready. Begin running and processing data.

    For a packaged service, restart the unit that owns the process, such as otelcol or otelcol-contrib. For a container, recreate the container with the updated mounted configuration file.

  5. Send a fast successful trace that should not match either policy.
    $ telemetrygen traces --otlp-endpoint 127.0.0.1:4317 --otlp-insecure --traces 1 --child-spans 0 --service checkout-ok --status-code Ok --span-duration 100ms
    traces generated  traces=1

    Use any OTLP client that can set span status and duration if telemetrygen is not available.

  6. Send an error trace that should match the status_code policy.
    $ telemetrygen traces --otlp-endpoint 127.0.0.1:4317 --otlp-insecure --traces 1 --child-spans 0 --service checkout-error --status-code Error --span-duration 100ms
    traces generated  traces=1

  7. Send a slow successful trace that should match the latency policy.
    $ telemetrygen traces --otlp-endpoint 127.0.0.1:4317 --otlp-insecure --traces 1 --child-spans 0 --service checkout-slow --status-code Ok --span-duration 750ms
    traces generated  traces=1

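    The latency policy samples traces whose duration is greater than threshold_ms. The duration is measured from the earliest span start time to the latest span end time in the trace. Set the threshold from real service latency targets, not from a test-only value.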

  8. Check the Collector output after decision_wait and the batch timeout have passed.
    2026-06-18T06:34:00.180Z info Traces exporter=debug resource_spans=1 spans=2
    Resource service.name=checkout-error
    Status code    : Error
    ##### snipped #####
    2026-06-18T06:34:02.185Z info Traces exporter=debug resource_spans=1 spans=2
    Resource service.name=checkout-slow
    Status code    : Ok
    ##### snipped #####

    checkout-ok is absent because the trace was neither slow nor marked as an error. With a production exporter, check the destination backend for the same sampled service names or trace IDs instead of leaving debug enabled.

  9. Replace the debug exporter with the normal trace exporter after the smoke test.
    exporters:
      otlp:
        endpoint: otel-gateway.example.com:4317
    
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [tail_sampling, batch]
          exporters: [otlp]

    When multiple tail-sampling Collector instances sit behind a gateway or load balancer, route all spans for the same trace ID to the same instance before tail_sampling runs.
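One way to get that routing is a first-tier Collector that uses the contrib loadbalancing exporter in front of the tail-sampling instances. The fragment below is a sketch under assumptions: the two hostnames are hypothetical, a static resolver is used for simplicity, and plaintext OTLP between tiers is assumed only to keep the example short.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  loadbalancing:
    # Spans with the same trace ID are sent to the same backend.
    routing_key: traceID
    protocol:
      otlp:
        tls:
          insecure: true   # assumption: plaintext inside a trusted network
    resolver:
      static:
        hostnames:
          # Hypothetical tail-sampling Collector instances.
          - tail-sampler-1.example.com:4317
          - tail-sampler-2.example.com:4317

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [loadbalancing]
```

The second tier then runs the tail_sampling configuration from the steps above. Changing the backend list can reroute trace IDs, so traces that are in flight during a scaling event may be split across instances.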