Skipping the header row keeps CSV column names out of the destination index, which prevents misleading hits, broken aggregations, and one-off documents whose values are only the field labels.
The csv filter can learn field names from the first row it sees when autodetect_column_names is enabled, and skip_header then drops rows that exactly match that header instead of indexing them as events. Subsequent rows are parsed with the detected column names, so this works best when one filter instance handles one CSV schema.
Elastic's documentation requires the pipeline running this csv filter to use a single worker (pipeline.workers: 1) for header autodetection and skipping to behave correctly. With the file input, start_position only affects files that do not already have a recorded sincedb offset, so a file that has already been read may need a new filename or a carefully reset sincedb_path before Logstash sees the header again.
filter {
  csv {
    autodetect_column_names => true
    skip_header => true
  }
}
If you enable skip_header without autodetect_column_names, define columns explicitly. Repeated rows that exactly match the configured or autodetected header values are skipped too.
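A minimal sketch of the explicit-columns variant, using the same field labels queried later in this article (substitute the actual header labels from the CSV being ingested):

```
filter {
  csv {
    # Explicit column names; skip_header drops rows whose values
    # exactly match these names.
    columns => ["name", "email", "role"]
    skip_header => true
  }
}
```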
The first event seen by this filter becomes the header definition. If the pipeline reads multiple CSV layouts, keep each schema in its own filter instance or dedicated pipeline.
With the file input, start_position only applies the first time a file is seen, i.e. when no sincedb record exists for it. If Logstash has already advanced past the header, ingest a new file or reset that input's sincedb state carefully before expecting the header row to be skipped.
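A hedged sketch of a file input configured for testing this behavior; the path here is an example, not taken from this setup:

```
input {
  file {
    path => "/var/data/users.csv"
    start_position => "beginning"
    # Pointing sincedb_path at /dev/null disables offset tracking, so
    # every restart re-reads the file from the top. Useful for testing
    # header skipping; do not use it in production ingestion.
    sincedb_path => "/dev/null"
  }
}
```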
pipeline.workers: 1
If this CSV flow has its own entry in /etc/logstash/pipelines.yml, set pipeline.workers there instead of lowering workers for every pipeline on the host.
Reducing workers for the default main pipeline can lower throughput for unrelated pipelines on the same Logstash instance.
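One way to scope the worker limit is a dedicated entry in pipelines.yml; the pipeline id and config path below are examples:

```
# /etc/logstash/pipelines.yml
- pipeline.id: users-csv
  path.config: "/etc/logstash/conf.d/users-csv.conf"
  pipeline.workers: 1
- pipeline.id: main
  path.config: "/etc/logstash/conf.d/*.conf"
```

With this layout, only the users-csv pipeline is serialized to one worker; the main pipeline keeps the default worker count.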
$ sudo -u logstash /usr/share/logstash/bin/logstash --path.settings /etc/logstash --path.data /tmp/logstash-configtest --config.test_and_exit
Using bundled JDK: /usr/share/logstash/jdk
Configuration OK
Run the test as the logstash user so the command uses the same permissions model as the service.
$ sudo systemctl restart logstash
Restarting Logstash briefly interrupts ingestion while the pipeline reloads.
$ sudo systemctl --no-pager status logstash
● logstash.service - logstash
Loaded: loaded (/usr/lib/systemd/system/logstash.service; enabled; preset: enabled)
Active: active (running) since Tue 2026-04-07 08:18:42 UTC; 6s ago
##### snipped #####
$ curl -sG 'http://elasticsearch.example.net:9200/users-*/_count?pretty' \
--data-urlencode 'q=name:"name" AND email:"email" AND role:"role"'
{
"count" : 0,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
}
}
Replace the index pattern and the field/value pairs with the actual header labels from the CSV being ingested.
Header skipping only affects future events. Delete or reindex any previously ingested header documents separately if they already exist.
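One way to remove already-indexed header documents is Elasticsearch's _delete_by_query API, reusing the query from the count check above. The index pattern and field values are examples; verify the match with _count first, since deletes are not reversible:

```
$ curl -s -X POST 'http://elasticsearch.example.net:9200/users-*/_delete_by_query?pretty' \
  -H 'Content-Type: application/json' \
  -d '{"query": {"query_string": {"query": "name:\"name\" AND email:\"email\" AND role:\"role\""}}}'
```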