Excluding unwanted paths at the input level keeps Filebeat from reopening rotated, compressed, archived, or duplicate log files that waste ingest capacity and clutter downstream searches. Narrow scanner exclusions are often the cleanest way to keep one broad log pattern without shipping the files that do not belong in the pipeline.
Each filestream input expands its paths globs, scans the matching locations, and starts harvesters only for files that survive the scanner rules. The current exclusion key for filestream is prospector.scanner.exclude_files, which accepts regular expressions, so one input can watch a pattern such as /var/log/app/*.log* while ignoring specific suffixes or absolute path prefixes.
The deprecated log input still uses the plain exclude_files key, and Filebeat 9.x refuses legacy log or container inputs unless allow_deprecated_use: true is set in that input. Filebeat 9.x also defaults filestream to fingerprint-based file identity, which can delay harvesting tiny validation files until they reach 1024 bytes. Test exclusion rules with realistic log files and confirm the result in the service logs after restart.
$ sudo cp /etc/filebeat/filebeat.yml /etc/filebeat/filebeat.yml.bak
Keep the backup until the updated input has passed a config test and the service is healthy again.
$ sudo nano /etc/filebeat/filebeat.yml
filebeat.inputs:
  - type: filestream
    id: system-logs
    enabled: true
    paths:
      - /var/log/app/*.log*
    prospector.scanner.exclude_files:
      - '\.gz$'
      - '\.1$'
      - '^/var/log/app/archive/'
Keep prospector.scanner.exclude_files inside the target input, not at the top level. Use ^ when excluding an absolute path prefix so the regular expression does not match an unintended substring later in the path.
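The effect of the ^ anchor can be checked with a minimal shell sketch. It uses grep -E as a stand-in for Filebeat's regular expression engine, which is only a rough approximation that holds for simple patterns like these. The /mnt/backup path is hypothetical and exists only to show a substring match later in the path.

```shell
# Hypothetical path that contains the archive directory string, but not at the start.
path='/mnt/backup/var/log/app/archive/app.log'

# Unanchored pattern: matches the substring anywhere in the path.
if printf '%s\n' "$path" | grep -Eq '/var/log/app/archive/'; then
  unanchored=match
else
  unanchored=no-match
fi

# Anchored pattern: matches only when the path begins with the prefix.
if printf '%s\n' "$path" | grep -Eq '^/var/log/app/archive/'; then
  anchored=match
else
  anchored=no-match
fi

echo "unanchored: $unanchored, anchored: $anchored"
```

Without the anchor, the backup copy would be excluded as well, which may or may not be intended.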
On current Filebeat releases, the legacy log input still uses exclude_files instead, but that input is deprecated and disabled by default in 9.x unless allow_deprecated_use: true is explicitly set.
$ find /var/log/app -maxdepth 2 -type f | sort
/var/log/app/app.log
/var/log/app/app.log.1
/var/log/app/app.log.gz
/var/log/app/archive/app-2026-04-01.log
Exclusions are evaluated after the paths glob finds candidate files, so one broad glob plus a narrow exclude list is usually easier to maintain than several overlapping inputs.
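The filtering order described above can be previewed outside Filebeat with a short shell sketch. It pipes the sample paths from the directory listing through grep -E with the same three patterns. grep is not the engine Filebeat uses, so treat this as a rough check for simple patterns rather than a faithful emulation. The candidates and kept variable names are illustrative, and the archive path is included only to exercise the prefix pattern.

```shell
# Sample paths taken from the directory listing in this guide.
candidates='/var/log/app/app.log
/var/log/app/app.log.1
/var/log/app/app.log.gz
/var/log/app/archive/app-2026-04-01.log'

# Drop every path that matches any exclusion pattern; the remainder would be harvested.
kept=$(printf '%s\n' "$candidates" | grep -Ev \
  -e '\.gz$' \
  -e '\.1$' \
  -e '^/var/log/app/archive/')

printf '%s\n' "$kept"
```

With these sample paths, only /var/log/app/app.log survives the filter.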
$ sudo filebeat test config -c /etc/filebeat/filebeat.yml
Config OK
$ sudo systemctl restart filebeat
$ sudo systemctl is-active filebeat
active
$ sudo journalctl -u filebeat --no-pager --lines=40
Apr 02 19:45:16 loghost filebeat[34]: {"log.level":"info","@timestamp":"2026-04-02T11:45:16.331Z","log.logger":"crawler","message":"Loading Inputs: 1","service.name":"filebeat","ecs.version":"1.6.0"}
Apr 02 19:45:16 loghost filebeat[34]: {"log.level":"info","@timestamp":"2026-04-02T11:45:16.334Z","log.logger":"crawler","message":"Loading and starting Inputs completed. Enabled inputs: 1","service.name":"filebeat","ecs.version":"1.6.0"}
Apr 02 19:45:16 loghost filebeat[34]: {"log.level":"info","@timestamp":"2026-04-02T11:45:16.334Z","log.logger":"input.filestream","message":"Input 'filestream' starting","service.name":"filebeat","id":"system-logs","ecs.version":"1.6.0"}
##### snipped #####
If a tiny validation file does not get harvested, look for the current 9.x fingerprint warning about files smaller than 1024 bytes before assuming the exclusion regex is wrong.
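One way to rule out the size threshold before debugging a regex is to list the files that are still under 1024 bytes. The sketch below is a minimal helper under that assumption. The small_files function name and the LOG_DIR override are illustrative and not part of Filebeat.

```shell
# Directory to inspect; defaults to the path used in this guide (override with LOG_DIR).
log_dir="${LOG_DIR:-/var/log/app}"

# Print regular files smaller than 1024 bytes, sorted; print nothing if the directory is missing.
small_files() {
  [ -d "$1" ] || return 0
  find "$1" -maxdepth 2 -type f -size -1024c | sort
}

small_files "$log_dir"
```

Any file this prints may be held back by the fingerprint identity rather than by an exclusion rule, so grow the test file past 1024 bytes before drawing conclusions.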