Grok parsing turns unstructured log lines into named fields so filtering, aggregations, and alerting in Elasticsearch stay reliable as log volume and sources grow.
In a Logstash pipeline, the grok filter applies regex patterns to the message field and writes captured values back into the event, including reusable built-ins like COMBINEDAPACHELOG for common web-server formats. Failed matches can be tagged for triage, and parsed fields can be converted to numeric types for correct sorting and aggregation.
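In its simplest form, a grok filter names each token of a known line shape; a minimal sketch for a made-up request log (the field names client, method, request, bytes, and duration are illustrative and not part of the pipeline built below):

filter {
  grok {
    # Parses a line such as: 55.3.244.1 GET /index.php 15824 0.043
    match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
  }
}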
Because grok is regex-based, overly broad patterns (especially early GREEDYDATA captures) can waste CPU and hide parsing drift until dashboards break. When the default pipeline loads multiple files from /etc/logstash/conf.d, config is merged in lexical order, so numbered prefixes control filter order and reduce surprises after edits.
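For example, a conf.d layout with numbered prefixes keeps the input, filter, and output stages in a predictable merge order (file names other than 20-grok.conf are illustrative):

$ ls /etc/logstash/conf.d
10-input.conf
20-grok.conf
30-output.conf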
Steps to parse logs with grok in Logstash:
- Add a grok filter to the pipeline configuration in /etc/logstash/conf.d/20-grok.conf.
input {
  file {
    path => "/var/lib/logstash/examples/grok.log"
    start_position => "beginning"
    sincedb_path => "/var/lib/logstash/sincedb-grok"
  }
}

filter {
  if [log][file][path] == "/var/lib/logstash/examples/grok.log" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
      tag_on_failure => ["_grok_parse_failure"]
    }
    # With ECS-compatible field names (the default in Logstash 8), COMBINEDAPACHELOG
    # writes the byte count and status code under [http][response]; convert them so
    # sorting and aggregations treat them as numbers.
    if [http][response][body][bytes] {
      mutate { convert => { "[http][response][body][bytes]" => "integer" } }
    }
    if [http][response][status_code] {
      mutate { convert => { "[http][response][status_code]" => "integer" } }
    }
  }
}

output {
  if [log][file][path] == "/var/lib/logstash/examples/grok.log" {
    elasticsearch {
      hosts => ["http://elasticsearch.example.net:9200"]
      index => "app-grok-%{+YYYY.MM.dd}"
    }
  }
}
Store grok-only examples in a dedicated file and index so field parsing can be verified without interfering with other pipelines.
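To exercise the pipeline, append a line in combined format to the example file; the values below mirror the verification output later in this guide, and the timestamp is illustrative:

$ echo '203.0.113.11 - - [10/Oct/2024:13:55:36 +0000] "GET /grok-demo HTTP/1.1" 200 512 "https://www.example.net/" "Mozilla/5.0"' | sudo tee -a /var/lib/logstash/examples/grok.log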
- Test the pipeline configuration for errors.
$ sudo -u logstash /usr/share/logstash/bin/logstash --path.settings /etc/logstash --path.data /tmp/logstash-configtest --config.test_and_exit
Configuration OK
- Restart the Logstash service to apply the grok filter.
$ sudo systemctl restart logstash
A restart briefly stops ingestion and may drop in-flight events that are still buffered in memory.
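For routine filter edits, Logstash can instead reload pipeline configuration in place; a minimal sketch, assuming settings live in the default /etc/logstash/logstash.yml:

# /etc/logstash/logstash.yml
config.reload.automatic: true
config.reload.interval: 3s

With automatic reload enabled, edits to files under /etc/logstash/conf.d are picked up without a restart, and a file that fails validation leaves the currently running pipeline in place.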
- Verify that no events have been tagged with _grok_parse_failure in Elasticsearch.
$ curl -s -G "http://elasticsearch.example.net:9200/app-grok-*/_search" \
  --data-urlencode "q=tags:_grok_parse_failure" \
  --data-urlencode "size=0" \
  --data-urlencode "filter_path=hits.total" \
  --data-urlencode "pretty"
{
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    }
  }
}
- Verify parsed fields are present by fetching a recent matching event from Elasticsearch.
$ curl -s -G "http://elasticsearch.example.net:9200/app-grok-*/_search" \
  --data-urlencode "size=1" \
  --data-urlencode "sort=@timestamp:desc" \
  --data-urlencode "filter_path=hits.hits._source.source.address,hits.hits._source.http.request.method,hits.hits._source.url.original,hits.hits._source.http.version,hits.hits._source.http.response.status_code,hits.hits._source.http.response.body.bytes,hits.hits._source.http.request.referrer,hits.hits._source.user_agent.original" \
  --data-urlencode "pretty"
{
  "hits" : {
    "hits" : [
      {
        "_source" : {
          "url" : {
            "original" : "/grok-demo"
          },
          "source" : {
            "address" : "203.0.113.11"
          },
          "http" : {
            "version" : "1.1",
            "request" : {
              "referrer" : "https://www.example.net/",
              "method" : "GET"
            },
            "response" : {
              "body" : {
                "bytes" : 512
              },
              "status_code" : 200
            }
          },
          "user_agent" : {
            "original" : "Mozilla/5.0"
          }
        }
      }
    ]
  }
}
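If the failure query above ever returns a nonzero count, fetching one tagged event's raw message shows which log shape stopped matching; this sketch reuses the same query parameters as the checks above:

$ curl -s -G "http://elasticsearch.example.net:9200/app-grok-*/_search" \
  --data-urlencode "q=tags:_grok_parse_failure" \
  --data-urlencode "size=1" \
  --data-urlencode "filter_path=hits.hits._source.message" \
  --data-urlencode "pretty"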
