Grok parsing turns unstructured log lines into named fields so filtering, aggregations, and alerting in Elasticsearch stay reliable as log volume and sources grow.

In a Logstash pipeline, the grok filter applies regex patterns to the message field and writes captured values back into the event as named fields, including via reusable built-in patterns like COMBINEDAPACHELOG for common web-server formats. With ECS compatibility enabled (the default in Logstash 8), that pattern emits ECS-style field names such as source.address and http.response.status_code, which is why the verification steps below query those paths. Failed matches can be tagged for triage, and parsed fields can be converted to numeric types so sorting and aggregation behave correctly.
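
A minimal sketch of that mechanism on a hypothetical application log line (the pattern and field names here are illustrative, separate from the pipeline built below):

    filter {
      grok {
        # Each %{PATTERN:name} capture becomes a named field on the event;
        # IP, WORD, and URIPATHPARAM are built-in grok patterns.
        match => { "message" => "%{IP:client_ip} %{WORD:http_method} %{URIPATHPARAM:request_path}" }
      }
    }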

Because grok is regex-based, overly broad patterns (especially early GREEDYDATA captures) can burn CPU on backtracking and hide parsing drift until dashboards break. When the default pipeline loads multiple files from /etc/logstash/conf.d, Logstash merges them in lexical order, so numbered filename prefixes control filter order and reduce surprises after edits.
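
Anchoring the expression is the usual mitigation; a sketch of the safer shape (the pattern and field names are illustrative):

    filter {
      grok {
        # The ^ anchor lets non-matching lines fail immediately instead of
        # retrying the match at every offset; keep GREEDYDATA at the tail,
        # never as an early capture.
        match => { "message" => "^%{TIMESTAMP_ISO8601:log_time} %{LOGLEVEL:level} %{GREEDYDATA:note}" }
      }
    }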

Steps to parse logs with grok in Logstash:

  1. Add a grok filter to the pipeline configuration in /etc/logstash/conf.d/20-grok.conf.
    input {
      file {
        path => "/var/lib/logstash/examples/grok.log"
        start_position => "beginning"
        sincedb_path => "/var/lib/logstash/sincedb-grok"
      }
    }
    
    filter {
      if [log][file][path] == "/var/lib/logstash/examples/grok.log" {
        grok {
          match => { "message" => "%{COMBINEDAPACHELOG}" }
          # Replace the default _grokparsefailure tag with a project-specific one.
          tag_on_failure => ["_grok_parse_failure"]
        }
    
        # In ECS mode the pattern already types these captures as integers,
        # so the conversions below are safeguards that document the intent.
        # A "-" response size captures no field at all.
        if [http][response][body][bytes] {
          mutate {
            convert => { "[http][response][body][bytes]" => "integer" }
          }
        }

        mutate {
          convert => { "[http][response][status_code]" => "integer" }
        }
      }
    }
    
    output {
      if [log][file][path] == "/var/lib/logstash/examples/grok.log" {
        elasticsearch {
          hosts => ["http://elasticsearch.example.net:9200"]
          index => "app-grok-%{+YYYY.MM.dd}"
        }
      }
    }

    Store grok-only examples in a dedicated file and index so field parsing can be verified without interfering with other pipelines.
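
    To exercise the pipeline, append a sample line in the combined format to the example file from the input block (the timestamp is arbitrary; the other values mirror the verification output in step 5):
    $ echo '203.0.113.11 - - [12/Mar/2024:09:15:02 +0000] "GET /grok-demo HTTP/1.1" 200 512 "https://www.example.net/" "Mozilla/5.0"' | sudo tee -a /var/lib/logstash/examples/grok.log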

  2. Test the pipeline configuration for errors.
    $ sudo -u logstash /usr/share/logstash/bin/logstash --path.settings /etc/logstash --path.data /tmp/logstash-configtest --config.test_and_exit
    Configuration OK
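
    The separate --path.data keeps the test run from colliding with the data-directory lock held by the running Logstash service.
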
  3. Restart the Logstash service to apply the grok filter.
    $ sudo systemctl restart logstash

    A restart briefly stops ingestion and may drop in-flight events that are still buffered in memory.
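
    If lost events are unacceptable, a persisted queue buffers in-flight events on disk across restarts; a minimal sketch for /etc/logstash/logstash.yml (size it for your throughput):
    queue.type: persisted
    queue.max_bytes: 1gb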

  4. Verify that no events are being tagged with parse failures in Elasticsearch.
    $ curl -s -G "http://elasticsearch.example.net:9200/app-grok-*/_search" \
      --data-urlencode "q=tags:_grok_parse_failure" \
      --data-urlencode "size=0" \
      --data-urlencode "filter_path=hits.total" \
      --data-urlencode "pretty"
    {
      "hits" : {
        "total" : {
          "value" : 0,
          "relation" : "eq"
        }
      }
    }
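
    If the count is nonzero, pulling one tagged event's raw message usually reveals which log shape slipped past the pattern:
    $ curl -s -G "http://elasticsearch.example.net:9200/app-grok-*/_search" \
      --data-urlencode "q=tags:_grok_parse_failure" \
      --data-urlencode "size=1" \
      --data-urlencode "filter_path=hits.hits._source.message" \
      --data-urlencode "pretty"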
  5. Verify parsed fields are present by fetching a recent matching event from Elasticsearch.
    $ curl -s -G "http://elasticsearch.example.net:9200/app-grok-*/_search" \
      --data-urlencode "size=1" \
      --data-urlencode "sort=@timestamp:desc" \
      --data-urlencode "filter_path=hits.hits._source.source.address,hits.hits._source.http.request.method,hits.hits._source.url.original,hits.hits._source.http.version,hits.hits._source.http.response.status_code,hits.hits._source.http.response.body.bytes,hits.hits._source.http.request.referrer,hits.hits._source.user_agent.original" \
      --data-urlencode "pretty"
    {
      "hits" : {
        "hits" : [
          {
            "_source" : {
              "url" : {
                "original" : "/grok-demo"
              },
              "source" : {
                "address" : "203.0.113.11"
              },
              "http" : {
                "version" : "1.1",
                "request" : {
                  "referrer" : "https://www.example.net/",
                  "method" : "GET"
                },
                "response" : {
                  "body" : {
                    "bytes" : 512
                  },
                  "status_code" : 200
                }
              },
              "user_agent" : {
                "original" : "Mozilla/5.0"
              }
            }
          }
        ]
      }
    }
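
    Because the status code and byte count were indexed as numbers, metric aggregations work on them directly; for example, assuming default dynamic mapping typed http.response.body.bytes as long:
    $ curl -s -H "Content-Type: application/json" \
      "http://elasticsearch.example.net:9200/app-grok-*/_search?size=0&pretty" \
      -d '{ "aggs": { "avg_bytes": { "avg": { "field": "http.response.body.bytes" } } } }'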