On production servers, logs live everywhere: app, nginx, systemd, database, kernel. When something breaks, SSHing into 20 servers and running grep is not sustainable. The ELK Stack — Elasticsearch (search), Logstash (transform), Kibana (UI) — collects every log centrally, indexes it, and makes it searchable in seconds.

Components

  • Filebeat: lightweight log shipper running on every server
  • Logstash: parse, filter, enrich (optional — Filebeat can ship straight to Elastic)
  • Elasticsearch: index, search, storage
  • Kibana: UI — search, dashboards, alerts

Docker Compose Installation

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=true
      - ELASTIC_PASSWORD=changeme
      - ES_JAVA_OPTS=-Xms1g -Xmx1g
    volumes:
      - es-data:/usr/share/elasticsearch/data
    ports: ['9200:9200']

  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.0
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
      - ELASTICSEARCH_USERNAME=kibana_system
      # kibana_system's password — it must also be set inside Elasticsearch via the security API
      - ELASTICSEARCH_PASSWORD=changeme
    ports: ['5601:5601']
    depends_on: [elasticsearch]

volumes:
  es-data:
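
With security enabled, the kibana_system password cannot be set through Compose environment variables alone; it has to be set inside Elasticsearch once the node is up. A sketch using the security API (assuming the defaults above and port 9200 published on localhost):

```sh
# set the password Kibana authenticates with (must match ELASTICSEARCH_PASSWORD above)
curl -u elastic:changeme \
  -X POST "http://localhost:9200/_security/user/kibana_system/_password" \
  -H 'Content-Type: application/json' \
  -d '{"password": "changeme"}'

# sanity check: the cluster should report green or yellow
curl -u elastic:changeme "http://localhost:9200/_cluster/health?pretty"
```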

Log Shipping with Filebeat

# /etc/filebeat/filebeat.yml (on each application server)
filebeat.inputs:
  - type: filestream
    id: nginx-access
    paths:
      - /var/log/nginx/access.log
    parsers:
      - ndjson: {}  # assumes nginx is configured to emit JSON log lines
    fields:
      service: nginx
      type: access

  - type: filestream
    id: app-log
    paths:
      - /var/log/myapp/*.log
    fields:
      service: myapp

  - type: journald
    id: systemd
    include_matches.match:
      - _SYSTEMD_UNIT=keydal.service

processors:
  - add_host_metadata: {}
  - add_cloud_metadata: {}

output.elasticsearch:
  hosts: ['https://es.example.com:9200']
  username: elastic
  password: changeme
  # a custom index name also requires setup.template.name, setup.template.pattern
  # and setup.ilm.enabled: false elsewhere in filebeat.yml
  index: "logs-%{[fields.service]}-%{+yyyy.MM.dd}"

Validate the configuration, then enable the service:

filebeat test config
filebeat test output
systemctl enable --now filebeat
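
The ndjson parser in the nginx input above only works if nginx itself writes JSON lines. A sketch of a matching nginx log_format — the format name and field names here are illustrative, not a standard:

```nginx
# /etc/nginx/nginx.conf — inside the http { } block
log_format json_combined escape=json
  '{"time":"$time_iso8601","remote_addr":"$remote_addr",'
  '"request":"$request","status":$status,'
  '"body_bytes_sent":$body_bytes_sent,'
  '"http_referer":"$http_referer","http_user_agent":"$http_user_agent"}';

access_log /var/log/nginx/access.log json_combined;
```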

Transforming Logs with Logstash

Write a Logstash pipeline to parse and enrich raw logs. To route logs through it, switch Filebeat from output.elasticsearch to output.logstash with hosts: ['logstash:5044'].

# /etc/logstash/conf.d/nginx.conf
input {
  beats {
    port => 5044
  }
}

filter {
  # Filebeat nests custom fields under [fields] unless fields_under_root: true is set
  if [fields][service] == "nginx" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
    geoip {
      source => "clientip"
    }
    useragent {
      source => "agent"
      target => "ua"
    }
    mutate {
      convert => { "response" => "integer" "bytes" => "integer" }
    }
  }
}

output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "nginx-%{+YYYY.MM.dd}"
  }
}
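
To see roughly what the COMBINEDAPACHELOG grok pattern extracts, here is a simplified Python equivalent. The real grok pattern is more permissive, so treat this as an illustration rather than a drop-in replacement:

```python
import re

# Simplified stand-in for Logstash's COMBINEDAPACHELOG grok pattern
COMBINED = re.compile(
    r'(?P<clientip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) \S+" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

line = ('1.2.3.4 - - [10/Oct/2024:13:55:36 +0000] '
        '"GET /wp-admin HTTP/1.1" 404 162 "-" "curl/8.4.0"')

m = COMBINED.match(line)
fields = m.groupdict()
# mimic the mutate/convert step: the status code arrives as a string
fields["response"] = int(fields["response"])
print(fields["clientip"], fields["response"], fields["request"])
```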

Searching in Kibana

# KQL (Kibana Query Language)
service: "nginx" AND response >= 500
service: "nginx" AND response: (500 or 502 or 503 or 504)
request: "*wp-admin*"
clientip: "1.2.3.4"
NOT response: (200 or 304)
response_time > 1000

# Lucene syntax (supports regex and fuzzy queries, which KQL does not)
message:error OR level:ERROR
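
The same 5xx search can also be issued as query DSL from Kibana Dev Tools, which is handy for scripting. Field names depend on your pipeline; this sketch assumes the Filebeat index naming and the fields.service field from the config above:

```
GET logs-nginx-*/_search
{
  "query": {
    "bool": {
      "filter": [
        { "match": { "fields.service": "nginx" } },
        { "range": { "response": { "gte": 500 } } }
      ]
    }
  }
}
```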

Index Lifecycle Management (ILM)

Log indices grow without bound, and old logs are rarely searched. ILM policies automate rollover, migration to cheaper tiers (hot → warm → cold) and eventual deletion.

PUT _ilm/policy/logs-policy
{
  "policy": {
    "phases": {
      "hot":   { "actions": { "rollover": { "max_primary_shard_size": "50gb", "max_age": "7d" } } },
      "warm":  { "min_age": "7d",  "actions": { "shrink": { "number_of_shards": 1 }, "forcemerge": { "max_num_segments": 1 } } },
      "cold":  { "min_age": "30d", "actions": { "allocate": { "number_of_replicas": 0 } } },
      "delete":{ "min_age": "90d", "actions": { "delete": {} } }
    }
  }
}
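
A policy does nothing on its own — indices have to reference it, usually through an index template. A minimal sketch (the template and alias names here are illustrative):

```
PUT _index_template/logs-template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1,
      "index.lifecycle.name": "logs-policy",
      "index.lifecycle.rollover_alias": "logs"
    }
  }
}
```

Rollover additionally needs a bootstrap index carrying the write alias (e.g. creating logs-000001 with is_write_index: true on the logs alias).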

Alerting

In Kibana, go to Stack Management → Rules to define rules like "notify Slack when the 5xx rate in the last 5 minutes exceeds 5%". Threshold rules, Elasticsearch query rules and ML-based anomaly detection are all supported.

Performance Notes

  • Give Elasticsearch plenty of RAM — JVM heap half of total RAM, max 31 GB
  • SSD is mandatory
  • Shard size: 20-40 GB is the sweet spot
  • On a large cluster, separate master/data/ingest nodes
  • Use index templates to tune shards/replicas

Alternatives

  • Grafana Loki: far lighter than ELK — indexes only labels, not the full log text. Integrates natively with Grafana
  • OpenSearch: the AWS-led fork of Elasticsearch — useful if Elastic's licensing is a concern
  • Datadog, New Relic, Splunk: managed SaaS — zero setup, but costs climb quickly with volume

Conclusion

ELK is the reference solution for centralized logging. It feels like overkill at first, but once you have 10+ servers, making log search 100x faster fundamentally changes production triage. If resources are tight, look at Loki, but ELK's feature set is unmatched.

Reach out to KEYDAL for ELK Stack or Loki setup, log parsing and alerting.