Setup Grafana + Loki + Promtail

Monitoring Windows 11 Logs on Fedora Server with VMware NAT

Grafana 12.2.0 Loki 3.1.1 Promtail 3.1.1 Fedora 42

Introduction

What Will We Build?

A real-time log monitoring setup that ships Windows 11 Event Viewer logs (Application and System) to a Grafana dashboard through Loki running on a Fedora Server VM.

Stack Teknologi

  • Grafana 12.2.0 - Dashboard UI & Visualization
  • Loki 3.1.1 - Log Aggregation & Storage
  • Promtail 3.1.1 - Log Shipper on Windows
  • Fedora Server 42 - Host for Grafana & Loki
  • Docker Compose - Container Orchestration
  • VMware NAT - Network Mode

Benefits

  • Monitor Windows logs from a remote server
  • Query logs with LogQL (filter errors, warnings, and so on)
  • Automatic alerts (for example, more than 10 errors per day)
  • Custom dashboards (event table, error-rate graph)
  • Centralized logging for multiple Windows hosts

Prerequisites

  • sudo access on the Fedora Server
  • Administrator privileges on Windows 11
  • Stable internet connection (about 500MB of downloads)
  • Basic knowledge of the Linux CLI and PowerShell
  • VMware Workstation/Player installed

Estimated Duration: 45-60 minutes

Difficulty Level: Intermediate

Architecture & Concepts

Architecture Diagram

┌─────────────────────────────────────────────────────┐
│                  Windows 11 Host                     │
│  ┌──────────────┐         ┌──────────────┐         │
│  │ Event Viewer │────────▶│   Promtail   │         │
│  │ (Application)│         │  (Port 9080) │         │
│  │   (System)   │         └──────┬───────┘         │
│  └──────────────┘                │                  │
│         IP: 192.168.1.80         │ Push Logs        │
└──────────────────────────────────┼──────────────────┘
                                   │ HTTP POST
                                   │ :3100/loki/api/v1/push
                    ┌──────────────▼──────────────┐
                    │      VMware NAT Bridge      │
                    │   (VMnet8: 192.168.93.0/24) │
                    └──────────────┬──────────────┘
                                   │
┌──────────────────────────────────▼──────────────────┐
│              Fedora Server 42 VM                     │
│  ┌────────────┐         ┌────────────┐             │
│  │    Loki    │◀────────│  Grafana   │             │
│  │ (Port 3100)│         │ (Port 3000)│             │
│  │   Docker   │         │Native (RPM)│             │
│  └────────────┘         └────────────┘             │
│         IP: 192.168.93.128                          │
└─────────────────────────────────────────────────────┘

Data Flow

  1. Windows writes events to the Application and System event logs
  2. Promtail reads those events through the Windows Event Log API
  3. Promtail pushes the log entries to Loki (HTTP POST)
  4. Loki stores the logs and builds an index over their labels
  5. Grafana queries Loki using LogQL
  6. The user views the dashboard in a browser
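
Step 3 can be reproduced by hand, which is a handy way to confirm that Loki accepts data before Promtail is installed. The following is a minimal sketch, assuming GNU date and curl are available and Loki listens at http://192.168.93.128:3100; the function names build_push_payload and push_test_line, the LOKI_URL variable, and the job label value "manual-test" are illustrative and not part of the original setup.

```shell
# Build a Loki push payload containing one stream with one log line.
# $1 = value for the "job" label, $2 = log line.
# Sketch limitation: neither argument may contain double quotes or backslashes.
build_push_payload() {
  local job="$1" line="$2" ts_ns
  ts_ns="$(date +%s%N)"   # Loki expects the timestamp as a string of Unix epoch nanoseconds
  printf '{"streams":[{"stream":{"job":"%s"},"values":[["%s","%s"]]}]}\n' \
    "$job" "$ts_ns" "$line"
}

# POST one test line to Loki's push endpoint (hypothetical helper).
push_test_line() {
  local loki_url="${LOKI_URL:-http://192.168.93.128:3100}"
  build_push_payload "manual-test" "$1" |
    curl -sS -X POST -H 'Content-Type: application/json' \
      --data-binary @- "${loki_url}/loki/api/v1/push"
}
```

Running push_test_line "hello from curl" should make the line visible in Grafana under {job="manual-test"}; Loki answers a successful push with HTTP 204 and an empty body.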

Device & Environment Setup

Device Specifications

Windows 11 Host (Log Source)

Component   Specification
OS          Windows 11 Pro (Build 10.0.26100.6584)
CPU         Intel Core i7 8th Gen+
RAM         16GB+
Storage     SSD 512GB+ (5GB free)
Network     Wi-Fi (192.168.1.80/24)

Fedora Server VM (Grafana + Loki)

Component   Specification
OS          Fedora Server 42 (Minimal)
vCPU        2 cores
RAM         4GB
Storage     50GB (Thin Provision, 10GB free)
Network     VMware NAT (192.168.93.128/24)

Network Configuration (VMware NAT)

Why NAT?

  • The VM reaches the internet through the host
  • The host reaches the VM through port forwarding
  • Simpler to set up than Bridged mode
  • Better network isolation

Setup Port Forwarding:

  1. Open VMware Workstation/Player
  2. Edit → Virtual Network Editor
  3. Select VMnet8 (NAT)
  4. Click NAT Settings
  5. Click Add to forward a port

Host Port   Guest IP          Guest Port   Description
3000        192.168.93.128    3000         Grafana UI
3100        192.168.93.128    3100         Loki API

Test Network Connectivity

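On the Fedora VM:

ping -c 4 8.8.8.8          # Test internet access
ip addr show ens160        # Check the VM's IP address (the interface name may differ)

On the Windows host:

ping 192.168.93.128        # Test connectivity to the VM
# These port checks fail until Grafana and Loki are installed; rerun them after the server setup
Test-NetConnection -ComputerName 192.168.93.128 -Port 3000
Test-NetConnection -ComputerName 192.168.93.128 -Port 3100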
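
The same port check can be done from a Linux shell, for example on the Fedora VM itself or on another Linux machine in the network. This is a minimal sketch, assuming bash (with its /dev/tcp feature) and the coreutils timeout command are present; the function name port_open is illustrative.

```shell
# Succeed if a TCP connection to host:port can be opened within 2 seconds.
# $1 = host name or IP address, $2 = TCP port
port_open() {
  local host="$1" port="$2"
  # bash opens a TCP socket when redirecting to /dev/tcp/HOST/PORT
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Example:
#   for p in 3000 3100; do
#     if port_open 192.168.93.128 "$p"; then echo "port $p open"; else echo "port $p closed"; fi
#   done
```

A closed or filtered port makes the function return a non-zero status, so it can be used directly in if statements and scripts.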

Server Setup: Fedora VM

Basic System Preparation

Update System

sudo dnf update -y
sudo dnf autoremove -y
sudo reboot

Log back in as the lab-stack user (or another user with sudo rights).

Install Essential Tools

sudo dnf install -y wget nano curl vim git firewalld
sudo systemctl start firewalld
sudo systemctl enable firewalld
sudo systemctl status firewalld

Configure Timezone (Optional)

sudo timedatectl set-timezone Asia/Jakarta
timedatectl  # Verify

Install Docker & Docker Compose

Add Docker Repository

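sudo dnf install -y dnf-plugins-core
# Fedora 41 and later ship dnf5, where config-manager uses the addrepo subcommand
sudo dnf config-manager addrepo --from-repofile=https://download.docker.com/linux/fedora/docker-ce.repo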

Install Docker Engine

sudo dnf install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

Start & Enable Docker

sudo systemctl start docker
sudo systemctl enable docker
sudo systemctl status docker

Add User to Docker Group

sudo usermod -aG docker $USER
newgrp docker  # Refresh group membership without logging out

Verify Installation

docker --version          # Expected: Docker version 20.10+
docker compose version    # Expected: Docker Compose v2.20+
docker run hello-world    # Test run container

SELinux Configuration (Fedora Specific)

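# Allow containers to manage cgroups (only needed for containers that run systemd)
sudo setsebool -P container_manage_cgroup true

# Allow httpd-domain processes to open network connections
# (only relevant if a reverse proxy such as nginx or Apache is later placed in front of Grafana)
sudo setsebool -P httpd_can_network_connect 1

# Verify the booleans
sudo getsebool container_manage_cgroup
sudo getsebool httpd_can_network_connect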

Install Grafana (Native RPM)

Add Grafana Repository

sudo nano /etc/yum.repos.d/grafana.repo

Paste configuration:

[grafana]
name=grafana
baseurl=https://rpm.grafana.com
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://rpm.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt

Save: Ctrl+O → Enter → Ctrl+X

Import GPG Key

sudo rpm --import https://rpm.grafana.com/gpg.key

Install Grafana

sudo dnf install grafana -y

Start & Enable Grafana

sudo systemctl daemon-reload
sudo systemctl start grafana-server
sudo systemctl enable grafana-server
sudo systemctl status grafana-server  # Must show "Active (running)"

Configure Firewall

sudo firewall-cmd --permanent --add-port=3000/tcp
sudo firewall-cmd --reload
sudo firewall-cmd --list-ports  # Verify 3000/tcp listed

Test Grafana Access

From a browser on Windows:

http://192.168.93.128:3000

Default Login:

  • Username: admin
  • Password: admin
  • Change the password on first login

Install & Configure Loki with Docker Compose

Create Project Directory

mkdir -p ~/loki-stack && cd ~/loki-stack

Download Official Configuration Files

# Download the Docker Compose file (pinned to the v3.1.1 tag so it matches the Loki version used here)
wget https://raw.githubusercontent.com/grafana/loki/v3.1.1/production/docker-compose.yaml -O docker-compose.yaml

# Download the Loki configuration
wget https://raw.githubusercontent.com/grafana/loki/v3.1.1/cmd/loki/loki-local-config.yaml -O loki-local-config.yaml

Edit Loki Configuration

nano loki-local-config.yaml

Configuration template:

auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

common:
  instance_addr: 127.0.0.1
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2025-10-03  # Replace with today's date or an earlier one
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

storage_config:
  filesystem:
    directory: /loki/chunks

limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h  # 7 days
  ingestion_rate_mb: 10
  ingestion_burst_size_mb: 20
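
The from date in schema_config marks when the schema takes effect, so it should not lie in the future. A small check like the following can catch a mistyped date before Loki is started. It is a sketch that assumes GNU date and the exact "- from: YYYY-MM-DD" layout used in the template above; the function names schema_from_date and schema_date_ok are illustrative.

```shell
# Print the first "from" date found in a Loki config file.
# $1 = path to the config file; expects a line of the form "- from: YYYY-MM-DD".
schema_from_date() {
  awk '$1 == "-" && $2 == "from:" { print $3; exit }' "$1"
}

# Succeed when the given YYYY-MM-DD date is today or earlier (compared in UTC).
# Returns 2 if the date cannot be parsed.
schema_date_ok() {
  local from="$1" from_s today_s
  from_s="$(date -u -d "$from" +%s 2>/dev/null)" || return 2
  today_s="$(date -u +%s)"
  [ "$from_s" -le "$today_s" ]
}

# Example:
#   schema_date_ok "$(schema_from_date loki-local-config.yaml)" && echo "schema date OK"
```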

Edit Docker Compose File

nano docker-compose.yaml

Modified configuration:

services:
  loki:
    image: grafana/loki:3.1.1
    container_name: loki
    ports:
      - "3100:3100"
    volumes:
      - ./loki-local-config.yaml:/etc/loki/local-config.yaml:ro
      - loki-storage:/loki
    command: -config.file=/etc/loki/local-config.yaml
    restart: unless-stopped
    networks:
      - loki-net

networks:
  loki-net:
    driver: bridge

volumes:
  loki-storage:
    driver: local

Fix SELinux Context

sudo chcon -Rt svirt_sandbox_file_t ~/loki-stack/

Start Loki Stack

docker compose up -d

Verify Loki Running

# Check container status
docker compose ps

# Check logs
docker compose logs loki

# Expected output:
# "Server listening on :3100"
# "Loki started"

Configure Firewall for Loki

sudo firewall-cmd --permanent --add-port=3100/tcp
sudo firewall-cmd --reload
sudo firewall-cmd --list-ports  # Verify 3100/tcp listed

Test Loki Endpoint

From the Fedora VM:

curl http://localhost:3100/ready
# Expected: "ready" (shortly after startup Loki may still answer "Ingester not ready"; retry after a few seconds)

curl http://localhost:3100/metrics
# Expected: Prometheus metrics output

From the Windows host:

Invoke-WebRequest -Uri http://192.168.93.128:3100/ready
# Expected: StatusCode 200
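
Because Loki needs a short warm-up before /ready reports success, a polling loop is more reliable than a single curl call in scripts. The following is a minimal sketch run on the Fedora VM; the function name wait_for_loki and its default values (30 attempts, 2 seconds apart) are assumptions chosen for illustration.

```shell
# Poll Loki's /ready endpoint until it answers "ready" or the attempts run out.
# $1 = base URL (default http://localhost:3100)
# $2 = maximum number of attempts (default 30)
# $3 = seconds to wait between attempts (default 2)
wait_for_loki() {
  local url="${1:-http://localhost:3100}" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors such as 503 while Loki is still starting
    if [ "$(curl -fsS "${url}/ready" 2>/dev/null)" = "ready" ]; then
      echo "Loki is ready after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Loki did not become ready after ${attempts} attempts" >&2
  return 1
}
```

A typical use is docker compose up -d && wait_for_loki, so that later steps only run once Loki accepts requests.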

Target Setup: Windows 11

Prepare Windows Environment

Create Promtail Directory

Via PowerShell:

New-Item -Path "C:\promtail" -ItemType Directory -Force

Verify PowerShell Version

$PSVersionTable.PSVersion
# Expected: 5.1 or higher

Install & Configure Promtail

Download Promtail Binary

Download from GitHub Releases: promtail-windows-amd64.exe.zip

Via PowerShell (Alternative):

cd C:\promtail
$url = "https://github.com/grafana/loki/releases/download/v3.1.1/promtail-windows-amd64.exe.zip"
Invoke-WebRequest -Uri $url -OutFile "promtail.zip"
Expand-Archive -Path "promtail.zip" -DestinationPath "." -Force
Remove-Item "promtail.zip"

Create Configuration File

Open Notepad and paste the following configuration:

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: "C:\\promtail\\positions.yaml"

clients:
  - url: http://192.168.93.128:3100/loki/api/v1/push
    batchwait: 5s
    batchsize: 102400

scrape_configs:
  - job_name: windows-application
    windows_events:
      locale: 1033
      use_incoming_timestamp: true
      exclude_event_data: false
      exclude_user_data: false
      bookmark_path: "C:\\promtail\\bookmark-application.xml"
      eventlog_name: "Application"
      xpath_query: '*'
      labels:
        logsource: windows-eventlog
        job: windows-application
        host: windows-11-main

  - job_name: windows-system
    windows_events:
      locale: 1033
      use_incoming_timestamp: true
      exclude_event_data: false
      exclude_user_data: false
      bookmark_path: "C:\\promtail\\bookmark-system.xml"
      eventlog_name: "System"
      xpath_query: '*'
      labels:
        logsource: windows-eventlog
        job: windows-system
        host: windows-11-main

CRITICAL NOTES:

  • IP Address: Replace 192.168.93.128 with the IP of your Fedora VM
  • Path Escaping: Inside double-quoted YAML strings, write each backslash of a Windows path as \\ (an unquoted or single-quoted value takes single backslashes)
  • Encoding: Save as UTF-8
  • Filename: promtail-config.yaml (in C:\promtail)

Test Promtail Configuration

PowerShell as Administrator:

cd C:\promtail
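.\promtail-windows-amd64.exe -config.file=promtail-config.yaml --dry-run

With --dry-run, Promtail starts normally but prints the entries it reads instead of sending them to Loki; stop it with Ctrl+C. To validate only the syntax of the file, use the -check-syntax flag instead.
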
Expected Output:

level=info msg="Clients configured"
level=info msg="Reloading configuration file" file=promtail-config.yaml

Run Promtail Manual Test

Command Prompt as Administrator:

cd C:\promtail
promtail-windows-amd64.exe -config.file=promtail-config.yaml --log.level=debug

Expected Output:

level=info msg="Starting Promtail"
level=info msg="Subscribed with handle id" handle_id=2
level=info msg="server listening on :9080"

Setup Promtail Auto-Start

Via Task Scheduler:

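  1. Win + R → taskschd.msc → Enter
  2. Right-click → Create Basic Task
  3. Name: Promtail Auto Start
  4. Trigger: When the computer starts
  5. Action: Start a program
  6. Program: C:\promtail\promtail-windows-amd64.exe
  7. Arguments: -config.file=C:\promtail\promtail-config.yaml
  8. Start in: C:\promtail
  9. Tick "Open the Properties dialog for this task when I click Finish", then click Finish
  10. In Properties → General, check Run with highest privileges and select Run whether user is logged on or not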

Test Scheduled Task

# Run task manually
schtasks /run /tn "Promtail Auto Start"

# Check if running
Get-Process promtail* -ErrorAction SilentlyContinue

Integration & Testing

Add Loki Data Source to Grafana

  1. Open a browser: http://192.168.93.128:3000
  2. Log in (admin and the password set at first login)
  3. Menu → Connections → Data sources
  4. Add new data source
  5. Search: Loki → Select

Settings:

  • Name: Loki-Windows
  • URL: http://localhost:3100
  • Access: Server (default)
  • Timeout: 60s

Save & test button - Expected: ✅ "Data source is working"

Explore & Query Logs

Open Explore Interface:

  1. Menu → Explore
  2. Data source: Select Loki-Windows

Basic LogQL Queries

Query 1: All Windows System Logs

{job="windows-system"}

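Query 2: Filter Errors Only

{job="windows-system"} |~ "(?i)error"

The |= operator matches case-sensitively, so |= "error" would miss lines that contain "Error"; the (?i) regex flag makes the match case-insensitive.

Query 3: Count Errors per Minute

sum(count_over_time({job="windows-system"} |~ "(?i)error" [1m]))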

Query 4: Multiple Jobs

{logsource="windows-eventlog"}
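
The same LogQL expressions can also be run against Loki's HTTP API without Grafana, which helps to tell Loki problems apart from Grafana problems. This is a minimal sketch run on the Fedora VM, assuming curl is installed; the function name loki_query and the LOKI_URL variable are illustrative. Without explicit start and end parameters, query_range covers the last hour.

```shell
# Run a LogQL query against Loki's query_range endpoint and print the raw JSON response.
# $1 = LogQL expression, $2 = maximum number of lines to return (default 10)
loki_query() {
  local loki_url="${LOKI_URL:-http://localhost:3100}"
  # -G sends the url-encoded parameters as a GET query string
  curl -fsS -G "${loki_url}/loki/api/v1/query_range" \
    --data-urlencode "query=$1" \
    --data-urlencode "limit=${2:-10}"
}

# Example:
#   loki_query '{job="windows-system"}' 5
```

An empty result list in the response while Promtail is running usually points to a label mismatch or a time-range issue rather than a Grafana problem.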

Create Basic Dashboard

  1. Menu → Dashboards → New Dashboard
  2. Add visualization
  3. Data source: Loki-Windows

Panel 1: Recent Logs Table

  • Query: {logsource="windows-eventlog"}
  • Visualization: Logs
  • Panel title: Recent Windows Events

Panel 2: Error Rate Graph

  • Query: sum(rate({job=~"windows-.*"} |= "error" [5m]))
  • Visualization: Time series
  • Panel title: Error Rate (per second, 5 min window)

End-to-End Testing

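Generate a Test Event (PowerShell as Administrator):

# Write a test entry to the Application log
eventcreate /T ERROR /ID 100 /L APPLICATION /SO PromtailTest /D "Promtail end-to-end test"

# Wait 30 seconds
Start-Sleep -Seconds 30

Check in Grafana Explore:

{job="windows-application"} |= "Promtail end-to-end test"

Expected: The test event appears in the results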

Troubleshooting

Problem 1: Grafana - No Logs Appearing

Symptoms: Query returns 0 results, "No logs found" message

Solutions:

  1. Check Time Range: Set it to "Last 5 minutes" or "Last 15 minutes"
  2. Verify Data Source: Connections → Data sources → Loki-Windows → Test
  3. Check Loki Receiving Data:
    # On the Fedora VM
    curl http://localhost:3100/loki/api/v1/labels
    # Expected: List of labels
  4. Generate New Events (PowerShell as Administrator on Windows):
    eventcreate /T WARNING /ID 101 /L SYSTEM /SO PromtailTest /D "Promtail troubleshooting test"

Problem 2: Promtail - YAML Syntax Error

Symptoms: "yaml: line X: mapping values are not allowed"

Solutions:

  1. Check Indentation: Use spaces, NOT tabs (2 spaces per level)
  2. Verify Path Escaping:
    # ✅ Correct: doubled backslashes inside double quotes
    filename: "C:\\promtail\\positions.yaml"
    
    # ❌ Wrong: single backslashes inside double quotes are read as escape sequences
    filename: "C:\promtail\positions.yaml"
  3. Test Config:
    promtail-windows-amd64.exe -config.file=promtail-config.yaml -check-syntax
  4. Re-create Config: Save as UTF-8 encoding

Problem 3: Loki - Container Mount Error (SELinux)

Symptoms: "Permission denied" when running docker compose up

Solutions:

# Fix SELinux Context
sudo chcon -Rt svirt_sandbox_file_t ~/loki-stack/

# Set SELinux Booleans
sudo setsebool -P container_manage_cgroup true
sudo setsebool -P httpd_can_network_connect 1

# Check SELinux Status
getenforce  # Should be "Enforcing"

Problem 4: Connection Refused (Windows → Loki)

Symptoms: Promtail error: "connection refused"

Solutions:

  1. Verify Loki Running:
    # On the Fedora VM
    docker compose ps
    docker compose logs loki | grep "listening"
  2. Check Firewall:
    sudo firewall-cmd --list-ports  # Must include 3100/tcp
    sudo firewall-cmd --permanent --add-port=3100/tcp
    sudo firewall-cmd --reload
  3. Test with PowerShell:
    Test-NetConnection -ComputerName 192.168.93.128 -Port 3100

Problem 5: Promtail - No Events Scraped

Symptoms: Promtail is running, but reports "scraped 0 events"

Solutions:

  1. Stop Promtail, then delete the bookmark files so it starts reading again without a saved position:
    cd C:\promtail
    del bookmark-*.xml
    del positions.yaml
  2. Run as Administrator: Right-click Command Prompt → Run as Administrator
  3. Check Event Log Access:
    Get-WinEvent -LogName System -MaxEvents 5
    Get-WinEvent -LogName Application -MaxEvents 5
  4. Generate Test Events (as Administrator):
    eventcreate /T INFORMATION /ID 102 /L APPLICATION /SO PromtailTest /D "Promtail scrape test"

Optimization Tips

1. Performance Tuning

Loki Configuration

# loki-local-config.yaml - Optimized (merge these keys into the existing limits_config block instead of adding a second one)
limits_config:
  split_queries_by_interval: 15m
  max_query_parallelism: 32
  max_entries_limit_per_query: 5000
  max_cache_freshness_per_query: 10m

query_scheduler:
  max_outstanding_requests_per_tenant: 256

frontend:
  max_outstanding_per_tenant: 256
  compress_responses: true

Promtail Batch Settings

# promtail-config.yaml - Optimized
clients:
  - url: http://192.168.93.128:3100/loki/api/v1/push
    batchwait: 3s      # Faster push
    batchsize: 204800  # 200KB batch
    timeout: 30s
    backoff_config:
      min_period: 500ms
      max_period: 5m
      max_retries: 10
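
To get a feeling for what these backoff values mean, the retry delays can be printed as an idealised schedule: start at min_period, double after every failed attempt, and never exceed max_period. This is only a sketch of plain exponential backoff; Promtail's real client may add random jitter, which is ignored here. The function name backoff_schedule is illustrative.

```shell
# Print the wait before each retry in milliseconds.
# $1 = min_period in ms, $2 = max_period in ms, $3 = max_retries
backoff_schedule() {
  local delay="$1" max="$2" retries="$3" i=1
  while [ "$i" -le "$retries" ]; do
    echo "$delay"
    delay=$((delay * 2))            # exponential growth
    if [ "$delay" -gt "$max" ]; then
      delay="$max"                  # cap at max_period
    fi
    i=$((i + 1))
  done
}

# Example with the settings above (500ms, 5m = 300000ms, 10 retries):
#   backoff_schedule 500 300000 10
```

With the settings above the last delay is 256000 ms, so the 5 minute cap is never reached and all ten retries together span roughly eight and a half minutes before the batch is dropped.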

2. Advanced LogQL Queries

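Query 1: Top 10 Error Sources

topk(10, 
  sum by (source) (
    count_over_time({job="windows-system"} |= "error" | json [1h])
  )
)

The | json stage extracts fields such as source from the JSON line that Promtail emits; without it there is no source label to group by.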

Query 2: Event Rate per Event ID

sum by (event_id) (
  rate({job="windows-system"} | json [5m])
)

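Query 3: Filter by Event Level

{job="windows-system"} 
  | json
  | levelText =~ "Error|Critical"

This assumes the level name is exposed in the levelText field of Promtail's JSON output; check a sample line in Explore and adjust the field name if it differs.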

3. Alerting Setup

Create Alert Rule in Grafana:

  1. Menu → Alerting → Alert rules
  2. New alert rule

Alert Configuration:

Name: High Error Rate Windows

Query:
  sum(rate({job=~"windows-.*"} |= "error" [5m])) > 10

Conditions:
  - WHEN avg() OF query(A, 5m, now) IS ABOVE 10

Evaluate every: 1m
For: 5m

Labels:
  severity: warning
  team: infrastructure

Annotations:
  summary: High error rate detected on Windows
  description: Error rate is {{ $value }} errors/sec

4. Security Hardening

Enable Grafana HTTPS

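# Generate self-signed cert
sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout /etc/grafana/grafana.key \
  -out /etc/grafana/grafana.crt

# Let the grafana user read the certificate and key
sudo chown grafana:grafana /etc/grafana/grafana.crt /etc/grafana/grafana.key
sudo chmod 400 /etc/grafana/grafana.key /etc/grafana/grafana.crt

# Edit Grafana config, set the following under [server], then restart Grafana
sudo nano /etc/grafana/grafana.ini
[server]
protocol = https
http_port = 3000
cert_file = /etc/grafana/grafana.crt
cert_key = /etc/grafana/grafana.key

sudo systemctl restart grafana-server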

Firewall Restrictions

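# Allow only the Windows host to reach Loki.
# Remove the open port rule added earlier, otherwise the rich rule restricts nothing.
sudo firewall-cmd --permanent --remove-port=3100/tcp
# Through VMnet8 the VM usually sees the host's VMware adapter address rather than
# its Wi-Fi address (192.168.1.80); adjust the source address to what the VM actually sees.
sudo firewall-cmd --permanent --add-rich-rule='
  rule family="ipv4"
  source address="192.168.1.80"
  port port="3100" protocol="tcp" accept'
sudo firewall-cmd --reload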

5. Backup & Recovery

Backup Loki Data

# Stop Loki
docker compose stop loki

# Backup volume
sudo tar -czf ~/loki-backup-$(date +%Y%m%d).tar.gz \
  ~/loki-stack/loki-local-config.yaml \
  /var/lib/docker/volumes/loki-stack_loki-storage

# Restart Loki
docker compose start loki
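
Before relying on an archive it is worth checking that it is readable and really contains both the configuration file and the volume data. The following is a minimal sketch of such a check; the function name verify_backup is illustrative, and the matched names assume the paths used in the backup command above.

```shell
# Succeed when the archive can be listed and contains the Loki config and the volume directory.
# $1 = path to the .tar.gz backup
verify_backup() {
  local archive="$1" listing
  listing="$(tar -tzf "$archive" 2>/dev/null)" || return 1     # unreadable or corrupt archive
  echo "$listing" | grep -q 'loki-local-config\.yaml$' || return 1
  echo "$listing" | grep -q 'loki-stack_loki-storage' || return 1
}

# Example:
#   verify_backup ~/loki-backup-20251003.tar.gz && echo "backup looks complete"
```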

Restore Procedure

# Stop Loki before overwriting its data
docker compose stop loki

# Restore Loki data
sudo tar -xzf ~/loki-backup-20251003.tar.gz -C /

# Start services again
docker compose start loki
sudo systemctl restart grafana-server

6. Monitoring the Monitor

Check Loki Metrics

# Ingestion rate
curl http://localhost:3100/metrics | grep loki_distributor_bytes_received_total

# Query performance
curl http://localhost:3100/metrics | grep loki_query_duration_seconds

Check Promtail Health

# Windows - Check Promtail targets
Invoke-WebRequest -Uri http://localhost:9080/targets
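
The /metrics output lists one sample per label combination, so a counter such as loki_distributor_bytes_received_total can appear on several lines. A small awk helper can add them up. This is a sketch that assumes the plain Prometheus text format without trailing timestamps; the function name metric_sum is illustrative.

```shell
# Sum every sample of one metric from Prometheus text-format input on stdin.
# $1 = metric name; label sets such as {tenant="fake"} are ignored when matching.
# Exits with status 1 if the metric does not occur at all.
metric_sum() {
  awk -v name="$1" '
    /^#/ { next }                     # skip HELP and TYPE comment lines
    {
      metric = $1
      sub(/\{.*/, "", metric)         # strip the label set, if any
      if (metric == name) { total += $NF; found = 1 }
    }
    END { if (found) printf "%.0f\n", total; else exit 1 }
  '
}

# Example (on the Fedora VM):
#   curl -s http://localhost:3100/metrics | metric_sum loki_distributor_bytes_received_total
```

Calling it twice a minute apart and comparing the two totals gives a rough ingestion rate in bytes per minute.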

Conclusion

This setup provides a Windows log monitoring solution that is:

  • Scalable - Can handle multiple Windows hosts
  • Real-time - Logs typically appear in Grafana within about 10 seconds
  • Flexible - Queried with the expressive LogQL language
  • Cost-effective - All tools are open source
  • Production-ready - Includes backup, alerting, and security hardening

Next Steps

  • Add more Windows hosts
  • Setup email alerting
  • Create custom dashboards per team
  • Implement log retention policy
  • Enable HTTPS for Grafana
  • Set up log aggregation from custom applications
  • Integrate with Prometheus for metrics