Real-time Log Collection and Analysis Case Study

A case study on building a monitoring system that collects API Gateway logs in real time using a Kafka-based streaming architecture, stores them in StarRocks, and visualizes them with Apache Superset. The result is improved service stability and faster incident response.


Table of Contents

  1. Overview
  2. Log Collection
  3. Log Data Sink
  4. Visualization
  5. Conclusion

1. Overview

This document covers a case study of building a system for real-time collection and analysis of PAASUP DIP's API Gateway logs. Using a Kafka-based message streaming architecture, log data is stored in a Target DB, and service status can be monitored in real time through a BI tool.

Implementation Environment

  • Message Broker: kafka-cluster
  • Log Producer: fluentd
  • Kafka Connector: StarRocks Sink Connector
  • Target DB: StarRocks
  • Kafka Monitoring: Kafbat UI
  • BI Tool: Apache Superset

Real-time Log Collection Architecture

(Screenshot: real-time log collection architecture diagram — 2025-12-12 000616.png)


2. Log Collection

API Gateway Log Message Format Configuration

You can define the log message format in the values.yaml file when deploying the API Gateway.

values.yaml Configuration Example

env:
  proxy_access_log: "/dev/stdout custom_fmt" 
  # Custom log format definition
  nginx_http_log_format: >-
    custom_fmt '$remote_addr [$time_local] "$request" $status $request_time $upstream_response_time "$http_host" "$http_user_agent"'
  real_ip_header: "X-Forwarded-For"
  real_ip_recursive: "on"

Log Format Variable Descriptions

  • $remote_addr (Client IP): actual client IP that sent the request; real_ip configuration is needed when a proxy or LB sits in front
  • $time_local (request time, local timezone): server local time in [10/Dec/2025:10:23:10 +0900] format
  • $request (request line): complete request string, e.g. "GET /api/v1/foo HTTP/1.1"
  • $status (HTTP response code): status code returned to the client, such as 200, 404, 500
  • $request_time (total request processing time): time from request receipt to response completion, in seconds
  • $upstream_response_time (upstream response time): backend service response time in seconds, used for backend delay analysis
  • $http_host (original Host header): Host value from the client request, used to identify the Ingress domain
  • $http_user_agent (User-Agent): UA string identifying browser/CLI/bot, used for security analysis and debugging
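Given the custom_fmt definition above, each access-log line can be split back into these fields. A minimal Python sketch of that parsing (the sample line below is hypothetical, following the format's layout):

```python
import re

# Regex mirroring the custom_fmt nginx log format defined in values.yaml
LOG_PATTERN = re.compile(
    r'(?P<remote_addr>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) '
    r'(?P<request_time>[\d.]+) (?P<upstream_response_time>[-\d.]+) '
    r'"(?P<http_host>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)

# Hypothetical log line laid out exactly as custom_fmt would emit it
sample = ('10.0.0.7 [10/Dec/2025:10:23:10 +0900] "GET /api/v1/foo HTTP/1.1" '
          '200 0.012 0.008 "demo-api.example.com" "curl/8.4.0"')

fields = LOG_PATTERN.match(sample).groupdict()
print(fields["status"], fields["request_time"], fields["http_host"])
```

The same field boundaries are what the Superset dataset query later extracts with regexp_extract and split_part.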

API Gateway Log Collection and Kafka Transmission Process

In PAASUP DIP, logs are collected through the following procedure and messages are sent to the Kafka logging.kong topic.

  1. Kong application generates container logs
  2. Fluent-bit DaemonSet collects logs from all nodes
  3. Flow CR filters only logs with Kong labels
  4. Fluentd StatefulSet processes logs into JSON format
  5. Transmission to Kafka topic logging.kong through ClusterOutput
  6. Secure communication with Kafka using SCRAM-SHA-512 authentication and TLS

Note: The above process is automatically configured as the platform's default logging pipeline.
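Based on the table schema used for the sink later in this document, a JSON message arriving on logging.kong can be assumed to look roughly like the following (all field values are hypothetical):

```python
import json

# Hypothetical message on the logging.kong topic; the top-level keys follow
# the kong_log_events table schema used by the StarRocks sink below
event = {
    "time": "2025-12-12T01:23:10Z",   # UTC timestamp attached by the pipeline
    "stream": "stdout",
    "logtag": "F",
    "message": ('10.0.0.7 [10/Dec/2025:10:23:10 +0900] '
                '"GET /api/v1/foo HTTP/1.1" 200 0.012 0.008 '
                '"demo-api.example.com" "curl/8.4.0"'),
    "kubernetes": {"pod_name": "kong-gateway-0", "labels": {"app": "kong"}},
    "kubernetes_namespace": {"name": "kong"},
}

payload = json.dumps(event)   # what Fluentd would publish to Kafka
decoded = json.loads(payload)
print(sorted(decoded.keys()))
```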

Log Collection Verification

You can view JSON messages from the logging.kong topic in real-time through Kafbat UI.

(Screenshot: JSON messages on the logging.kong topic in Kafbat UI — 2025-12-12 001255.png)


3. Log Data Sink

Target Table Creation

Create a table in the Target DB (StarRocks) matching the JSON message structure of the logging.kong topic.

Key Considerations:

  • The time column is in UTC, so add a kst_time computed column to convert to Korean Standard Time
  • The kubernetes and kubernetes_namespace fields are nested JSON structures, so declare them with the JSON data type

USE quickstart;

CREATE TABLE IF NOT EXISTS kong_log_events (
    time DATETIME,
    stream STRING,
    logtag STRING,
    message STRING,
    kubernetes JSON,
    kubernetes_namespace JSON,
    kst_time DATETIME AS convert_tz(time, 'UTC', 'Asia/Seoul') 
) ENGINE = OLAP
DUPLICATE KEY(time)
DISTRIBUTED BY HASH(kst_time) BUCKETS 10;
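The kst_time computed column applies the same conversion as convert_tz(time, 'UTC', 'Asia/Seoul'). The equivalent logic, sketched with Python's standard-library zoneinfo:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Equivalent of StarRocks convert_tz(time, 'UTC', 'Asia/Seoul'):
# interpret the stored timestamp as UTC, then shift it to KST (UTC+9)
utc_time = datetime(2025, 12, 12, 1, 23, 10, tzinfo=ZoneInfo("UTC"))
kst_time = utc_time.astimezone(ZoneInfo("Asia/Seoul"))
print(kst_time.isoformat())  # 2025-12-12T10:23:10+09:00
```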

StarRocks Sink Connector Creation and Data Loading

You can easily create a Kafka Connector from the DIP catalog creation menu.

Creation Procedure:

  1. Click Create kafka-connector in the Catalog Creation menu
  2. Select StarRocks Sink(Json) from Connector types
  3. Enter required information:
    • Topic name
    • StarRocks Namespace
    • StarRocks Database
    • StarRocks Username
    • StarRocks Password
    • Topic to table mapping information
  4. Click Create button

(Screenshot: Kafka Connector creation form in the DIP catalog — 2025-12-15 105724.png)

Immediately after Connector creation, Kafka messages are consumed and data is loaded in real-time into StarRocks' kong_log_events table.

Data Loading Verification

You can query the loaded data from SQL Client.

SELECT * FROM quickstart.kong_log_events;   

(Screenshot: query results for kong_log_events — 2025-12-12 001709.png)


4. Visualization

Step 1: Superset Dataset Creation

Since the raw log data is stored as a string in the message column, regular-expression functions must be used to extract the fields needed for analysis. Repeating complex regex queries in every chart causes performance degradation and maintenance difficulties, so we manage them centrally with a Virtual Dataset.

Dataset Creation Method:

  1. Execute the query below in SQL Lab
  2. Click Save as Dataset
  3. Dataset name: kong_parsed_logs
  4. Use this Dataset as the data source for all charts

SELECT 
    kst_time,
    split_part(message, ' ', 1) AS client_ip,
    regexp_extract(message, 'HTTP/[0-9.]+" [0-9]+ ([0-9.]+)', 1) AS response_time,
    regexp_extract(message, 'HTTP/[0-9.]+" [0-9]+ [-0-9. ]+ "(.*?)"', 1) AS host_domain,
    CASE 
        WHEN regexp_extract(message, 'HTTP/[0-9.]+" [0-9]+ [-0-9. ]+ "(.*?)"', 1) LIKE '%-%' 
        THEN split_part(regexp_extract(message, 'HTTP/[0-9.]+" [0-9]+ [-0-9. ]+ "(.*?)"', 1), '-', 1)
        ELSE 'platform' 
    END AS project_name,
    split_part(regexp_extract(message, '"(GET|POST|PUT|DELETE|HEAD|OPTIONS) (.*?) HTTP', 2), '?', 1) AS request_url, 
    regexp_extract(message, 'HTTP/[0-9.]+" ([0-9]{3})', 1) AS status_code,
    message
FROM 
    quickstart.kong_log_events
WHERE 
    stream = 'stdout'
    AND message LIKE '%HTTP/%' -- Filter Access Log only
    AND message REGEXP '^[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}'  -- Filter logs starting with IP only
    AND split_part(message, ' ', 1) != '127.0.0.1'  -- Exclude Kong health check
    -- Inject dynamic filter using Jinja Template
    {% if from_dttm %}
    AND kst_time >= '{{ from_dttm }}'
    {% endif %}
    {% if to_dttm %}
    AND kst_time < '{{ to_dttm }}'
    {% endif %}
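The project_name derivation in the query above maps a domain containing '-' to the part before the first '-', and anything else to the catch-all 'platform' project. A Python sketch of that CASE expression (the sample domains are hypothetical, assuming a <project>-<service> naming convention):

```python
def project_name(host_domain: str) -> str:
    """Mirror the CASE expression in the dataset query:
    split_part(host_domain, '-', 1) when the domain contains '-',
    otherwise the catch-all 'platform' project."""
    if "-" in host_domain:
        return host_domain.split("-", 1)[0]
    return "platform"

# Hypothetical domains following the assumed naming convention
print(project_name("demo-api.example.com"))  # demo
print(project_name("console.example.com"))   # platform
```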

Step 2: Dashboard Chart Configuration

(Screenshot: Superset dashboard charts — 2025-12-12 001835.png)

Create various monitoring charts using the kong_parsed_logs Dataset.

1) Response Time Trend

  • Chart Type: Time-series Line Chart
  • Time Column: kst_time
  • Metrics: AVG(response_time), MAX(response_time)
  • Description: Monitor response time trends in real-time

2) Requests Trend

  • Chart Type: Big Number
  • Metric: COUNT(*)
  • Time Grain: HOUR
  • Description: Total request count trend per hour

3) RPS by Status Code

  • Chart Type: Time-series Chart (Stacked Area)
  • Time Column: kst_time
  • Dimensions: status_code
  • Metrics: COUNT(*)
  • Description: Visualize request distribution by status code

4) Error Rate (%)

  • Chart Type: Big Number with Trendline
  • Custom Metric:
    SUM(CASE WHEN status_code >= 400 THEN 1 ELSE 0 END) * 100.0 / COUNT(*)
    
  • Time Grain: HOUR
  • Description: Monitor error rate (%) trend
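The custom metric above is simply the share of 4xx/5xx responses among all responses. The same arithmetic in Python (using integer status codes for simplicity, with hypothetical sample data):

```python
def error_rate(status_codes: list[int]) -> float:
    """SUM(CASE WHEN status_code >= 400 THEN 1 ELSE 0 END) * 100.0 / COUNT(*)"""
    errors = sum(1 for code in status_codes if code >= 400)
    return errors * 100.0 / len(status_codes)

# Hypothetical sample: 2 errors (404, 500) out of 8 requests -> 25.0 %
codes = [200, 200, 301, 404, 200, 500, 200, 204]
print(error_rate(codes))  # 25.0
```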

5) Traffic Share by Project

  • Chart Type: Pie Chart
  • Dimensions: project_name
  • Metric: COUNT(*)
  • Description: Analyze traffic share by project

6) Top 10 Slowest Request URL

  • Chart Type: Table
  • Dimensions: concat(host_domain, request_url)
  • Metrics: AVG(response_time), COUNT(*)
  • Sort By: AVG(response_time) Descending
  • Row Limit: 10
  • Description: Identify request URLs with slowest average response time

7) Top 10 Errors by Request URL

  • Chart Type: Table
  • Dimensions: concat(host_domain, request_url)
  • Metrics:
    • COUNT(*) (Total Requests)
    • Custom Metric: SUM(CASE WHEN status_code >= 400 THEN 1 ELSE 0 END) (Error Count)
  • Sort By: Error Count Descending
  • Row Limit: 10
  • Description: Identify URLs with most errors

8) Error List

  • Query Mode: RAW RECORDS
  • Filters: status_code >= 400
  • Columns: kst_time, status_code, response_time, message
  • Ordering: kst_time DESC
  • Description: View detailed messages of recent error logs

Step 3: Dashboard Filter Application (Native Filters)

Add a Time Range Filter to dynamically control the entire dashboard.

  • Filter Type: Time Range
  • Default Value: Current day
  • Scope: All charts

Performance Optimization Tips

In large-scale traffic environments (thousands of requests per second or more), querying the raw table (quickstart.kong_log_events) directly can slow down dashboard response time.

Recommended Optimization Methods:

  • Use StarRocks' Materialized View to create 10-minute aggregation tables
  • Connect aggregation tables as Superset Datasets
  • Maintain real-time nature while significantly improving query performance
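The 10-minute pre-aggregation that such a materialized view would maintain amounts to truncating each timestamp to its 10-minute bucket and counting within buckets. A Python sketch of that bucketing logic (event timestamps are hypothetical):

```python
from collections import Counter
from datetime import datetime

def bucket_10min(ts: datetime) -> datetime:
    """Truncate a timestamp to its 10-minute bucket, as a materialized
    view grouping on time-truncated buckets would."""
    return ts.replace(minute=ts.minute - ts.minute % 10, second=0, microsecond=0)

# Hypothetical request timestamps
events = [datetime(2025, 12, 12, 10, 3), datetime(2025, 12, 12, 10, 7),
          datetime(2025, 12, 12, 10, 14)]
counts = Counter(bucket_10min(ts) for ts in events)
print(counts[datetime(2025, 12, 12, 10, 0)])   # 2
print(counts[datetime(2025, 12, 12, 10, 10)])  # 1
```

Charts then read these small pre-computed buckets instead of scanning every raw log row.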

5. Conclusion

This document walked through building a complete pipeline that collects API Gateway logs in real time using a Kafka-based streaming architecture, stores them in StarRocks, and visualizes them with Apache Superset.

Key Achievements

  • Real-time Monitoring: Real-time identification of API Gateway request patterns, response times, error rates, etc.
  • Reduced Incident Response Time: Fast root cause analysis through real-time detailed logs when errors occur
  • Data-driven Decision Making: Resource optimization and capacity planning through project-specific traffic analysis
  • Scalable Architecture: Stable performance even for high-volume log processing through combination of Kafka and StarRocks

Future Improvement Directions

  • Machine Learning-based Anomaly Detection: Automatic detection of abnormal traffic compared to normal patterns
  • Multi-cluster Support: Integrated monitoring of logs from multiple Kubernetes clusters

This real-time log analysis system greatly contributes to service stability improvement and operational efficiency, and can establish itself as core infrastructure for DevOps culture adoption.
