Saved in:
Bibliographic Details
Main Authors: Singh, Satyam, Ramachandran, Sai Niranjan
Format: Preprint
Published: 2026
Subjects:
Online Access:https://arxiv.org/abs/2601.00633
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866911350684385280
author Singh, Satyam
Ramachandran, Sai Niranjan
author_facet Singh, Satyam
Ramachandran, Sai Niranjan
contents Real-time log analysis is the cornerstone of observability for modern infrastructure. However, existing online parsers are architecturally unsuited for the dynamism of production environments. Built on fundamentally static template models, they are dangerously brittle: minor schema drifts silently break parsing pipelines, leading to lost alerts and operational toil. We propose \textbf{KELP} (\textbf{K}elp \textbf{E}volutionary \textbf{L}og \textbf{P}arser), a high-throughput parser built on a novel data structure: the Evolutionary Grouping Tree. Unlike heuristic approaches that rely on fixed rules, KELP treats template discovery as a continuous online clustering process. As logs arrive, the tree structure evolves, nodes split, merge, and re-evaluate roots based on changing frequency distributions. Validating this adaptability requires a dataset that models realistic production complexity, yet we identify that standard benchmarks rely on static, regex-based ground truths that fail to reflect this. To enable rigorous evaluation, we introduce a new benchmark designed to reflect the structural ambiguity of modern production systems. Our evaluation demonstrates that KELP maintains high accuracy on this rigorous dataset where traditional heuristic methods fail, without compromising throughput. Our code and dataset can be found at codeberg.org/stonebucklabs/kelp
format Preprint
id arxiv_https___arxiv_org_abs_2601_00633
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle KELP: Robust Online Log Parsing Through Evolutionary Grouping Trees
Singh, Satyam
Ramachandran, Sai Niranjan
Databases
Software Engineering
Real-time log analysis is the cornerstone of observability for modern infrastructure. However, existing online parsers are architecturally unsuited for the dynamism of production environments. Built on fundamentally static template models, they are dangerously brittle: minor schema drifts silently break parsing pipelines, leading to lost alerts and operational toil. We propose \textbf{KELP} (\textbf{K}elp \textbf{E}volutionary \textbf{L}og \textbf{P}arser), a high-throughput parser built on a novel data structure: the Evolutionary Grouping Tree. Unlike heuristic approaches that rely on fixed rules, KELP treats template discovery as a continuous online clustering process. As logs arrive, the tree structure evolves, nodes split, merge, and re-evaluate roots based on changing frequency distributions. Validating this adaptability requires a dataset that models realistic production complexity, yet we identify that standard benchmarks rely on static, regex-based ground truths that fail to reflect this. To enable rigorous evaluation, we introduce a new benchmark designed to reflect the structural ambiguity of modern production systems. Our evaluation demonstrates that KELP maintains high accuracy on this rigorous dataset where traditional heuristic methods fail, without compromising throughput. Our code and dataset can be found at codeberg.org/stonebucklabs/kelp
title KELP: Robust Online Log Parsing Through Evolutionary Grouping Trees
topic Databases
Software Engineering
url https://arxiv.org/abs/2601.00633