Staff View: KELP: Robust Online Log Parsing Through Evolutionary Grouping Trees

Bibliographic Details
Main Authors:	Singh, Satyam; Ramachandran, Sai Niranjan
Format:	Preprint
Published:	2026
Subjects:	Databases; Software Engineering
Online Access:	https://arxiv.org/abs/2601.00633

_version_	1866911350684385280
author	Singh, Satyam; Ramachandran, Sai Niranjan
author_facet	Singh, Satyam; Ramachandran, Sai Niranjan
contents	Real-time log analysis is the cornerstone of observability for modern infrastructure. However, existing online parsers are architecturally unsuited for the dynamism of production environments. Built on fundamentally static template models, they are dangerously brittle: minor schema drifts silently break parsing pipelines, leading to lost alerts and operational toil. We propose KELP (Kelp Evolutionary Log Parser), a high-throughput parser built on a novel data structure: the Evolutionary Grouping Tree. Unlike heuristic approaches that rely on fixed rules, KELP treats template discovery as a continuous online clustering process. As logs arrive, the tree structure evolves: nodes split, merge, and re-evaluate roots based on changing frequency distributions. Validating this adaptability requires a dataset that models realistic production complexity, yet we identify that standard benchmarks rely on static, regex-based ground truths that fail to reflect this. To enable rigorous evaluation, we introduce a new benchmark designed to reflect the structural ambiguity of modern production systems. Our evaluation demonstrates that KELP maintains high accuracy on this rigorous dataset where traditional heuristic methods fail, without compromising throughput. Our code and dataset can be found at codeberg.org/stonebucklabs/kelp
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_00633
institution	arXiv
publishDate	2026
record_format	arxiv
title	KELP: Robust Online Log Parsing Through Evolutionary Grouping Trees
topic	Databases; Software Engineering
url	https://arxiv.org/abs/2601.00633
