Saved in:
Bibliographic Details
Main Authors: Yu, Cong, Shi, Tuo, Weidlich, Matthias, Zhao, Bo
Format: Preprint
Published: 2025
Subjects:
Online Access:https://arxiv.org/abs/2507.04872
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866917065284124672
author Yu, Cong
Shi, Tuo
Weidlich, Matthias
Zhao, Bo
author_facet Yu, Cong
Shi, Tuo
Weidlich, Matthias
Zhao, Bo
contents The detection of sequential patterns in data is a basic functionality of modern data processing systems for complex event processing (CEP), OLAP, and retrieval-augmented generation (RAG). In practice, pattern matching is challenging, since common applications rely on a large set of patterns that shall be evaluated with tight latency bounds. At the same time, matching needs to maintain state, i.e., intermediate results, that grows exponentially in the input size. Hence, systems turn to best-effort processing, striving for maximal recall under a latency bound. Existing techniques, however, consider each pattern in isolation, neglecting the optimization potential induced by state sharing in pattern matching. In this paper, we present SHARP, a library that employs state reduction to achieve efficient best-effort pattern matching. To this end, SHARP incorporates state sharing between patterns through a new abstraction, coined pattern-sharing degree (PSD). At runtime, this abstraction facilitates the categorization and indexing of partial pattern matches. Based thereon, once a latency bound is exceeded, SHARP realizes best-effort processing by selecting a subset of partial matches for further processing in constant time. In experiments with real-world data, SHARP achieves a recall of 97%, 96% and 73% for pattern matching in CEP, OLAP, and RAG applications, under a bound of 50% of the average processing latency.
format Preprint
id arxiv_https___arxiv_org_abs_2507_04872
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle SHARP: Shared State Reduction for Efficient Matching of Sequential Patterns
Yu, Cong
Shi, Tuo
Weidlich, Matthias
Zhao, Bo
Databases
The detection of sequential patterns in data is a basic functionality of modern data processing systems for complex event processing (CEP), OLAP, and retrieval-augmented generation (RAG). In practice, pattern matching is challenging, since common applications rely on a large set of patterns that shall be evaluated with tight latency bounds. At the same time, matching needs to maintain state, i.e., intermediate results, that grows exponentially in the input size. Hence, systems turn to best-effort processing, striving for maximal recall under a latency bound. Existing techniques, however, consider each pattern in isolation, neglecting the optimization potential induced by state sharing in pattern matching. In this paper, we present SHARP, a library that employs state reduction to achieve efficient best-effort pattern matching. To this end, SHARP incorporates state sharing between patterns through a new abstraction, coined pattern-sharing degree (PSD). At runtime, this abstraction facilitates the categorization and indexing of partial pattern matches. Based thereon, once a latency bound is exceeded, SHARP realizes best-effort processing by selecting a subset of partial matches for further processing in constant time. In experiments with real-world data, SHARP achieves a recall of 97%, 96% and 73% for pattern matching in CEP, OLAP, and RAG applications, under a bound of 50% of the average processing latency.
title SHARP: Shared State Reduction for Efficient Matching of Sequential Patterns
topic Databases
url https://arxiv.org/abs/2507.04872