Staff View: Iterative Forgetting: Online Data Stream Regression Using Database-Inspired Adaptive Granulation :: Library Catalog

Bibliographic Details
Main Authors:	Kathiriya, Niket; Haeri, Hossein; Chen, Cindy; Jerath, Kshitij
Format:	Preprint
Published:	2024
Subjects:	Machine Learning Databases
Online Access:	https://arxiv.org/abs/2403.09588

_version_	1866913265084268544
author	Kathiriya, Niket; Haeri, Hossein; Chen, Cindy; Jerath, Kshitij
author_facet	Kathiriya, Niket; Haeri, Hossein; Chen, Cindy; Jerath, Kshitij
contents	Many modern systems, such as financial, transportation, and telecommunications systems, are time-sensitive in the sense that they demand low-latency predictions for real-time decision-making. Such systems often have to contend with continuous unbounded data streams as well as concept drift, which are challenging requirements that traditional regression techniques are unable to cater to. There exists a need to create novel data stream regression methods that can handle these scenarios. We present a database-inspired datastream regression model that (a) uses inspiration from R-trees to create granules from incoming datastreams such that relevant information is retained, (b) iteratively forgets granules whose information is deemed to be outdated, thus maintaining a list of only recent, relevant granules, and (c) uses the recent data and granules to provide low-latency predictions. The R-tree-inspired approach also makes the algorithm amenable to integration with database systems. Our experiments demonstrate that the ability of this method to discard data produces a significant order-of-magnitude improvement in latency and training time when evaluated against the most accurate state-of-the-art algorithms, while the R*-tree-inspired granulation technique provides competitively accurate predictions.
format	Preprint
id	arxiv_https___arxiv_org_abs_2403_09588
institution	arXiv
publishDate	2024
record_format	arxiv
title	Iterative Forgetting: Online Data Stream Regression Using Database-Inspired Adaptive Granulation
topic	Machine Learning Databases
url	https://arxiv.org/abs/2403.09588
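
The three steps named in the contents field (granulate, forget, predict) can be illustrated with a toy sketch. This is not the authors' algorithm: the record does not describe the granulation or forgetting rules, so the sketch swaps in a nearest-centre rule with a fixed radius in place of R-tree insertion and a simple age cutoff in place of the paper's forgetting criterion. The class name `ForgettingGranuleRegressor` and the parameters `radius` and `max_age` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Granule:
    lo: list        # lower corner of the bounding box
    hi: list        # upper corner of the bounding box
    n: int          # number of points absorbed so far
    y_mean: float   # running mean of the absorbed targets
    last_seen: int  # time step of the most recent update

    def center(self):
        return [(a + b) / 2 for a, b in zip(self.lo, self.hi)]


class ForgettingGranuleRegressor:
    """Toy online regressor: box-shaped granules plus age-based forgetting.

    Assumption: a fixed `radius` decides whether a point joins the nearest
    granule, and granules idle for more than `max_age` steps are dropped.
    """

    def __init__(self, radius=1.0, max_age=100):
        self.radius = radius
        self.max_age = max_age
        self.granules = []
        self.t = 0

    def _nearest(self, x):
        if not self.granules:
            return None
        return min(self.granules, key=lambda g: math.dist(g.center(), x))

    def learn_one(self, x, y):
        self.t += 1
        g = self._nearest(x)
        if g is not None and math.dist(g.center(), x) <= self.radius:
            # (a) Granulate: grow the box and update its summary statistics.
            g.lo = [min(a, b) for a, b in zip(g.lo, x)]
            g.hi = [max(a, b) for a, b in zip(g.hi, x)]
            g.n += 1
            g.y_mean += (y - g.y_mean) / g.n
            g.last_seen = self.t
        else:
            self.granules.append(Granule(list(x), list(x), 1, y, self.t))
        # (b) Forget: keep only granules that were updated recently.
        self.granules = [
            g for g in self.granules if self.t - g.last_seen <= self.max_age
        ]

    def predict_one(self, x, default=0.0):
        # (c) Predict from the nearest surviving granule.
        g = self._nearest(x)
        return default if g is None else g.y_mean


model = ForgettingGranuleRegressor(radius=1.0, max_age=3)
model.learn_one([0.0, 0.0], 1.0)
model.learn_one([0.5, 0.0], 3.0)
print(model.predict_one([0.2, 0.0]))
```

Because each granule stores only a bounding box and a running mean, memory and prediction cost in this sketch scale with the number of live granules rather than with the length of the stream, which is the general intuition behind discarding outdated data.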

Similar Items