Saved in:
Bibliographic Details
Main Authors: Li, Yuantong, Wang, Chi-hua, Cheng, Guang
Format: Preprint
Published: 2020
Subjects:
Online Access:https://arxiv.org/abs/2012.01668
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866911891303956480
author Li, Yuantong
Wang, Chi-hua
Cheng, Guang
author_facet Li, Yuantong
Wang, Chi-hua
Cheng, Guang
contents Motivated by the EU's "Right To Be Forgotten" regulation, we initiate a study of statistical data deletion problems where users' data are accessible only for a limited period of time. This setting is formulated as an online supervised learning task with \textit{constant memory limit}. We propose a deletion-aware algorithm \texttt{FIFD-OLS} for the low dimensional case, and witness a catastrophic rank swinging phenomenon due to the data deletion operation, which leads to statistical inefficiency. As a remedy, we propose the \texttt{FIFD-Adaptive Ridge} algorithm with a novel online regularization scheme, that effectively offsets the uncertainty from deletion. In theory, we provide the cumulative regret upper bound for both online forgetting algorithms. In the experiment, we showed \texttt{FIFD-Adaptive Ridge} outperforms the ridge regression algorithm with fixed regularization level, and hopefully sheds some light on more complex statistical models.
format Preprint
id arxiv_https___arxiv_org_abs_2012_01668
institution arXiv
publishDate 2020
record_format arxiv
spellingShingle Online Forgetting Process for Linear Regression Models
Li, Yuantong
Wang, Chi-hua
Cheng, Guang
Machine Learning
Motivated by the EU's "Right To Be Forgotten" regulation, we initiate a study of statistical data deletion problems where users' data are accessible only for a limited period of time. This setting is formulated as an online supervised learning task with \textit{constant memory limit}. We propose a deletion-aware algorithm \texttt{FIFD-OLS} for the low dimensional case, and witness a catastrophic rank swinging phenomenon due to the data deletion operation, which leads to statistical inefficiency. As a remedy, we propose the \texttt{FIFD-Adaptive Ridge} algorithm with a novel online regularization scheme, that effectively offsets the uncertainty from deletion. In theory, we provide the cumulative regret upper bound for both online forgetting algorithms. In the experiment, we showed \texttt{FIFD-Adaptive Ridge} outperforms the ridge regression algorithm with fixed regularization level, and hopefully sheds some light on more complex statistical models.
title Online Forgetting Process for Linear Regression Models
topic Machine Learning
url https://arxiv.org/abs/2012.01668