Enregistré dans:
Détails bibliographiques
Auteurs principaux: Bilyk, Dmitriy, Steinerberger, Stefan
Format: Preprint
Publié: 2025
Sujets:
Accès en ligne:https://arxiv.org/abs/2501.13813
Tags: Ajouter un tag
Pas de tags, Soyez le premier à ajouter un tag!
_version_ 1866929684999045120
author Bilyk, Dmitriy
Steinerberger, Stefan
author_facet Bilyk, Dmitriy
Steinerberger, Stefan
contents It is well understood that if one is given a set $X \subset [0,1]$ of $n$ independent uniformly distributed random variables, then $$ \sup_{0 \leq x \leq 1} \left| \frac{\# X \cap [0,x]}{\# X} - x \right| \lesssim \frac{\sqrt{\log{n}}}{ \sqrt{n}} \qquad \mbox{with very high probability.} $$ We show that one can improve the error term by removing a few of the points. For any $m \leq 0.001n$ there exists a subset $Y \subset X$ obtained by deleting at most $m$ points, so that the error term drops from $\sim \sqrt{\log{n}}/\sqrt{n}$ to $ \log{(n)}/m$ with high probability. When $m=cn$ for a small $0 \leq c \leq 0.001$, this achieves the essentially optimal asymptotic order of discrepancy $\log(n)/n$. The proof is constructive and works in an online setting (where one is given the points sequentially, one at a time, and has to decide whether to keep or discard it). A change of variables shows the same result for any random variables on the real line with absolutely continuous density.
format Preprint
id arxiv_https___arxiv_org_abs_2501_13813
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Regularizing random points by deleting a few
Bilyk, Dmitriy
Steinerberger, Stefan
Probability
It is well understood that if one is given a set $X \subset [0,1]$ of $n$ independent uniformly distributed random variables, then $$ \sup_{0 \leq x \leq 1} \left| \frac{\# X \cap [0,x]}{\# X} - x \right| \lesssim \frac{\sqrt{\log{n}}}{ \sqrt{n}} \qquad \mbox{with very high probability.} $$ We show that one can improve the error term by removing a few of the points. For any $m \leq 0.001n$ there exists a subset $Y \subset X$ obtained by deleting at most $m$ points, so that the error term drops from $\sim \sqrt{\log{n}}/\sqrt{n}$ to $ \log{(n)}/m$ with high probability. When $m=cn$ for a small $0 \leq c \leq 0.001$, this achieves the essentially optimal asymptotic order of discrepancy $\log(n)/n$. The proof is constructive and works in an online setting (where one is given the points sequentially, one at a time, and has to decide whether to keep or discard it). A change of variables shows the same result for any random variables on the real line with absolutely continuous density.
title Regularizing random points by deleting a few
topic Probability
url https://arxiv.org/abs/2501.13813