Enregistré dans:
| Auteurs principaux: | , |
|---|---|
| Format: | Preprint |
| Publié: |
2025
|
| Sujets: | |
| Accès en ligne: | https://arxiv.org/abs/2501.13813 |
| Tags: |
Ajouter un tag
Pas de tags, Soyez le premier à ajouter un tag!
|
| _version_ | 1866929684999045120 |
|---|---|
| author | Bilyk, Dmitriy Steinerberger, Stefan |
| author_facet | Bilyk, Dmitriy Steinerberger, Stefan |
| contents | It is well understood that if one is given a set $X \subset [0,1]$ of $n$ independent uniformly distributed random variables, then $$ \sup_{0 \leq x \leq 1} \left| \frac{\# X \cap [0,x]}{\# X} - x \right| \lesssim \frac{\sqrt{\log{n}}}{ \sqrt{n}} \qquad \mbox{with very high probability.} $$ We show that one can improve the error term by removing a few of the points. For any $m \leq 0.001n$ there exists a subset $Y \subset X$ obtained by deleting at most $m$ points, so that the error term drops from $\sim \sqrt{\log{n}}/\sqrt{n}$ to $ \log{(n)}/m$ with high probability. When $m=cn$ for a small $0 \leq c \leq 0.001$, this achieves the essentially optimal asymptotic order of discrepancy $\log(n)/n$. The proof is constructive and works in an online setting (where one is given the points sequentially, one at a time, and has to decide whether to keep or discard it). A change of variables shows the same result for any random variables on the real line with absolutely continuous density. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2501_13813 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| spellingShingle | Regularizing random points by deleting a few Bilyk, Dmitriy Steinerberger, Stefan Probability It is well understood that if one is given a set $X \subset [0,1]$ of $n$ independent uniformly distributed random variables, then $$ \sup_{0 \leq x \leq 1} \left| \frac{\# X \cap [0,x]}{\# X} - x \right| \lesssim \frac{\sqrt{\log{n}}}{ \sqrt{n}} \qquad \mbox{with very high probability.} $$ We show that one can improve the error term by removing a few of the points. For any $m \leq 0.001n$ there exists a subset $Y \subset X$ obtained by deleting at most $m$ points, so that the error term drops from $\sim \sqrt{\log{n}}/\sqrt{n}$ to $ \log{(n)}/m$ with high probability. When $m=cn$ for a small $0 \leq c \leq 0.001$, this achieves the essentially optimal asymptotic order of discrepancy $\log(n)/n$. The proof is constructive and works in an online setting (where one is given the points sequentially, one at a time, and has to decide whether to keep or discard it). A change of variables shows the same result for any random variables on the real line with absolutely continuous density. |
| title | Regularizing random points by deleting a few |
| topic | Probability |
| url | https://arxiv.org/abs/2501.13813 |