Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Shao, Jui-Hsiang, Wang, Hsin-Po
Format:	Preprint
Published:	2025
Subjects:	Information Theory
Online Access:	https://arxiv.org/abs/2507.00915
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866912569919275008
author	Shao, Jui-Hsiang Wang, Hsin-Po
author_facet	Shao, Jui-Hsiang Wang, Hsin-Po
contents	Simulating an arbitrary discrete distribution $D \in [0, 1]^n$ using fair coin tosses incurs trade-offs between entropy complexity and space and time complexity. Shannon's theory suggests that $H(D)$ tosses are necessary and sufficient, but does not guarantee exact distribution. Knuth and Yao showed that a decision tree consumes fewer than $H(D) + 2$ tosses for one exact sample. Draper and Saad's recent work addresses the space and time aspect, showing that $H(D) + 2$ tosses, $O(n \log(n) \log(m))$ memory, and $O(H(D))$ operations are all it costs, where $m$ is the common denominator of the probability masses in $D$ and $n$ is the number of possible outcomes. In this paper, MichelangeRoll recycles leftover entropy to break the "$+2$" barrier. With $O((n + 1/\varepsilon) \log(m/\varepsilon))$ memory, the entropy cost of generating a ongoing sequence of $D$ is reduced to $H(D) + \varepsilon$ per sample.
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_00915
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	MichelangeRoll: Sculpting Rational Distributions Exactly and Efficiently Shao, Jui-Hsiang Wang, Hsin-Po Information Theory Simulating an arbitrary discrete distribution $D \in [0, 1]^n$ using fair coin tosses incurs trade-offs between entropy complexity and space and time complexity. Shannon's theory suggests that $H(D)$ tosses are necessary and sufficient, but does not guarantee exact distribution. Knuth and Yao showed that a decision tree consumes fewer than $H(D) + 2$ tosses for one exact sample. Draper and Saad's recent work addresses the space and time aspect, showing that $H(D) + 2$ tosses, $O(n \log(n) \log(m))$ memory, and $O(H(D))$ operations are all it costs, where $m$ is the common denominator of the probability masses in $D$ and $n$ is the number of possible outcomes. In this paper, MichelangeRoll recycles leftover entropy to break the "$+2$" barrier. With $O((n + 1/\varepsilon) \log(m/\varepsilon))$ memory, the entropy cost of generating a ongoing sequence of $D$ is reduced to $H(D) + \varepsilon$ per sample.
title	MichelangeRoll: Sculpting Rational Distributions Exactly and Efficiently
topic	Information Theory
url	https://arxiv.org/abs/2507.00915

Similar Items