Bibliographic Details
Main Authors: Matsumoto, Shion, Castillo, Raul, Prada, Benjamin, Mali, Ankur Arjun
Format: Preprint
Published: 2026
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2603.03405
author Matsumoto, Shion
Castillo, Raul
Prada, Benjamin
Mali, Ankur Arjun
contents The forward and reverse Kullback-Leibler (KL) divergences arise as limiting objectives in learning and inference, yet they induce markedly different inductive biases that cannot be explained at the level of expectations alone. In this work, we introduce the Surprisal-Rényi Free Energy (SRFE), a log-moment-based functional of the likelihood ratio that lies outside the class of $f$-divergences. We show that SRFE recovers the forward and reverse KL divergences as singular endpoint limits, and we derive local expansions around both limits in which the variance of the log-likelihood ratio appears as a first-order correction. This reveals an explicit mean-variance tradeoff governing departures from KL-dominated regimes. We further establish a Gibbs-type variational characterization of SRFE as the unique minimizer of a weighted sum of KL divergences, and we prove that SRFE directly controls large deviations of excess code-length via Chernoff-type bounds, yielding a precise Minimum Description Length interpretation. Together, these results identify SRFE as a variance- and tail-sensitive free-energy functional that clarifies the geometric and large-deviation structure underlying the forward and reverse KL limits, without unifying or subsuming distinct learning frameworks.
format Preprint
id arxiv_https___arxiv_org_abs_2603_03405
institution arXiv
publishDate 2026
record_format arxiv
title Surprisal-Rényi Free Energy
topic Machine Learning
url https://arxiv.org/abs/2603.03405