Staff View: :: Library Catalog


Bibliographic Details
Main Authors:	Prada, Benjamin; Matsumoto, Shion; Zekri, Abdul Malik; Mali, Ankur
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2505.14635

_version_	1866916845888471040
author	Prada, Benjamin; Matsumoto, Shion; Zekri, Abdul Malik; Mali, Ankur
author_facet	Prada, Benjamin; Matsumoto, Shion; Zekri, Abdul Malik; Mali, Ankur
contents	We present the first theoretical framework that connects predictive coding (PC), a biologically inspired local learning rule, with the minimum description length (MDL) principle in deep networks. We prove that layerwise PC performs block-coordinate descent on the MDL two-part code objective, thereby jointly minimizing empirical risk and model complexity. Using Hoeffding's inequality and a prefix-code prior, we derive a novel generalization bound of the form $R(θ) \le \hat{R}(θ) + \frac{L(θ)}{N}$, capturing the tradeoff between fit and compression. We further prove that each PC sweep monotonically decreases the empirical two-part codelength, yielding tighter high-probability risk bounds than unconstrained gradient descent. Finally, we show that repeated PC updates converge to a block-coordinate stationary point, providing an approximate MDL-optimal solution. To our knowledge, this is the first result offering formal generalization and convergence guarantees for PC-trained deep models, positioning PC as a theoretically grounded and biologically plausible alternative to backpropagation.
format	Preprint
id	arxiv_https___arxiv_org_abs_2505_14635
institution	arXiv
publishDate	2025
record_format	arxiv
title	Bridging Predictive Coding and MDL: A Two-Part Code Framework for Deep Learning
topic	Machine Learning
url	https://arxiv.org/abs/2505.14635
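
The abstract's claim that each block-wise sweep monotonically decreases a two-part objective (empirical risk plus a complexity term scaled by 1/N) can be illustrated with a toy sketch. This is not the paper's algorithm or its codelength: the two-weight model, the squared-norm stand-in for L(θ), the weight `lam`, and the function names are all assumptions chosen so that each block update has a closed form.

```python
# Toy illustration only: block-coordinate descent on
#   F(w1, w2) = (1/N) * sum((y - w2*w1*x)^2) + lam * (w1^2 + w2^2) / N
# The squared-norm "complexity" term is an assumed stand-in for a codelength.

def two_part_objective(w1, w2, xs, ys, lam):
    """Empirical risk plus complexity term divided by the sample size."""
    n = len(xs)
    risk = sum((y - w2 * w1 * x) ** 2 for x, y in zip(xs, ys)) / n
    complexity = lam * (w1 ** 2 + w2 ** 2)
    return risk + complexity / n


def block_coordinate_descent(xs, ys, lam=0.1, sweeps=20, w1=0.5, w2=0.5):
    """Alternate exact minimization over w1 (w2 fixed) and w2 (w1 fixed)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    history = [two_part_objective(w1, w2, xs, ys, lam)]
    for _ in range(sweeps):
        # Setting dF/dw1 = 0 with w2 held fixed gives a closed-form update.
        w1 = w2 * sxy / (w2 * w2 * sxx + lam)
        # Same for w2 with the new w1 held fixed.
        w2 = w1 * sxy / (w1 * w1 * sxx + lam)
        history.append(two_part_objective(w1, w2, xs, ys, lam))
    return w1, w2, history


# Hypothetical data lying roughly on the line y = 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 1.9, 4.2, 5.8, 8.1]
w1, w2, history = block_coordinate_descent(xs, ys)
```

Because each block update is an exact minimizer of the objective over that block, the recorded `history` cannot increase from one sweep to the next (up to floating-point rounding), which is the same mechanism the abstract invokes for layerwise updates.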

Similar Items