Bibliographic Details
Main authors: Prada, Benjamin, Matsumoto, Shion, Zekri, Abdul Malik, Mali, Ankur
Format: Preprint
Published: 2025
Subjects: Machine Learning
Online access: https://arxiv.org/abs/2505.14635
_version_ 1866916845888471040
author Prada, Benjamin
Matsumoto, Shion
Zekri, Abdul Malik
Mali, Ankur
author_facet Prada, Benjamin
Matsumoto, Shion
Zekri, Abdul Malik
Mali, Ankur
contents We present the first theoretical framework that connects predictive coding (PC), a biologically inspired local learning rule, with the minimum description length (MDL) principle in deep networks. We prove that layerwise PC performs block-coordinate descent on the MDL two-part code objective, thereby jointly minimizing empirical risk and model complexity. Using Hoeffding's inequality and a prefix-code prior, we derive a novel generalization bound of the form $R(\theta) \le \hat{R}(\theta) + \frac{L(\theta)}{N}$, capturing the tradeoff between fit and compression. We further prove that each PC sweep monotonically decreases the empirical two-part codelength, yielding tighter high-probability risk bounds than unconstrained gradient descent. Finally, we show that repeated PC updates converge to a block-coordinate stationary point, providing an approximate MDL-optimal solution. To our knowledge, this is the first result offering formal generalization and convergence guarantees for PC-trained deep models, positioning PC as a theoretically grounded and biologically plausible alternative to backpropagation.
format Preprint
id arxiv_https___arxiv_org_abs_2505_14635
institution arXiv
publishDate 2025
record_format arxiv
title Bridging Predictive Coding and MDL: A Two-Part Code Framework for Deep Learning
topic Machine Learning
url https://arxiv.org/abs/2505.14635