| Main authors: | Prada, Benjamin; Matsumoto, Shion; Zekri, Abdul Malik; Mali, Ankur |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Machine Learning |
| Online access: | https://arxiv.org/abs/2505.14635 |
| _version_ | 1866916845888471040 |
|---|---|
| author | Prada, Benjamin; Matsumoto, Shion; Zekri, Abdul Malik; Mali, Ankur |
| contents | We present the first theoretical framework that connects predictive coding (PC), a biologically inspired local learning rule, with the minimum description length (MDL) principle in deep networks. We prove that layerwise PC performs block-coordinate descent on the MDL two-part code objective, thereby jointly minimizing empirical risk and model complexity. Using Hoeffding's inequality and a prefix-code prior, we derive a novel generalization bound of the form $R(\theta) \le \hat{R}(\theta) + \frac{L(\theta)}{N}$, capturing the tradeoff between fit and compression. We further prove that each PC sweep monotonically decreases the empirical two-part codelength, yielding tighter high-probability risk bounds than unconstrained gradient descent. Finally, we show that repeated PC updates converge to a block-coordinate stationary point, providing an approximate MDL-optimal solution. To our knowledge, this is the first result offering formal generalization and convergence guarantees for PC-trained deep models, positioning PC as a theoretically grounded and biologically plausible alternative to backpropagation. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2505_14635 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | Bridging Predictive Coding and MDL: A Two-Part Code Framework for Deep Learning |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2505.14635 |