Staff View :: Library Catalog

Bibliographic Details
Main Author:	Li, Y.Y.N.
Format:	Digital resource
Language:
Published:	Zenodo 2026
Online Access:	https://doi.org/10.5281/zenodo.18930213

_version_	1866901175491624960
author	Li, Y.Y.N.
author_facet	Li, Y.Y.N.
contents	<p>Catastrophic forgetting arises when gradient updates for a new task overwrite parameter directions critical to a previously learned task. We argue that the information field tensor Gamma_info, a curvature object derived from the entropy functional of the model's predictive distribution [Li 2026], provides a geometry-informed signal for continual learning: directions in the approximate null space of Gamma_info are information-neutral and potentially safe to update.</p> <p>We instantiate this view through an audit-gated gradient projection family. Rather than claiming exact full-parameter null-space recovery, we use Gamma cross-batch reproducibility audits: each parameter's Gamma_info sub-block is estimated on two disjoint batch halves, and only parameters whose near-null eigenspaces align across both halves (Criterion A > 0.5, passing in >= 2 independent subsets) enter gradient projection. A matched random-direction control (same support indices and subspace rank) isolates whether the audit-identified direction, rather than the projection operation alone, is the source of any forgetting benefit.</p> <p>In a cross-domain continual learning experiment (GPT-2, WikiText-2 -> Biomedical Medical QA), the audit admits 41-42 of 42 candidate parameters across 5 random seeds, indicating robust null-space structure throughout GPT-2 layers h.6-h.11. Audit-gated null projection (gamma_along) significantly reduces forgetting versus unconstrained fine-tuning (+331 +/- 30 vs. +414 +/- 45, Delta = -83, p < 0.05, 5 seeds), while preserving Task B perplexity (9.55 vs. 9.54 for the unconstrained "free" run). The direction effect has the expected sign: gamma_along < gamma_random in forgetting (Delta = -38), which supports the geometric claim that audit-identified null directions, not merely projection, reduce forgetting.</p>
format	Digital resource
id	zenodo_https___doi_org_10_5281_zenodo_18930213
institution	Zenodo
language
publishDate	2026
publisher	Zenodo
record_format	zenodo
title	Audit-Gated Gradient Projection from Information-Curvature Index Subspaces for Continual Learning
url	https://doi.org/10.5281/zenodo.18930213
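
The audit-and-project procedure summarized in the contents field can be illustrated with a small NumPy sketch. This is not the author's implementation. The record does not say how Gamma_info is computed or how Criterion A is defined, so the sketch assumes a symmetric matrix per parameter block and uses the mean squared cosine of the principal angles between the two near-null eigenspaces as a stand-in alignment score. All function names are hypothetical, and the "passing in >= 2 independent subsets" rule and the matched support indices are left out for brevity.

```python
import numpy as np


def near_null_basis(gamma, rank):
    """Orthonormal basis of the `rank` eigen-directions with smallest |eigenvalue|."""
    eigvals, eigvecs = np.linalg.eigh(gamma)  # gamma is assumed symmetric
    order = np.argsort(np.abs(eigvals))[:rank]
    return eigvecs[:, order]


def subspace_alignment(u, v):
    """Mean squared cosine of principal angles: 1.0 = same subspace, 0.0 = orthogonal.

    Assumed stand-in for the record's "Criterion A", which is not defined there.
    """
    return float(np.linalg.norm(u.T @ v, "fro") ** 2 / u.shape[1])


def audit_gate(gamma_half_a, gamma_half_b, rank, threshold=0.5):
    """Compare near-null eigenspaces estimated on two disjoint batch halves."""
    basis_a = near_null_basis(gamma_half_a, rank)
    basis_b = near_null_basis(gamma_half_b, rank)
    score = subspace_alignment(basis_a, basis_b)
    return score > threshold, basis_a, score


def project_onto(grad, basis):
    """Keep only the gradient component inside span(basis)."""
    return basis @ (basis.T @ grad)


def random_basis(dim, rank, rng):
    """Rank-matched random orthonormal basis, used as the control direction."""
    q, _ = np.linalg.qr(rng.standard_normal((dim, rank)))
    return q


# Toy data: two noisy estimates of a curvature matrix with a 2-dimensional null space.
rng = np.random.default_rng(0)
dim, rank = 6, 2
base = np.diag([0.0, 0.0, 1.0, 2.0, 3.0, 4.0])


def noisy_estimate():
    noise = 1e-4 * rng.standard_normal((dim, dim))
    return base + (noise + noise.T) / 2  # symmetrize so eigh applies


passed, basis, score = audit_gate(noisy_estimate(), noisy_estimate(), rank)
grad = rng.standard_normal(dim)
ctrl_basis = random_basis(dim, rank, rng)
g_along = project_onto(grad, basis) if passed else grad  # audit-gated update
g_random = project_onto(grad, ctrl_basis)                # matched random control
```

In this toy setting the two estimates share a null space, so the gate passes and `g_along` has almost no component along the high-curvature directions of `base`. `g_random` has the same rank but no such guarantee, which is the contrast the matched control is meant to expose.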