| Main Author: | |
|---|---|
| Format: | Digital resource |
| Language: | English |
| Published: | Zenodo 2026 |
| Subjects: | |
| Online Access: | https://doi.org/10.5281/zenodo.18975350 |
Table of Contents:
- This paper shows that the recurrent update used in linear attention models is mathematically equivalent to Bayesian inference under a Dependent Dirichlet Process. The forgetting rate, the input-dependent gate, and the multi-head structure each receive a clean probabilistic interpretation. The result is exact and holds by construction, not as an approximation.
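
For orientation, the kind of recurrent update the abstract refers to can be sketched as below. This is a generic gated linear-attention recurrence, assumed here purely for illustration: the function name `gated_linear_attention`, the scalar per-step gate, and the list-based shapes are hypothetical and not taken from the paper, and the sketch does not reproduce the paper's Dependent Dirichlet Process derivation.

```python
def gated_linear_attention(queries, keys, values, gates):
    """Generic single-head gated linear attention (illustrative assumption).

    State update:  S_t = g_t * S_{t-1} + k_t v_t^T
    Readout:       o_t = S_t^T q_t

    queries, keys: lists of length-d_k vectors
    values:        list of length-d_v vectors
    gates:         list of scalars in [0, 1]; g_t < 1 decays (forgets) old state
    """
    d_k, d_v = len(keys[0]), len(values[0])
    # The recurrent state is a d_k x d_v matrix, initialised to zero.
    state = [[0.0] * d_v for _ in range(d_k)]
    outputs = []
    for q, k, v, g in zip(queries, keys, values, gates):
        for i in range(d_k):
            for j in range(d_v):
                # Decay the old association, then add the new outer product.
                state[i][j] = g * state[i][j] + k[i] * v[j]
        # Read out by projecting the state onto the query.
        outputs.append(
            [sum(q[i] * state[i][j] for i in range(d_k)) for j in range(d_v)]
        )
    return outputs


# Two steps with the same key: the second output mixes the decayed first
# value with the new one.
demo = gated_linear_attention(
    queries=[[1.0, 0.0], [1.0, 0.0]],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[2.0], [4.0]],
    gates=[0.5, 0.5],
)
print(demo)
```

With a gate of 0.5 the demo prints `[[2.0], [5.0]]`: the second readout is half of the first stored value plus the new one. In this sketch a multi-head model would simply keep one such state per head; how each piece maps onto the probabilistic interpretation is the subject of the paper itself.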