| Main authors: | , |
|---|---|
| Format: | Digital resource |
| Language: | English |
| Published: | Zenodo, 2026 |
| Subjects: | |
| Links: | https://doi.org/10.5281/zenodo.18760198 |
Table of contents:
- <p>Recent work by Radhakrishnan et al. (Science, 2026; doi:10.1126/science.aea6792) demonstrated that large language models encode abstract concepts — from persona types to normative dispositions — as identifiable and steerable directions in latent representation space. Their Recursive Feature Machine (RFM) framework successfully extracted and modulated over 500 concept directions across multiple categories.</p> <p>The present work addresses a complementary structural question: why do these particular concept directions form and stabilize as structured features of latent space geometry?</p> <p>We propose a structural hypothesis: concept directions arise as compression equilibria of persistent representational tensions embedded in human-generated training data. These tensions — recurring structural oppositions such as normative dichotomies, authority-alignment patterns, and identity-stabilization dynamics — are not incidental biases. They are conditionally persistent outcomes of shared data-generating conditions. During training, predictive loss minimization under encoding cost constraints produces stabilized basin-like geometries. 
The global configuration of these equilibria constitutes what we term the Phase Potential Landscape, which is governed not by physical energy but by encoding cost gradients within representational compression dynamics.</p> <p>The hypothesis yields four empirically testable predictions:</p> <p>Differential Perturbation Stability: Directions rooted in long-standing representational tensions form basins with steeper curvature, resulting in non-linear resistance to sustained steering perturbations.</p> <p>Cross-Model Directional Recurrence: Models trained independently on overlapping corpora should exhibit structurally similar latent geometries, reflecting shared generative constraints rather than architectural identity.</p> <p>Coupled Directional Dynamics: Deeply stabilized concept basins may share topological boundary regions, such that perturbing one direction induces correlated shifts in adjacent directions.</p> <p>Structural Persistence Across Architectures: Underlying directional stabilization patterns should survive architectural variation, reflecting invariant data-level tensions rather than parameterization artifacts.</p> <p>Repository contents:</p> <p>• Phase_Potential_V1_Main.pdf: Structural hypothesis and qualitative predictions, serving as the primary self-contained reference.</p> <p>• Phase_Potential_V1_Geometric.pdf: Topological framing of basin stabilization, emphasizing encoding cost gradients and landscape curvature as formation mechanisms.</p> <p>• Phase_Potential_V1_Defensive.pdf: Defensive academic structure with formal extension notes, connecting the hypothesis to optimization dynamics and stability analysis.</p> <p>• Formal_Foundations_of_the_Phase_Potential_Landscape.pdf: Mathematical framework introducing information geometry, renormalization group (RG) flow, and logarithmic cost scaling. Published for timestamp purposes; refinement is ongoing.</p> <p>The main document is designed to be self-contained. 
Additional perspectives provide structured entry points for readers from different analytical backgrounds. The formal foundations document offers preliminary mathematical development for future work.</p> <p>Developed through multi-model collaborative refinement.</p> <p>Related Identifiers</p> <p>References: doi:10.1126/science.aea6792 (Radhakrishnan et al., "Mapping and Manipulating Concepts in Large Language Models," Science, 2026)</p> <p>References: doi:10.5281/zenodo.18745759 (Companion technical document — phase potential modeling)</p> <p>License</p> <p>Creative Commons Attribution 4.0 International (CC-BY-4.0)</p>
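
The Differential Perturbation Stability prediction in the abstract above can be illustrated with a toy one-dimensional basin. This is only a sketch under stated assumptions and is not taken from the deposited documents: the potential V(x) = curvature * x^2 / 2 + quartic * x^4 / 4, the constant steering force, and the function name `equilibrium_shift` are all hypothetical choices made for illustration. The sketch finds the displacement at which the basin's restoring force balances the steering force, so that a steeper basin can be compared with a shallower one.

```python
def equilibrium_shift(force, curvature, quartic=1.0, tol=1e-10):
    """Displacement x >= 0 at which curvature*x + quartic*x**3 equals `force`.

    Toy model (an assumption, not the record's formalism): a concept
    direction sits at x = 0 in a basin, and `force` is a constant steering
    perturbation pushing it away. Requires force >= 0, curvature > 0 and
    quartic >= 0.
    """
    lo = 0.0
    # At x = force / curvature the linear term alone already balances the
    # force, so the root of the full restoring force lies in [0, hi].
    hi = force / curvature
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if curvature * mid + quartic * mid ** 3 < force:
            lo = mid  # restoring force still too weak: root is further out
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    shallow = equilibrium_shift(force=1.0, curvature=0.5)
    steep = equilibrium_shift(force=1.0, curvature=5.0)
    print(f"shallow basin shift: {shallow:.4f}")
    print(f"steep basin shift:   {steep:.4f}")
```

In this toy setting the steeper basin moves less under the same force, and because of the quartic term, doubling the force less than doubles the displacement. Whether real concept directions behave this way is exactly what the prediction leaves open to empirical testing.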