Bibliographic Details
Main Authors: Lei, Ge; Cooper, Samuel J.
Format: Preprint
Published: 2025
Subjects: Computation and Language; Artificial Intelligence; Machine Learning
Online Access: https://arxiv.org/abs/2502.10871
_version_ 1866912491923046400
author Lei, Ge
Cooper, Samuel J.
author_facet Lei, Ge
Cooper, Samuel J.
contents This study explores how large language models (LLMs) encode interwoven scientific knowledge, using chemical elements and LLaMA-series models as a case study. We identify a 3D spiral structure in the hidden states that aligns with the conceptual structure of the periodic table, suggesting that LLMs can reflect the geometric organization of scientific concepts learned from text. Linear probing reveals that middle layers encode continuous, overlapping attributes that enable indirect recall, while deeper layers sharpen categorical distinctions and incorporate linguistic context. These findings suggest that LLMs represent symbolic knowledge not as isolated facts, but as structured geometric manifolds that intertwine semantic information across layers. We hope this work inspires further exploration of how LLMs represent and reason about scientific knowledge, particularly in domains such as materials science.
format Preprint
id arxiv_https___arxiv_org_abs_2502_10871
institution arXiv
publishDate 2025
record_format arxiv
title Layerwise Recall and the Geometry of Interwoven Knowledge in LLMs
topic Computation and Language
Artificial Intelligence
Machine Learning
url https://arxiv.org/abs/2502.10871