Staff View: Hierarchical Concept Geometry in Language Models Emerges from Word Co-occurrence :: Library Catalog

Bibliographic Details
Main Authors:	Nava, Andres; Wyart, Matthieu
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2605.23821

_version_	1866910248636252160
author	Nava, Andres; Wyart, Matthieu
author_facet	Nava, Andres; Wyart, Matthieu
contents	We propose a distributional theory of how hypernymy (the "is-a" relation between general and specific concepts) is encoded geometrically in language representations. Starting from the empirically verified assumption that words closer on the WordNet hypernym graph co-occur more often, we characterize theoretically the spectrum of the resulting embedding Gram matrix of word2vec embeddings. Under mild positivity and decay conditions on the co-occurrence kernel, we prove that the leading eigenvectors first separate broad taxonomic branches and then progressively finer sub-branches, producing a hierarchical splitting geometry with a coarse-to-fine spectral organization that mirrors the tree. We confirm these predictions in word2vec embeddings across many sampled WordNet subtrees, and show that the same signature extends strikingly well to Gemma 2B unembeddings. Our results indicate that hierarchical concept geometry in LLMs need not reflect a hierarchy-specific functional mechanism, but emerges from the spectral structure of pairwise word statistics.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_23821
institution	arXiv
publishDate	2026
record_format	arxiv
title	Hierarchical Concept Geometry in Language Models Emerges from Word Co-occurrence
topic	Computation and Language; Machine Learning
url	https://arxiv.org/abs/2605.23821
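
The coarse-to-fine spectral ordering described in the abstract can be illustrated with a toy sketch. This is not the authors' construction. It assumes a balanced binary tree with 8 leaves, uses the kernel itself as a stand-in Gram matrix, and picks a hypothetical exponential decay `rho ** distance` (with `rho = 0.5`) as one simple kernel that is positive and decays with tree distance. The function names are illustrative only.

```python
import numpy as np

def leaf_tree_distance(i, j):
    """Path length between leaves i and j of a balanced binary tree."""
    # Leaves are indexed left to right; halving an index moves to its parent.
    steps = 0
    while i != j:
        i //= 2
        j //= 2
        steps += 1
    return 2 * steps  # up to the common ancestor and back down

def toy_gram_matrix(depth=3, rho=0.5):
    """Assumed kernel: similarity decays as rho ** (tree distance)."""
    n = 2 ** depth
    dist = np.array([[leaf_tree_distance(i, j) for j in range(n)]
                     for i in range(n)])
    return rho ** dist

def sorted_spectrum(gram):
    """Eigenpairs of a symmetric matrix, largest eigenvalue first."""
    eigvals, eigvecs = np.linalg.eigh(gram)  # eigh returns ascending order
    return eigvals[::-1], eigvecs[:, ::-1]

gram = toy_gram_matrix()
eigvals, eigvecs = sorted_spectrum(gram)

# The first eigenvector is uniform over all leaves. The second one has one
# sign on leaves 0-3 and the opposite sign on leaves 4-7, i.e. it separates
# the two top-level branches.
branch_signs = np.sign(eigvecs[:, 1])

print(np.round(eigvals, 4))
print(branch_signs)
```

In this toy setting the eigenvalues come in groups of size 1, 1, 2 and 4: the uniform mode, the split between the two top-level branches, the splits inside each half, and the splits between sibling leaves. The larger eigenvalues belong to the coarser splits, which is the ordering the abstract describes for the real embeddings.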
