Saved in:
Bibliographic Details
Main Authors: Li, Xinze, Peng, Nanyun, Severini, Simone, Shafto, Patrick
Format: Preprint
Published: 2026
Subjects:
Online Access:https://arxiv.org/abs/2604.24797
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866911645663494144
author Li, Xinze
Peng, Nanyun
Severini, Simone
Shafto, Patrick
author_facet Li, Xinze
Peng, Nanyun
Severini, Simone
Shafto, Patrick
contents The ongoing development of Lean 4's Mathlib has produced a macroscopic structural complexity that interweaves logical, mathematical, and infrastructural dependencies. We present a network analysis of this library, extracting its dependency structure into a multilayer graph of 308,129 declarations, 8.4 million edges, and 7,563 modules. By introducing graph decompositions that isolate explicit edges from those synthesized by the compiler or driven by proofs, we quantify the structural properties of formalized mathematics. Our analysis reveals three findings. First, taxonomies designed by humans diverge from logical structures, exhibiting a 50.9% coupling across namespaces. Second, developers utilize a median of 1.6% of the imported scope. Third, formalization compresses semantic hierarchies, with network centrality capturing language infrastructure rather than mathematical relevance.
format Preprint
id arxiv_https___arxiv_org_abs_2604_24797
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle The Network Structure of Mathlib
Li, Xinze
Peng, Nanyun
Severini, Simone
Shafto, Patrick
Logic in Computer Science
Programming Languages
Social and Information Networks
History and Overview
68V15, 05C82, 03B35
The ongoing development of Lean 4's Mathlib has produced a macroscopic structural complexity that interweaves logical, mathematical, and infrastructural dependencies. We present a network analysis of this library, extracting its dependency structure into a multilayer graph of 308,129 declarations, 8.4 million edges, and 7,563 modules. By introducing graph decompositions that isolate explicit edges from those synthesized by the compiler or driven by proofs, we quantify the structural properties of formalized mathematics. Our analysis reveals three findings. First, taxonomies designed by humans diverge from logical structures, exhibiting a 50.9% coupling across namespaces. Second, developers utilize a median of 1.6% of the imported scope. Third, formalization compresses semantic hierarchies, with network centrality capturing language infrastructure rather than mathematical relevance.
title The Network Structure of Mathlib
topic Logic in Computer Science
Programming Languages
Social and Information Networks
History and Overview
68V15, 05C82, 03B35
url https://arxiv.org/abs/2604.24797