Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Sonoda, Sho, Akiyama, Shunta, Uezato, Yuya
Format:	Preprint
Published:	2026
Subjects:	Machine Learning Logic in Computer Science
Online Access:	https://arxiv.org/abs/2602.10512
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866909023160238080
author	Sonoda, Sho Akiyama, Shunta Uezato, Yuya
author_facet	Sonoda, Sho Akiyama, Shunta Uezato, Yuya
contents	Agentic theorem provers often introduce intermediate lemmas, proof sketches, or subgoal decompositions before returning to tactic-level search. This can look like an expensive detour: if proving lemmas is itself hard, why should a learned prover spend effort there? We give a statistical learning answer. Instead of worst-case proof complexity over all formulas, we study the biased data distribution produced by a teacher prover: initial theorem states together with successful verified proof traces. We model proof search as a deterministic finite-horizon MDP and analyze offline imitation learning from those traces. The success bounds depend on the average length of teacher proofs, how predictable the teacher's next action is, and how accurately the student learns that local prediction problem. A flat student learns from fully inlined traces, so repeated subproofs appear many times in its training and test-time certificate. A hierarchical student instead predicts a reusable proof DAG and solves each shared block once. When flattening duplicates the same hard local argument exponentially many times, the sufficient-sample certificate produced by our bounds can be exponentially smaller for the hierarchical learner. This gives a concrete statistical mechanism by which reusable proof structure helps verifier-based theorem proving.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_10512
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	Exponential Sample Complexity Separation between Flat and Hierarchical Agentic Theorem Provers Sonoda, Sho Akiyama, Shunta Uezato, Yuya Machine Learning Logic in Computer Science Agentic theorem provers often introduce intermediate lemmas, proof sketches, or subgoal decompositions before returning to tactic-level search. This can look like an expensive detour: if proving lemmas is itself hard, why should a learned prover spend effort there? We give a statistical learning answer. Instead of worst-case proof complexity over all formulas, we study the biased data distribution produced by a teacher prover: initial theorem states together with successful verified proof traces. We model proof search as a deterministic finite-horizon MDP and analyze offline imitation learning from those traces. The success bounds depend on the average length of teacher proofs, how predictable the teacher's next action is, and how accurately the student learns that local prediction problem. A flat student learns from fully inlined traces, so repeated subproofs appear many times in its training and test-time certificate. A hierarchical student instead predicts a reusable proof DAG and solves each shared block once. When flattening duplicates the same hard local argument exponentially many times, the sufficient-sample certificate produced by our bounds can be exponentially smaller for the hierarchical learner. This gives a concrete statistical mechanism by which reusable proof structure helps verifier-based theorem proving.
title	Exponential Sample Complexity Separation between Flat and Hierarchical Agentic Theorem Provers
topic	Machine Learning Logic in Computer Science
url	https://arxiv.org/abs/2602.10512

Similar Items