Saved in:
Bibliographic Details
Main Authors: Sonoda, Sho, Akiyama, Shunta, Uezato, Yuya
Format: Preprint
Published: 2026
Subjects:
Online Access:https://arxiv.org/abs/2602.10512
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866909023160238080
author Sonoda, Sho
Akiyama, Shunta
Uezato, Yuya
author_facet Sonoda, Sho
Akiyama, Shunta
Uezato, Yuya
contents Agentic theorem provers often introduce intermediate lemmas, proof sketches, or subgoal decompositions before returning to tactic-level search. This can look like an expensive detour: if proving lemmas is itself hard, why should a learned prover spend effort there? We give a statistical learning answer. Instead of worst-case proof complexity over all formulas, we study the biased data distribution produced by a teacher prover: initial theorem states together with successful verified proof traces. We model proof search as a deterministic finite-horizon MDP and analyze offline imitation learning from those traces. The success bounds depend on the average length of teacher proofs, how predictable the teacher's next action is, and how accurately the student learns that local prediction problem. A flat student learns from fully inlined traces, so repeated subproofs appear many times in its training and test-time certificate. A hierarchical student instead predicts a reusable proof DAG and solves each shared block once. When flattening duplicates the same hard local argument exponentially many times, the sufficient-sample certificate produced by our bounds can be exponentially smaller for the hierarchical learner. This gives a concrete statistical mechanism by which reusable proof structure helps verifier-based theorem proving.
format Preprint
id arxiv_https___arxiv_org_abs_2602_10512
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle Exponential Sample Complexity Separation between Flat and Hierarchical Agentic Theorem Provers
Sonoda, Sho
Akiyama, Shunta
Uezato, Yuya
Machine Learning
Logic in Computer Science
Agentic theorem provers often introduce intermediate lemmas, proof sketches, or subgoal decompositions before returning to tactic-level search. This can look like an expensive detour: if proving lemmas is itself hard, why should a learned prover spend effort there? We give a statistical learning answer. Instead of worst-case proof complexity over all formulas, we study the biased data distribution produced by a teacher prover: initial theorem states together with successful verified proof traces. We model proof search as a deterministic finite-horizon MDP and analyze offline imitation learning from those traces. The success bounds depend on the average length of teacher proofs, how predictable the teacher's next action is, and how accurately the student learns that local prediction problem. A flat student learns from fully inlined traces, so repeated subproofs appear many times in its training and test-time certificate. A hierarchical student instead predicts a reusable proof DAG and solves each shared block once. When flattening duplicates the same hard local argument exponentially many times, the sufficient-sample certificate produced by our bounds can be exponentially smaller for the hierarchical learner. This gives a concrete statistical mechanism by which reusable proof structure helps verifier-based theorem proving.
title Exponential Sample Complexity Separation between Flat and Hierarchical Agentic Theorem Provers
topic Machine Learning
Logic in Computer Science
url https://arxiv.org/abs/2602.10512