Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Lu, Yanzhen, Yang, Hanbin, Wang, Xiaodie, Zhang, Ge, Li, Biao, Fu, Chenxu, Li, Chao, Yuan, Yang, Yao, Andrew Chi-Chih
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2507.02541
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866909674829250560
author	Lu, Yanzhen Yang, Hanbin Wang, Xiaodie Zhang, Ge Li, Biao Fu, Chenxu Li, Chao Yuan, Yang Yao, Andrew Chi-Chih
author_facet	Lu, Yanzhen Yang, Hanbin Wang, Xiaodie Zhang, Ge Li, Biao Fu, Chenxu Li, Chao Yuan, Yang Yao, Andrew Chi-Chih
contents	In this work, we investigate whether improving task clarity can enhance reasoning ability of large language models, focusing on theorem proving in Coq. We introduce a concept-level metric to evaluate task clarity and show that adding structured semantic context to the standard input used by modern LLMs, leads to a 1.85$\times$ improvement in clarity score (44.5\%~$\rightarrow$~82.3\%). Using the general-purpose model \texttt{DeepSeek-V3}, our approach leads to a 2.1$\times$ improvement in proof success (21.8\%~$\rightarrow$~45.8\%) and outperforms the previous state-of-the-art \texttt{Graph2Tac} (33.2\%). We evaluate this on 1,386 theorems randomly sampled from 15 standard Coq packages, following the same evaluation protocol as \texttt{Graph2Tac}. Furthermore, fine-tuning smaller models on our structured data can achieve even higher performance (48.6\%). Our method uses selective concept unfolding to enrich task descriptions, and employs a Planner--Executor architecture. These findings highlight the value of structured task representations in bridging the gap between understanding and reasoning.
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_02541
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Clarifying Before Reasoning: A Coq Prover with Structural Context Lu, Yanzhen Yang, Hanbin Wang, Xiaodie Zhang, Ge Li, Biao Fu, Chenxu Li, Chao Yuan, Yang Yao, Andrew Chi-Chih Artificial Intelligence In this work, we investigate whether improving task clarity can enhance reasoning ability of large language models, focusing on theorem proving in Coq. We introduce a concept-level metric to evaluate task clarity and show that adding structured semantic context to the standard input used by modern LLMs, leads to a 1.85$\times$ improvement in clarity score (44.5\%~$\rightarrow$~82.3\%). Using the general-purpose model \texttt{DeepSeek-V3}, our approach leads to a 2.1$\times$ improvement in proof success (21.8\%~$\rightarrow$~45.8\%) and outperforms the previous state-of-the-art \texttt{Graph2Tac} (33.2\%). We evaluate this on 1,386 theorems randomly sampled from 15 standard Coq packages, following the same evaluation protocol as \texttt{Graph2Tac}. Furthermore, fine-tuning smaller models on our structured data can achieve even higher performance (48.6\%). Our method uses selective concept unfolding to enrich task descriptions, and employs a Planner--Executor architecture. These findings highlight the value of structured task representations in bridging the gap between understanding and reasoning.
title	Clarifying Before Reasoning: A Coq Prover with Structural Context
topic	Artificial Intelligence
url	https://arxiv.org/abs/2507.02541

Similar Items