Saved in:
Bibliographic Details
Main Authors: Zhang, Dylan, Wang, Justin, Sun, Tianran
Format: Preprint
Published: 2025
Subjects:
Online Access:https://arxiv.org/abs/2502.11901
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866909577870573568
author Zhang, Dylan
Wang, Justin
Sun, Tianran
author_facet Zhang, Dylan
Wang, Justin
Sun, Tianran
contents Existing LMs struggle with proof-oriented programming due to data scarcity, which manifest in two key ways: (1) a lack of sufficient corpora for proof-oriented programming languages such as F*, and (2) the absence of large-scale, project-level proof-oriented implementations that can teach the model the intricate reasoning process when performing proof-oriented programming. We present the first on synthetic data augmentation for project level proof oriented programming for both generation and repair. Our method addresses data scarcity by synthesizing basic proof-oriented programming problems for proficiency in that language; incorporating diverse coding data for reasoning capability elicitation and creating new proofs and repair data within existing repositories. This approach enables language models to both synthesize and repair proofs for function- and repository-level code. We show that our fine-tuned 14B parameter model, PoPilot, can exceed the performance of the models that outperforms GPT-4o in project-level proof-oriented programming by 64% relative margin, and can improve GPT-4o's performance by 54% by repairing its outputs over GPT-4o's self-repair.
format Preprint
id arxiv_https___arxiv_org_abs_2502_11901
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarcity
Zhang, Dylan
Wang, Justin
Sun, Tianran
Computation and Language
Programming Languages
Software Engineering
Existing LMs struggle with proof-oriented programming due to data scarcity, which manifest in two key ways: (1) a lack of sufficient corpora for proof-oriented programming languages such as F*, and (2) the absence of large-scale, project-level proof-oriented implementations that can teach the model the intricate reasoning process when performing proof-oriented programming. We present the first on synthetic data augmentation for project level proof oriented programming for both generation and repair. Our method addresses data scarcity by synthesizing basic proof-oriented programming problems for proficiency in that language; incorporating diverse coding data for reasoning capability elicitation and creating new proofs and repair data within existing repositories. This approach enables language models to both synthesize and repair proofs for function- and repository-level code. We show that our fine-tuned 14B parameter model, PoPilot, can exceed the performance of the models that outperforms GPT-4o in project-level proof-oriented programming by 64% relative margin, and can improve GPT-4o's performance by 54% by repairing its outputs over GPT-4o's self-repair.
title Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarcity
topic Computation and Language
Programming Languages
Software Engineering
url https://arxiv.org/abs/2502.11901