| Main Authors: | Xia, Yuan; Atrey, Akanksha; Khmaissia, Fadoua; Namjoshi, Kedar S. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.20213 |
Similar Items
DivLogicEval: A Framework for Benchmarking Logical Reasoning Evaluation in Large Language Models
by: Chung, Tsz Ting, et al.
Published: (2025)
FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models
by: Yu, Zhouliang, et al.
Published: (2025)
Natural Building Blocks for Structured World Models: Theory, Evidence, and Scaling
by: Da Costa, Lancelot, et al.
Published: (2025)
A Multi-Faceted Evaluation Framework for Assessing Synthetic Data Generated by Large Language Models
by: Yuan, Yefeng, et al.
Published: (2024)
Learning to Generate Formally Verifiable Step-by-Step Logic Reasoning via Structured Formal Intermediaries
by: Chen, Luoxin, et al.
Published: (2026)
A Percolation Model of Emergence: Analyzing Transformers Trained on a Formal Language
by: Lubana, Ekdeep Singh, et al.
Published: (2024)
Provable Training Data Identification for Large Language Models
by: Liu, Zhenlong, et al.
Published: (2025)
Demystifying the Accuracy-Interpretability Trade-Off: A Case Study of Inferring Ratings from Reviews
by: Atrey, Pranjal, et al.
Published: (2025)
Rewarding Intellectual Humility Learning When Not To Answer In Large Language Models
by: Jha, Abha, et al.
Published: (2026)
vTrain: A Simulation Framework for Evaluating Cost-effective and Compute-optimal Large Language Model Training
by: Bang, Jehyeon, et al.
Published: (2023)
LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models
by: Hao, Qianyue, et al.
Published: (2025)
Formal Logic Enabled Personalized Federated Learning Through Property Inference
by: An, Ziyan, et al.
Published: (2024)
An Interpretable and Scalable Framework for Evaluating Large Language Models
by: Qu, Xinhao, et al.
Published: (2026)
LOGIN: A Large Language Model Consulted Graph Neural Network Training Framework
by: Qiao, Yiran, et al.
Published: (2024)
A Comprehensive Framework for Evaluating API-oriented Code Generation in Large Language Models
by: Wu, Yixi, et al.
Published: (2024)
GWT: Scalable Optimizer State Compression for Large Language Model Training
by: Wen, Ziqing, et al.
Published: (2025)
FormalProofBench: Can Models Write Graduate Level Math Proofs That Are Formally Verified?
by: Ravi, Nikil, et al.
Published: (2026)
Regurgitative Training: The Value of Real Data in Training Large Language Models
by: Zhang, Jinghui, et al.
Published: (2024)
Efficient Learning of Fuzzy Logic Systems for Large-Scale Data Using Deep Learning
by: Koklu, Ata, et al.
Published: (2024)
Reasoning-as-Logic-Units: Scaling Test-Time Reasoning in Large Language Models Through Logic Unit Alignment
by: Li, Cheryl, et al.
Published: (2025)
Evaluating the Formal Reasoning Capabilities of Large Language Models through Chomsky Hierarchy
by: Dong, Yihong, et al.
Published: (2026)
Learning from Generalization Patterns: An Evaluation-Driven Approach to Enhanced Data Augmentation for Fine-Tuning Small Language Models
by: Song, Huan, et al.
Published: (2025)
Training and Evaluating Language Models with Template-based Data Generation
by: Zhang, Yifan
Published: (2024)
SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond
by: Zhu, Xiangyang, et al.
Published: (2026)
Towards a Mechanistic Understanding of Propositional Logical Reasoning in Large Language Models
by: Chen, Danchun, et al.
Published: (2026)
Can Large Language Models Reason and Optimize Under Constraints?
by: Bernier, Fabien, et al.
Published: (2026)
Persuade Me if You Can: A Framework for Evaluating Persuasion Effectiveness and Susceptibility Among Large Language Models
by: Bozdag, Nimet Beyza, et al.
Published: (2025)
SODA: Protecting Proprietary Information in On-Device Machine Learning Models
by: Atrey, Akanksha, et al.
Published: (2023)
GraphEdit: Large Language Models for Graph Structure Learning
by: Guo, Zirui, et al.
Published: (2024)
Data-centric Federated Graph Learning with Large Language Models
by: Yan, Bo, et al.
Published: (2025)
Dynamic Adversarial Reinforcement Learning for Robust Multimodal Large Language Models
by: Bao, Yicheng, et al.
Published: (2026)
AtmosSci-Bench: Evaluating the Recent Advance of Large Language Model for Atmospheric Science
by: Li, Chenyue, et al.
Published: (2025)
OptProver: Bridging Olympiad and Optimization through Continual Training in Formal Theorem Proving
by: Li, Chenyi, et al.
Published: (2026)
Can Large Language Models Develop Strategic Reasoning? Post-training Insights from Learning Chess
by: Hwang, Dongyoon, et al.
Published: (2025)
MLB: A Scenario-Driven Benchmark for Evaluating Large Language Models in Clinical Applications
by: He, Qing, et al.
Published: (2026)
Escaping Collapse: The Strength of Weak Data for Large Language Model Training
by: Amin, Kareem, et al.
Published: (2025)
DiscoveryBench: Towards Data-Driven Discovery with Large Language Models
by: Majumder, Bodhisattwa Prasad, et al.
Published: (2024)
Engagement-Driven Content Generation with Large Language Models
by: Coppolillo, Erica, et al.
Published: (2024)
LogicTree: Structured Proof Exploration for Coherent and Rigorous Logical Reasoning with Large Language Models
by: He, Kang, et al.
Published: (2025)
A Training Data Recipe to Accelerate A* Search with Language Models
by: Gupta, Devaansh, et al.
Published: (2024)