:: Library Catalog

Bibliographic Details
Main Authors:	Qi, Jianing, Tang, Hao, Zhu, Zhigang
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2410.08048

Similar Items

Q-NL Verifier: Leveraging Synthetic Data for Robust Knowledge Graph Question Answering
by: Schwabe, Tim, et al.
Published: (2025)

Policy Gradient Guidance Enables Test Time Control
by: Qi, Jianing, et al.
Published: (2025)

ATLAS: Adaptive Test-Time Latent Steering with External Verifiers for Enhancing LLMs Reasoning
by: Nguyen, Tuc, et al.
Published: (2026)

Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers
by: Seo, Wooseok, et al.
Published: (2025)

AutoPyVerifier: Learning Compact Executable Verifiers for Large Language Model Outputs
by: Pezeshkpour, Pouya, et al.
Published: (2026)

SCI-Verifier: Scientific Verifier with Thinking
by: Zheng, Shenghe, et al.
Published: (2025)

BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search
by: Sun, Linzhuang, et al.
Published: (2024)

TrimR: Verifier-based Training-Free Thinking Compression for Efficient Test-Time Scaling
by: Lin, Weizhe, et al.
Published: (2025)

Towards High Data Efficiency in Reinforcement Learning with Verifiable Reward
by: Tang, Xinyu, et al.
Published: (2025)

From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning
by: Huang, Yuzhen, et al.
Published: (2025)

Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning
by: Setlur, Amrith, et al.
Published: (2024)

Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
by: Huang, Guanhua, et al.
Published: (2025)

MM-Verify: Enhancing Multimodal Reasoning with Chain-of-Thought Verification
by: Sun, Linzhuang, et al.
Published: (2025)

From Reasoning Chains to Verifiable Subproblems: Curriculum Reinforcement Learning Enables Credit Assignment for LLM Reasoning
by: Jiang, Xitai, et al.
Published: (2026)

The Alignment Auditor: A Bayesian Framework for Verifying and Refining LLM Objectives
by: Bou, Matthieu, et al.
Published: (2025)

PVMark: Enabling Public Verifiability for LLM Watermarking Schemes
by: Duan, Haohua, et al.
Published: (2025)

References Improve LLM Alignment in Non-Verifiable Domains
by: Shi, Kejian, et al.
Published: (2026)

Alternating Reinforcement Learning for Rubric-Based Reward Modeling in Non-Verifiable LLM Post-Training
by: Xu, Ran, et al.
Published: (2026)

Silence the Judge: Reinforcement Learning with Self-Verifier via Latent Geometric Clustering
by: Zhang, Nonghai, et al.
Published: (2026)

When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning
by: Singhi, Nishad, et al.
Published: (2025)

HealthQ: Unveiling Questioning Capabilities of LLM Chains in Healthcare Conversations
by: Wang, Ziyu, et al.
Published: (2024)

Verifying Chain-of-Thought Reasoning via Its Computational Graph
by: Zhao, Zheng, et al.
Published: (2025)

Verifying the Robustness of Automatic Credibility Assessment
by: Przybyła, Piotr, et al.
Published: (2023)

Reinforcing General Reasoning without Verifiers
by: Zhou, Xiangxin, et al.
Published: (2025)

Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization
by: Zhou, Jin Peng, et al.
Published: (2024)

Let it Calm: Exploratory Annealed Decoding for Verifiable Reinforcement Learning
by: Yang, Chenghao, et al.
Published: (2025)

RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments
by: Zeng, Zhiyuan, et al.
Published: (2025)

NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning
by: Liu, Wei, et al.
Published: (2025)

Compress the Context, Keep the Commitments: A Formal Framework for Verifiable LLM Context Compression
by: Trukhina, Natalia, et al.
Published: (2026)

DeferMem: Query-Time Evidence Distillation via Reinforcement Learning for Long-Term Memory QA
by: Yin, Jianing, et al.
Published: (2026)

Verifying Computational Graphs in Production-Grade Distributed Machine Learning Frameworks
by: Zulkifli, Kahfi S., et al.
Published: (2025)

Transfer Q Star: Principled Decoding for LLM Alignment
by: Chakraborty, Souradip, et al.
Published: (2024)

Learning to Reason Across Parallel Samples for LLM Reasoning
by: Qi, Jianing, et al.
Published: (2025)

AutoPSV: Automated Process-Supervised Verifier
by: Lu, Jianqiao, et al.
Published: (2024)

vCache: Verified Semantic Prompt Caching
by: Schroeder, Luis Gaspar, et al.
Published: (2025)

On the Query Complexity of Verifier-Assisted Language Generation
by: Botta, Edoardo, et al.
Published: (2025)

FUSE: Ensembling Verifiers with Zero Labeled Data
by: Lee, Joonhyuk, et al.
Published: (2026)

Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training
by: Liu, Yixin, et al.
Published: (2026)

On the Ability of Transformers to Verify Plans
by: Sarrof, Yash, et al.
Published: (2026)

Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains
by: Gunjal, Anisha, et al.
Published: (2025)