| Main Authors: | Liu, Yanjiang, Lou, Jie, Guan, Xinyan, Ji, Yuqiu, Lin, Hongyu, He, Ben, Han, Xianpei, Sun, Le, Yu, Xing, Lu, Yaojie |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.30833 |
Similar Items
On-Policy Self-Alignment with Fine-grained Knowledge Feedback for Hallucination Mitigation
by: Wen, Xueru, et al.
Published: (2024)
Vision-OPD: Learning to See Fine Details for Multimodal LLMs via On-Policy Self-Distillation
by: Yuan, Qianhao, et al.
Published: (2026)
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
by: Guan, Xinyan, et al.
Published: (2024)
REInstruct: Building Instruction Data from Unlabeled Corpus
by: Chen, Shu, et al.
Published: (2024)
You Can't Fight in Here! This is BBS!
by: Futrell, Richard, et al.
Published: (2026)
SAISA: Towards Multimodal Large Language Models with Both Training and Inference Efficiency
by: Yuan, Qianhao, et al.
Published: (2025)
Coupled Variational Reinforcement Learning for Language Model General Reasoning
by: Wen, Xueru, et al.
Published: (2025)
Your Teaching Can't Help
by: Rebecca Weaver
Published: (2024)
Cheems: A Practical Guidance for Building and Evaluating Chinese Reward Models from Scratch
by: Wen, Xueru, et al.
Published: (2025)
The Devil Is in the Details: Tackling Unimodal Spurious Correlations for Generalizable Multimodal Reward Models
by: Li, Zichao, et al.
Published: (2025)
Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic
by: Zheng, Xin, et al.
Published: (2024)
Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models
by: Liu, Yanjiang, et al.
Published: (2025)
Learning from Failures: Correction-Oriented Policy Optimization with Verifiable Rewards
by: Ren, Mengjie, et al.
Published: (2026)
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides
by: Zheng, Hao, et al.
Published: (2025)
ConsistentChat: Building Skeleton-Guided Consistent Multi-Turn Dialogues for Large Language Models from Scratch
by: Chen, Jiawei, et al.
Published: (2025)
Expanding the Boundaries of Vision Prior Knowledge in Multi-modal Large Language Models
by: Liang, Qiao, et al.
Published: (2025)
Rule or Story, Which is a Better Commonsense Expression for Talking with Large Language Models?
by: Bian, Ning, et al.
Published: (2024)
DeepRAG: Thinking to Retrieve Step by Step for Large Language Models
by: Guan, Xinyan, et al.
Published: (2025)
When Models Outthink Their Safety: Unveiling and Mitigating Self-Jailbreak in Large Reasoning Models
by: Mao, Yingzhi, et al.
Published: (2025)
ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers
by: Yuan, Qianhao, et al.
Published: (2025)
You Can't Get a Library Card if You're Homeless.
Published: (1989)
Rethinking Reward Model Evaluation: Are We Barking up the Wrong Tree?
by: Wen, Xueru, et al.
Published: (2024)
You Can Help Your Country
by: Mayall, Berry, et al.
Published: (2021)
You Can't Get There From Here: Redefining Information Science to address our sociotechnical futures
by: Humr, Scott, et al.
Published: (2025)
LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?
by: Mo, Guozhao, et al.
Published: (2025)
LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks
by: Kambhampati, Subbarao, et al.
Published: (2024)
Beyond Isolated Dots: Benchmarking Structured Table Construction as Deep Knowledge Extraction
by: Zhong, Tianyun, et al.
Published: (2025)
You're Not from Around Here, Are You?
by: Blum, Louise A.
Published: (2023)
Executing Natural Language-Described Algorithms with Large Language Models: An Investigation
by: Zheng, Xin, et al.
Published: (2024)
Meta-Cognitive Analysis: Evaluating Declarative and Procedural Knowledge in Datasets and Large Language Models
by: Li, Zhuoqun, et al.
Published: (2024)
Open Grounded Planning: Challenges and Benchmark Construction
by: Guo, Shiguang, et al.
Published: (2024)
"If It's Not Here, I Can't Be Bothered...": Limiting Searches to In-House Journals.
by: Davidoff, Donna J., et al.
Published: (1991)
The Struggle You Can’t See
by: Lierman, Ash
Published: (2024)
P^2O: Joint Policy and Prompt Optimization
by: Lu, Xinyu, et al.
Published: (2026)
You Can't Get There from Here: Issues in Remote Access to Electronic Journals for a Health Sciences Library.
by: Krieb, Dennis
Published: (1999)
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
by: Men, Xin, et al.
Published: (2024)
ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models
by: Bian, Ning, et al.
Published: (2023)
MemSearcher: Training LLMs to Reason, Search and Manage Memory via End-to-End Reinforcement Learning
by: Yuan, Qianhao, et al.
Published: (2025)
ScaleBox: Enabling High-Fidelity and Scalable Code Verification for Large Language Models
by: Zheng, Jiasheng, et al.
Published: (2026)
Memorizing is Not Enough: Deep Knowledge Injection Through Reasoning
by: Xu, Ruoxi, et al.
Published: (2025)