| Main Authors: | An, Sohyun, Yuan, Shuibenyang, Lee, Hayeon, Hsieh, Cho-Jui, Min, Alexander |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2604.12967 |
Similar Items
FRESCO: Benchmarking and Optimizing Re-rankers for Evolving Semantic Conflict in Retrieval-Augmented Generation
by: An, Sohyun, et al.
Published: (2026)
Don't Think Longer, Think Wisely: Optimizing Thinking Dynamics for Large Reasoning Models
by: An, Sohyun, et al.
Published: (2025)
AI Co-Scientist for Ranking: Discovering Novel Search Ranking Models alongside LLM-based AI Agents with Cloud Computing Access
by: Wu, Liwei, et al.
Published: (2026)
T-MAP: Red-Teaming LLM Agents with Trajectory-aware Evolutionary Search
by: Lee, Hyomin, et al.
Published: (2026)
Unlabeled Data Improves Fine-Grained Image Zero-shot Classification with Multimodal LLMs
by: Hong, Yunqi, et al.
Published: (2025)
One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts
by: Wang, Ruochen, et al.
Published: (2024)
IRIS: Intrinsic Reward Image Synthesis
by: Chen, Yihang, et al.
Published: (2025)
SmartSearch: Process Reward-Guided Query Refinement for Search Agents
by: Wen, Tongyu, et al.
Published: (2026)
AutoRubric-T2I: Robust Rule-Based Reward Model for Text-to-Image Alignment
by: Kao, Kuei-Chun, et al.
Published: (2026)
Certified Training with Branch-and-Bound for Lyapunov-stable Neural Control
by: Shi, Zhouxing, et al.
Published: (2024)
Exploring Expert Failures Improves LLM Agent Tuning
by: Lan, Li-Cheng, et al.
Published: (2025)
Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review
by: Prakriya, Neha, et al.
Published: (2024)
DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
by: Li, Xirui, et al.
Published: (2024)
ClawEnvKit: Automatic Environment Generation for Claw-Like Agents
by: Li, Xirui, et al.
Published: (2026)
Beyond Outcome Reward: Decoupling Search and Answering Improves LLM Agents
by: Wang, Yiding, et al.
Published: (2025)
OptiProxy-NAS: Optimization Proxy based End-to-End Neural Architecture Search
by: Lyu, Bo, et al.
Published: (2025)
QG-CoC: Question-Guided Chain-of-Captions for Large Multimodal Models
by: Kao, Kuei-Chun, et al.
Published: (2025)
Solving for X and Beyond: Can Large Language Models Solve Complex Math Problems with More-Than-Two Unknowns?
by: Kao, Kuei-Chun, et al.
Published: (2024)
Defending LLMs against Jailbreaking Attacks via Backtranslation
by: Wang, Yihan, et al.
Published: (2024)
InfoFlow: Reinforcing Search Agent Via Reward Density Optimization
by: Luo, Kun, et al.
Published: (2025)
Expanding Search Space with Diverse Prompting Agents: An Efficient Sampling Approach for LLM Mathematical Reasoning
by: Lee, Gisang, et al.
Published: (2024)
RF-Agent: Automated Reward Function Design via Language Agent Tree Search
by: Gao, Ning, et al.
Published: (2026)
Enhancing Large Language Models with Reward-guided Tree Search for Knowledge Graph Question and Answering
by: Long, Xiao, et al.
Published: (2025)
UniDEC : Unified Dual Encoder and Classifier Training for Extreme Multi-Label Classification
by: Kharbanda, Siddhant, et al.
Published: (2024)
Enhancing LLM Reasoning with Reward-guided Tree Search
by: Jiang, Jinhao, et al.
Published: (2024)
OR-Bench: An Over-Refusal Benchmark for Large Language Models
by: Cui, Justin, et al.
Published: (2024)
One-Forcing: Towards Stable One-Step Autoregressive Video Generation
by: Feng, Jiaqi, et al.
Published: (2026)
ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions
by: Ye, Jingheng, et al.
Published: (2024)
Plan Before Search: Search Agents Need Plan
by: Qian, Zhipeng, et al.
Published: (2026)
Dominating vs. Dominated: Generative Collapse in Diffusion Models
by: Jeong, Hayeon, et al.
Published: (2025)
Dr. Zero: Self-Evolving Search Agents without Training Data
by: Yue, Zhenrui, et al.
Published: (2026)
DecoupleSearch: Decouple Planning and Search via Hierarchical Reward Modeling
by: Sun, Hao, et al.
Published: (2025)
ARGS: Alignment as Reward-Guided Search
by: Khanov, Maxim, et al.
Published: (2024)
Mitigating Bias in Dataset Distillation
by: Cui, Justin, et al.
Published: (2024)
When Is Enough Not Enough? Illusory Completion in Search Agents
by: Ko, Dayoon, et al.
Published: (2026)
IG-Search: Step-Level Information Gain Rewards for Search-Augmented Reasoning
by: Liang, Zihan, et al.
Published: (2026)
Bidirectional Bounded-Suboptimal Heuristic Search with Consistent Heuristics
by: Shperberg, Shahaf S., et al.
Published: (2025)
Simulating Human Audiovisual Search Behavior
by: Cho, Hyunsung, et al.
Published: (2026)
LLM-Based Offline Learning for Embodied Agents via Consistency-Guided Reward Ensemble
by: Lee, Yujeong, et al.
Published: (2024)
Toward Scalable Verifiable Reward: Proxy State-Based Evaluation for Multi-turn Tool-Calling LLM Agents
by: Chuang, Yun-Shiuan, et al.
Published: (2026)