| Main Authors: | Chen, Ziru, White, Michael, Mooney, Raymond, Payani, Ali, Su, Yu, Sun, Huan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2402.10890 |
Similar Items
LiteSearch: Efficacious Tree Search for LLM
by: Wang, Ante, et al.
Published: (2024)
Program Semantic Inequivalence Game with Large Language Models
by: Miceli-Barone, Antonio Valerio, et al.
Published: (2025)
Bridging Online and Offline RL: Contextual Bandit Learning for Multi-Turn Code Generation
by: Chen, Ziru, et al.
Published: (2026)
Language Ranker: A Metric for Quantifying LLM Performance Across High and Low-Resource Languages
by: Li, Zihao, et al.
Published: (2024)
Rep2Text: Decoding Full Text from a Single LLM Token Representation
by: Zhao, Haiyan, et al.
Published: (2025)
D3-Gym: Constructing Real-World Verifiable Environments for Data-Driven Discovery
by: Moussa, Hanane Nour, et al.
Published: (2026)
Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution
by: Zhao, Haiyan, et al.
Published: (2024)
Planning In Natural Language Improves LLM Search For Code Generation
by: Wang, Evan, et al.
Published: (2024)
Joint Detection of Fraud and Concept Drift in Online Conversations with LLM-Assisted Judgment
by: Senol, Ali, et al.
Published: (2025)
Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning
by: Xu, Yixuan Even, et al.
Published: (2025)
SAIF: A Sparse Autoencoder Framework for Interpreting and Steering Instruction Following of Language Models
by: He, Zirui, et al.
Published: (2025)
When Refusals Fail: Unstable Safety Mechanisms in Long-Context LLM Agents
by: Hadeliya, Tsimur, et al.
Published: (2025)
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
by: Chen, Ziru, et al.
Published: (2024)
Stratified GRPO: Handling Structural Heterogeneity in Reinforcement Learning of LLM Search Agents
by: Zhu, Mingkang, et al.
Published: (2025)
Why Reasoning Fails to Plan: A Planning-Centric Analysis of Long-Horizon Decision Making in LLM Agents
by: Wang, Zehong, et al.
Published: (2026)
When Greedy Wins: Emergent Exploitation Bias in Meta-Bandit LLM Training
by: Chen, Sanxing, et al.
Published: (2025)
Are LLM Agents Behaviorally Coherent? Latent Profiles for Social Simulation
by: Mooney, James, et al.
Published: (2025)
PerPO: Perceptual Preference Optimization via Discriminative Rewarding
by: Zhu, Zining, et al.
Published: (2025)
The Emperor's New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination
by: Sun, Yifan, et al.
Published: (2025)
Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay
by: Sun, Yifan, et al.
Published: (2025)
Rethinking the Unsolvable: When In-Context Search Meets Test-Time Scaling
by: Xia, Fanzeng, et al.
Published: (2025)
When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning
by: Singhi, Nishad, et al.
Published: (2025)
Hypothesis-Conditioned Query Rewriting for Decision-Useful Retrieval
by: Chang, Hangeol, et al.
Published: (2026)
A2P-Vis: an Analyzer-to-Presenter Agentic Pipeline for Visual Insights Generation and Reporting
by: Gan, Shuyu, et al.
Published: (2025)
Tree Search for Language Model Agents
by: Koh, Jing Yu, et al.
Published: (2024)
DOGe: Defensive Output Generation for LLM Protection Against Knowledge Distillation
by: Li, Pingzhi, et al.
Published: (2025)
WebOperator: Action-Aware Tree Search for Autonomous Agents in Web Environment
by: Dihan, Mahir Labib, et al.
Published: (2025)
CaRT: Teaching LLM Agents to Know When They Know Enough
by: Liu, Grace, et al.
Published: (2025)
STAC: When Innocent Tools Form Dangerous Chains to Jailbreak LLM Agents
by: Li, Jing-Jing, et al.
Published: (2025)
Preference Leakage: A Contamination Problem in LLM-as-a-judge
by: Li, Dawei, et al.
Published: (2025)
Towards LLM-guided Causal Explainability for Black-box Text Classifiers
by: Bhattacharjee, Amrita, et al.
Published: (2023)
Budget-aware Test-time Scaling via Discriminative Verification
by: Montgomery, Kyle, et al.
Published: (2025)
MPO: Boosting LLM Agents with Meta Plan Optimization
by: Xiong, Weimin, et al.
Published: (2025)
Does The Way You Plan Matter? An Empirical Study of Planning Representations for LLM Web Agents
by: Zambrano, Alejandra, et al.
Published: (2026)
When LLM Judge Scores Look Good but Best-of-N Decisions Fail
by: Landesberg, Eddie
Published: (2026)
Context-Enhanced Contrastive Search for Improved LLM Text Generation
by: Sen, Jaydip, et al.
Published: (2025)
Zero-shot LLM-guided Counterfactual Generation: A Case Study on NLP Model Evaluation
by: Bhattacharjee, Amrita, et al.
Published: (2024)
Synthesizing Programmatic Reinforcement Learning Policies with Large Language Model Guided Search
by: Liu, Max, et al.
Published: (2024)
Advancing LLM Reasoning Generalists with Preference Trees
by: Yuan, Lifan, et al.
Published: (2024)
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
by: Zhou, Andy, et al.
Published: (2023)