| Main Authors: | Wang, Lu, Du, Chao, Zhao, Pu, Luo, Chuan, Zhu, Zhangchi, Qiao, Bo, Zhang, Wei, Lin, Qingwei, Rajmohan, Saravan, Zhang, Dongmei, Zhang, Qi |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2401.08690 |
Similar Items
LLM Reasoning as Trajectories: Step-Specific Representation Geometry and Correctness Signals
by: Sun, Lihao, et al.
Published: (2026)
Pretrain Value, Not Reward: Decoupled Value Policy Optimization
by: Huang, Chenghua, et al.
Published: (2025)
Learning to Refine: Self-Refinement of Parallel Reasoning in LLMs
by: Wang, Qibin, et al.
Published: (2025)
From Reasoning to Answer: Empirical, Attention-Based and Mechanistic Insights into Distilled DeepSeek R1 Models
by: Zhang, Jue, et al.
Published: (2025)
COIN: Chance-Constrained Imitation Learning for Uncertainty-aware Adaptive Resource Oversubscription Policy
by: Wang, Lu, et al.
Published: (2024)
Beyond State Consistency: Behavior Consistency in Text-Based World Models
by: Huang, Youling, et al.
Published: (2026)
Contrastive Attribution in the Wild: An Interpretability Analysis of LLM Failures on Realistic Benchmarks
by: Tan, Rongyuan, et al.
Published: (2026)
An Advanced Reinforcement Learning Framework for Online Scheduling of Deferrable Workloads in Cloud Computing
by: Dong, Hang, et al.
Published: (2024)
VEM: Environment-Free Exploration for Training GUI Agent with Value Environment Model
by: Zheng, Jiani, et al.
Published: (2025)
Self-Evolved Reward Learning for LLMs
by: Huang, Chenghua, et al.
Published: (2024)
Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation
by: Ding, Ruomeng, et al.
Published: (2023)
AXIS: Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents
by: Lu, Junting, et al.
Published: (2024)
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation
by: Hu, Mengkang, et al.
Published: (2024)
RuAG: Learned-rule-augmented Generation for Large Language Models
by: Zhang, Yudi, et al.
Published: (2024)
Zipage: Maintain High Request Concurrency for LLM Reasoning through Compressed PagedAttention
by: Liao, Mengqi, et al.
Published: (2026)
WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models
by: Feng, Huawen, et al.
Published: (2024)
Text2Grad: Reinforcement Learning from Natural Language Feedback
by: Wang, Hanyang, et al.
Published: (2025)
MEETING DELEGATE: Benchmarking LLMs on Attending Meetings on Our Behalf
by: Hu, Lingxiang, et al.
Published: (2025)
DoVer: Intervention-Driven Auto Debugging for LLM Multi-Agent Systems
by: Ma, Ming, et al.
Published: (2025)
AdaptFlow: Adaptive Workflow Optimization via Meta-Learning
by: Zhu, Runchuan, et al.
Published: (2025)
Token-level Proximal Policy Optimization for Query Generation
by: Ouyang, Yichen, et al.
Published: (2024)
ExeCoder: Empowering Large Language Models with Executability Representation for Code Translation
by: He, Minghua, et al.
Published: (2025)
Navigating the Unknown: A Chat-Based Collaborative Interface for Personalized Exploratory Tasks
by: Peng, Yingzhe, et al.
Published: (2024)
The Vision of Autonomic Computing: Can LLMs Make It a Reality?
by: Zhang, Zhiyang, et al.
Published: (2024)
RepoGenesis: Benchmarking End-to-End Microservice Generation from Readme to Repository
by: Peng, Zhiyuan, et al.
Published: (2026)
Revisiting VAE for Unsupervised Time Series Anomaly Detection: A Frequency Perspective
by: Wang, Zexin, et al.
Published: (2024)
Distill Not Only Data but Also Rewards: Can Smaller Language Models Surpass Larger Ones?
by: Zhang, Yudi, et al.
Published: (2025)
AI Delegates with a Dual Focus: Ensuring Privacy and Strategic Self-Disclosure
by: Zhang, Zhiyang, et al.
Published: (2024)
AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation
by: Fu, Jia, et al.
Published: (2024)
Sharingan: Extract User Action Sequence from Desktop Recordings
by: Chen, Yanting, et al.
Published: (2024)
UFO: A UI-Focused Agent for Windows OS Interaction
by: Zhang, Chaoyun, et al.
Published: (2024)
Privacy in Action: Towards Realistic Privacy Mitigation and Evaluation for LLM-Powered Agents
by: Wang, Shouju, et al.
Published: (2025)
CI-Work: Benchmarking Contextual Integrity in Enterprise LLM Agents
by: Fu, Wenjie, et al.
Published: (2026)
API Agents vs. GUI Agents: Divergence and Convergence
by: Zhang, Chaoyun, et al.
Published: (2025)
Enabling Autonomic Microservice Management through Self-Learning Agents
by: Yu, Fenglin, et al.
Published: (2025)
Multi-Preference Optimization: Generalizing DPO via Set-Level Contrasts
by: Gupta, Taneesh, et al.
Published: (2024)
Exploring Feature-based Knowledge Distillation for Recommender System: A Frequency Perspective
by: Zhu, Zhangchi, et al.
Published: (2024)
Why does Prediction Accuracy Decrease over Time? Uncertain Positive Learning for Cloud Failure Prediction
by: Li, Haozhe, et al.
Published: (2024)
UFO3: Weaving the Digital Agent Galaxy
by: Zhang, Chaoyun, et al.
Published: (2025)
EfficientRAG: Efficient Retriever for Multi-Hop Question Answering
by: Zhuang, Ziyuan, et al.
Published: (2024)