| Main Authors: | Li, Yanming, Zhang, Xuelin, Lu, WenJie, Tang, Ziye, Wu, Maodong, Luo, Haotian, Wu, Tongtong, Peng, Zijie, Mi, Hongze, Feng, Yibo, Tan, Naiqiang, Huang, Chao, Chen, Hong, Shen, Li |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.08335 |
Similar Items
Darwinian Memory: A Training-Free Self-Regulating Memory System for GUI Agent Evolution
by: Mi, Hongze, et al.
Published: (2026)
D-Artemis: A Deliberative Cognitive Framework for Mobile GUI Multi-Agents
by: Mi, Hongze, et al.
Published: (2025)
UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios
by: Luo, Haotian, et al.
Published: (2025)
Who Deserves the Credit for Lower Unemployment? Structural Monetary Policy Tools and Corporate Labour Employment in China
by: Xue Li, et al.
Published: (2025)
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
by: Luo, Haotian, et al.
Published: (2025)
Comparative Outcomes of Different Surgical Approaches for Non‐Lactational Mastitis With Posterior and Non‐Posterior Space Abscesses: A Retrospective Cohort Study
by: WenJie Zhang, et al.
Published: (2025)
Ada-R1: Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization
by: Luo, Haotian, et al.
Published: (2025)
Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation
by: Wang, Yibo, et al.
Published: (2025)
CARD: Towards Conditional Design of Multi-agent Topological Structures
by: Wu, Tongtong, et al.
Published: (2026)
Agent-Omit: Adaptive Context Omission for Efficient LLM Agents
by: Ning, Yansong, et al.
Published: (2026)
Shapley-Coop: Credit Assignment for Emergent Cooperation in Self-Interested LLM Agents
by: Hua, Yun, et al.
Published: (2025)
What Deserves Memory: Adaptive Memory Distillation for LLM Agents
by: Ma, Wenquan, et al.
Published: (2025)
R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search
by: Wang, Yibo, et al.
Published: (2025)
Bag of Tricks for Inference-time Computation of LLM Reasoning
by: Liu, Fan, et al.
Published: (2025)
Improving DAPO from a Mixed-Policy Perspective
by: Tan, Hongze, et al.
Published: (2025)
Retrieval, Reward, and Training Protocols: What Matters in Training Search Agents?
by: Zhao, Yibo, et al.
Published: (2026)
Which Contributions Deserve Credit? Perceptions of Attribution in Human-AI Co-Creation
by: He, Jessica, et al.
Published: (2025)
FusionRegister: Every Infrared and Visible Image Fusion Deserves Registration
by: Bian, Congcong, et al.
Published: (2026)
Prompt-Driven Low-Altitude Edge Intelligence: Modular Agents and Generative Reasoning
by: You, Jiahao, et al.
Published: (2026)
Enhancing Target-Guided Proactive Dialogue Systems via Conversational Scenario Modeling and Intent-Keyword Bridging
by: Li, Maodong, et al.
Published: (2026)
Who Deserves to Stay? Latent Profiles of Public Perceptions of Migrant Deservingness in Turkey
by: Dilara Turgut, et al.
Published: (2025)
Multi-Hop Question Generation via Dual-Perspective Keyword Guidance
by: Li, Maodong, et al.
Published: (2025)
SCAR: Shapley Credit Assignment for More Efficient RLHF
by: Cao, Meng, et al.
Published: (2025)
Hindsight Credit Assignment for Long-Horizon LLM Agents
by: Tan, Hui-Ze, et al.
Published: (2026)
Rewarding Beliefs, Not Actions: Consistency-Guided Credit Assignment for Long-Horizon Agents
by: Tang, Wenjie, et al.
Published: (2026)
Is Your Prompt Poisoning Code? Defect Induction Rates and Security Mitigation Strategies
by: Wang, Bin, et al.
Published: (2025)
SHARP: A Self-Evolving Human-Auditable Rubric Policy for Financial Trading Agents
by: Chen, Xiwen, et al.
Published: (2026)
A Historical Interaction-Enhanced Shapley Policy Gradient Algorithm for Multi-Agent Credit Assignment
by: Ding, Ao, et al.
Published: (2025)
Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning
by: Cheng, Jie, et al.
Published: (2025)
Not All Thoughts are Generated Equal: Efficient LLM Reasoning via Multi-Turn Reinforcement Learning
by: Ning, Yansong, et al.
Published: (2025)
SHARP: Experiences in Library Automation
by: Smith, Ruth Camp
Published: (1974)
Deserving the Option to Give
by: Jeffrey Moriarty
Published: (2024)
Random ISAC Signals Deserve Dedicated Precoding
by: Lu, Shihang, et al.
Published: (2023)
Your Demands Deserve More Bits: Referring Semantic Image Compression at Ultra-low Bitrate
by: Wu, Chenhao, et al.
Published: (2025)
Who Deserves Scarce Health and Education Resources? How Policy Context Shapes Target Group Deservingness
by: Elizabeth Bell, et al.
Published: (2025)
Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward
by: Wan, Yanming, et al.
Published: (2025)
GTPO and GRPO-S: Token and Sequence-Level Reward Shaping with Policy Entropy
by: Tan, Hongze, et al.
Published: (2025)
Pseudo-Siamese Network for Planning in Target-Oriented Proactive Dialogues
by: Kang, Xinyue, et al.
Published: (2026)
After Returning to the Rural: The (Un)Sustainable Reintegration of Internal Migrant Workers in China
by: Mengyao Cheng, et al.
Published: (2025)
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards
by: Zhang, Kaiyi, et al.
Published: (2026)