| Main Authors: | Lian, Shuquan, Liu, Juncheng, Chen, Yazhe, Chen, Yuhong, Li, Hui |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.11716 |
Similar Items
UI-AGILE: Advancing GUI Agents with Effective Reinforcement Learning and Precise Inference-Time Grounding
by: Lian, Shuquan, et al.
Published: (2025)
SWE-smith: Scaling Data for Software Engineering Agents
by: Yang, John, et al.
Published: (2025)
AutoContext: Instance-Level Context Learning for LLM Agents
by: Cai, Kuntai, et al.
Published: (2025)
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs
by: Zeng, Liang, et al.
Published: (2025)
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents
by: Shetty, Manish, et al.
Published: (2025)
Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering
by: Zeng, Guangtao, et al.
Published: (2025)
SWE-AGI: Benchmarking Specification-Driven Software Construction with MoonBit in the Era of Autonomous Agents
by: Zhang, Zhirui, et al.
Published: (2026)
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
by: Wei, Yuxiang, et al.
Published: (2025)
Kimi-Dev: Agentless Training as Skill Prior for SWE-Agents
by: Yang, Zonghan, et al.
Published: (2025)
SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration
by: Chen, Jialong, et al.
Published: (2026)
Toward Training Superintelligent Software Agents through Self-Play SWE-RL
by: Wei, Yuxiang, et al.
Published: (2025)
Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?
by: Xia, Chunqiu Steven, et al.
Published: (2025)
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
by: Yang, John, et al.
Published: (2024)
SWE-Protégé: Learning to Selectively Collaborate With an Expert Unlocks Small Language Models as Software Engineering Agents
by: Kon, Patrick Tser Jern, et al.
Published: (2026)
SWE-MERA: A Dynamic Benchmark for Agenticly Evaluating Large Language Models on Software Engineering Tasks
by: Adamenko, Pavel, et al.
Published: (2025)
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving
by: Zan, Daoguang, et al.
Published: (2025)
m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning with Large Language Models
by: Huang, Xiaoke, et al.
Published: (2025)
SWE-Bench++: A Framework for the Scalable Generation of Software Engineering Benchmarks from Open-Source Repositories
by: Wang, Lilin, et al.
Published: (2025)
Fuzzy Reasoning Chain (FRC): An Innovative Reasoning Framework from Fuzziness to Clarity
by: Chen, Ping, et al.
Published: (2025)
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
by: Yang, John, et al.
Published: (2024)
Context Pruning for Coding Agents via Multi-Rubric Latent Reasoning
by: Wang, Jingjing, et al.
Published: (2026)
APEX-SWE
by: Kottamasu, Abhi, et al.
Published: (2026)
SWE-Chain: Benchmarking Coding Agents on Chained Release-Level Package Upgrades
by: Lam, Man Ho, et al.
Published: (2026)
Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents
by: Shi, Yaorui, et al.
Published: (2025)
UltraLogic: Enhancing LLM Reasoning through Large-Scale Data Synthesis and Bipolar Float Reward
by: Liu, Yile, et al.
Published: (2026)
RISK: A Framework for GUI Agents in E-commerce Risk Management
by: Chen, Renqi, et al.
Published: (2025)
AgentSwing: Adaptive Parallel Context Management Routing for Long-Horizon Web Agents
by: Feng, Zhaopeng, et al.
Published: (2026)
HarmonyGuard: Toward Safety and Utility in Web Agents via Adaptive Policy Enhancement and Dual-Objective Optimization
by: Chen, Yurun, et al.
Published: (2025)
Dynamic Long Context Reasoning over Compressed Memory via End-to-End Reinforcement Learning
by: Chen, Zhuoen, et al.
Published: (2026)
Cross-Lingual Consistency: A Novel Inference Framework for Advancing Reasoning in Large Language Models
by: Yu, Zhiwei, et al.
Published: (2025)
Table-Critic: A Multi-Agent Framework for Collaborative Criticism and Refinement in Table Reasoning
by: Yu, Peiying, et al.
Published: (2025)
SWE-bench Goes Live!
by: Zhang, Linghao, et al.
Published: (2025)
COMPASS: Enhancing Agent Long-Horizon Reasoning with Evolving Context
by: Wan, Guangya, et al.
Published: (2025)
Learning to Break: Knowledge-Enhanced Reasoning in Multi-Agent Debate System
by: Wang, Haotian, et al.
Published: (2023)
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning
by: Hu, Zhiyuan, et al.
Published: (2026)
Anti-Length Shift: Dynamic Outlier Truncation for Training Efficient Reasoning Models
by: Wu, Wei, et al.
Published: (2026)
Cite Before You Speak: Enhancing Context-Response Grounding in E-commerce Conversational LLM-Agents
by: Zeng, Jingying, et al.
Published: (2025)
AgentFold: Long-Horizon Web Agents with Proactive Context Management
by: Ye, Rui, et al.
Published: (2025)
Dynamic Thinking-Token Selection for Efficient Reasoning in Large Reasoning Models
by: Guo, Zhenyuan, et al.
Published: (2026)
Agents in Software Engineering: Survey, Landscape, and Vision
by: Wang, Yanlin, et al.
Published: (2024)