| Main Authors: | Badertdinov, Ibragim, Nekrashevich, Maksim, Shevtsov, Anton, Golubev, Alexander |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.23866 |
Similar Items
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents
by: Badertdinov, Ibragim, et al.
Published: (2025)
Guided Search Strategies in Non-Serializable Environments with Applications to Software Engineering Agents
by: Zainullina, Karina, et al.
Published: (2025)
Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning
by: Golubev, Alexander, et al.
Published: (2025)
APEX-SWE
by: Kottamasu, Abhi, et al.
Published: (2026)
SWE-smith: Scaling Data for Software Engineering Agents
by: Yang, John, et al.
Published: (2025)
From SWE-ZERO to SWE-HERO: Execution-free to Execution-based Fine-tuning for Software Engineering Agents
by: Ludwig, Nikolai, et al.
Published: (2026)
SWE-bench Goes Live!
by: Zhang, Linghao, et al.
Published: (2025)
daVinci-Env: Open SWE Environment Synthesis at Scale
by: Fu, Dayuan, et al.
Published: (2026)
Training Software Engineering Agents and Verifiers with SWE-Gym
by: Pan, Jiayi, et al.
Published: (2024)
SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?
by: Deng, Xiang, et al.
Published: (2025)
SWE-QA: Can Language Models Answer Repository-level Code Questions?
by: Peng, Weihan, et al.
Published: (2025)
SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents
by: Wang, Yuhang, et al.
Published: (2026)
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents
by: Shetty, Manish, et al.
Published: (2025)
SWE-MERA: A Dynamic Benchmark for Agenticly Evaluating Large Language Models on Software Engineering Tasks
by: Adamenko, Pavel, et al.
Published: (2025)
SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
by: Jimenez, Carlos E., et al.
Published: (2023)
SWE-World: Building Software Engineering Agents in Docker-Free Environments
by: Sun, Shuang, et al.
Published: (2026)
SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development
by: Du, Yaxin, et al.
Published: (2025)
Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering
by: Zeng, Guangtao, et al.
Published: (2025)
SWE-Exp: Experience-Driven Software Issue Resolution
by: Chen, Silin, et al.
Published: (2025)
SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving
by: Tao, Chaofan, et al.
Published: (2026)
SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training
by: Song, Huatong, et al.
Published: (2026)
ORACLE-SWE: Quantifying the Contribution of Oracle Information Signals on SWE Agents
by: Li, Kenan, et al.
Published: (2026)
R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents
by: Jain, Naman, et al.
Published: (2025)
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving
by: Zan, Daoguang, et al.
Published: (2025)
Kimi-Dev: Agentless Training as Skill Prior for SWE-Agents
by: Yang, Zonghan, et al.
Published: (2025)
BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing?
by: Chen, Guoxin, et al.
Published: (2026)
BugPilot: Complex Bug Generation for Efficient Learning of SWE Skills
by: Sonwane, Atharv, et al.
Published: (2025)
SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution
by: Li, Han, et al.
Published: (2025)
UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench
by: Yu, Boxi, et al.
Published: (2025)
SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration
by: Chen, Jialong, et al.
Published: (2026)
SWE-Chain: Benchmarking Coding Agents on Chained Release-Level Package Upgrades
by: Lam, Man Ho, et al.
Published: (2026)
SWE-bench-java: A GitHub Issue Resolving Benchmark for Java
by: Zan, Daoguang, et al.
Published: (2024)
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
by: Yang, John, et al.
Published: (2024)
SWE-Protégé: Learning to Selectively Collaborate With an Expert Unlocks Small Language Models as Software Engineering Agents
by: Kon, Patrick Tser Jern, et al.
Published: (2026)
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
by: Wei, Yuxiang, et al.
Published: (2025)
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
by: Yang, John, et al.
Published: (2024)
SWE-Spot: Building Small Repo-Experts with Repository-Centric Learning
by: Peng, Jinjun, et al.
Published: (2026)
SWE-Adept: An LLM-Based Agentic Framework for Deep Codebase Analysis and Structured Issue Resolution
by: He, Kang, et al.
Published: (2026)
SWE-AGI: Benchmarking Specification-Driven Software Construction with MoonBit in the Era of Autonomous Agents
by: Zhang, Zhirui, et al.
Published: (2026)
HE-SNR: Uncovering Latent Logic via Entropy for Guiding Mid-Training on SWE-bench
by: Wang, Yueyang, et al.
Published: (2026)