| Main Authors: | Xu, Xiaoyu, Pan, Yulan, Yuan, Xiaosong, Shen, Zhihong, Su, Minghao, Su, Yuanhao, Zhang, Xiaofeng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.06695 |
Similar Items
On the Step Length Confounding in LLM Reasoning Data Selection
by: Wang, Bing, et al.
Published: (2026)
From Redundancy to Relevance: Information Flow in LVLMs Across Reasoning Tasks
by: Zhang, Xiaofeng, et al.
Published: (2024)
Where Does Reasoning Break? Step-Level Hallucination Detection via Hidden-State Transport Geometry
by: Alvarez, Tyler, et al.
Published: (2026)
LAMMI-Pathology: A Tool-Centric Bottom-Up LVLM-Agent Framework for Molecularly Informed Medical Intelligence in Pathology
by: Su, Haoyang, et al.
Published: (2026)
SalaMAnder: Shapley-based Mathematical Expression Attribution and Metric for Chain-of-Thought Reasoning
by: Xin, Yue, et al.
Published: (2025)
Where LLM Agents Fail and How They can Learn From Failures
by: Zhu, Kunlun, et al.
Published: (2025)
Efficient Reasoning Through Suppression of Self-Affirmation Reflections in Large Reasoning Models
by: Liu, Kaiyuan, et al.
Published: (2025)
FinSheet-Bench: From Simple Lookups to Complex Reasoning, Where LLMs Break on Financial Spreadsheets
by: Ravnik, Jan, et al.
Published: (2026)
Step-by-Step Reasoning to Solve Grid Puzzles: Where do LLMs Falter?
by: Tyagi, Nemika, et al.
Published: (2024)
ORIGAMISPACE: Benchmarking Multimodal LLMs in Multi-Step Spatial Reasoning with Mathematical Constraints
by: Xu, Rui, et al.
Published: (2025)
Improving Complex Reasoning with Dynamic Prompt Corruption: A soft prompt Optimization Approach
by: Fan, Sinan, et al.
Published: (2025)
CardioCoT: Hierarchical Reasoning for Multimodal Survival Analysis
by: Rui, Shaohao, et al.
Published: (2025)
Pseudo-Deliberation in Language Models: When Reasoning Fails to Align Values and Actions
by: Rakshit, Sushrita, et al.
Published: (2026)
Rewarding What Matters: Step-by-Step Reinforcement Learning for Task-Oriented Dialogue
by: Du, Huifang, et al.
Published: (2024)
Tracking the Limits of Knowledge Propagation: How LLMs Fail at Multi-Step Reasoning with Conflicting Knowledge
by: Feng, Yiyang, et al.
Published: (2026)
Where Do AI Coding Agents Fail? An Empirical Study of Failed Agentic Pull Requests in GitHub
by: Ehsani, Ramtin, et al.
Published: (2026)
Tools as Continuous Flow for Evolving Agentic Reasoning
by: Huang, Tairan, et al.
Published: (2026)
TRACE: Distilling Where It Matters via Token-Routed Self On-Policy Alignment
by: Wang, Jiaxuan, et al.
Published: (2026)
When Scaling Fails: Mitigating Audio Perception Decay of LALMs via Multi-Step Perception-Aware Reasoning
by: Mao, Ruixiang, et al.
Published: (2026)
Deployability-Centric Infrastructure-as-Code Generation: Fail, Learn, Refine, and Succeed through LLM-Empowered DevOps Simulation
by: Zhang, Tianyi, et al.
Published: (2025)
Beyond Meta-Reasoning: Metacognitive Consolidation for Self-Improving LLM Reasoning
by: Zhuang, Ziqing, et al.
Published: (2026)
Where Experts Disagree, Models Fail: Detecting Implicit Legal Citations in French Court Decisions
by: Floro, Avrile, et al.
Published: (2026)
ST-Mamba: Spatial-Temporal Mamba for Traffic Flow Estimation Recovery using Limited Data
by: Yuan, Doncheng, et al.
Published: (2024)
Where Paths Split: Localized, Calibrated Control of Moral Reasoning in Large Language Models
by: Yuan, Chenchen, et al.
Published: (2026)
The Long-Horizon Task Mirage? Diagnosing Where and Why Agentic Systems Break
by: Wang, Xinyu Jessica, et al.
Published: (2026)
AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions
by: Kirichenko, Polina, et al.
Published: (2025)
Boosting Deductive Reasoning with Step Signals In RLHF
by: Li, Jialian, et al.
Published: (2024)
LegalGraphRAG: Multi-Agent Graph Retrieval-Augmented Generation for Reliable Legal Reasoning
by: Chen, Zerui, et al.
Published: (2026)
ReasonIF: Large Reasoning Models Fail to Follow Instructions During Reasoning
by: Kwon, Yongchan, et al.
Published: (2025)
Do Latent-CoT Models Think Step-by-Step? A Mechanistic Study on Sequential Reasoning Tasks
by: Liang, Jia, et al.
Published: (2026)
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task
by: Yan, Yuchen, et al.
Published: (2025)
Stop Before You Fail: Operational Capability Boundaries for Mitigating Unproductive Reasoning in Large Reasoning Models
by: Zhang, Qingjie, et al.
Published: (2025)
LegalReasoner: Step-wised Verification-Correction for Legal Judgment Reasoning
by: Shi, Weijie, et al.
Published: (2025)
MedRule-KG: A Knowledge-Graph--Steered Scaffold for Reliable Mathematical and Biomedical Reasoning
by: Su, Crystal
Published: (2025)
Coarse-to-Fine Process Reward Modeling for Mathematical Reasoning
by: Hu, Yulan, et al.
Published: (2025)
Human-Inspired Continuous Learning of Internal Reasoning Processes: Learning How to Think for Adaptive AI Systems
by: Su, Hong
Published: (2026)
MedRule-KG: A Knowledge-Graph--Steered Scaffold for Mathematical Reasoning with a Lightweight Verifier
by: Su, Crystal
Published: (2025)
Reinforcement Learning Enhanced Multi-hop Reasoning for Temporal Knowledge Question Answering
by: Wen, Wuzhenghong, et al.
Published: (2026)
Improving Medical Reasoning with Curriculum-Aware Reinforcement Learning
by: Rui, Shaohao, et al.
Published: (2025)
Backtracking When It Strays: Mitigating Dual Exposure Biases in LLM Reasoning Distillation
by: Wang, Bing, et al.
Published: (2026)