| Main Authors: | Kapoor, Vansh, Gupta, Aman, Chen, Hao, Beniwal, Anurag, Huang, Jing, Kumar, Aviral |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2601.10245 |
Similar Items
Multilingual Information Retrieval with a Monolingual Knowledge Base
by: Zhuang, Yingying, et al.
Published: (2025)
DARD: A Multi-Agent Approach for Task-Oriented Dialog Systems
by: Gupta, Aman, et al.
Published: (2024)
How and Where to Translate? The Impact of Translation Strategies in Cross-lingual LLM Prompting
by: Gupta, Aman, et al.
Published: (2025)
Beyond Captioning: Task-Specific Prompting for Improved VLM Performance in Mathematical Reasoning
by: Singh, Ayush, et al.
Published: (2024)
StepWiser: Stepwise Generative Judges for Wiser Reasoning
by: Xiong, Wei, et al.
Published: (2025)
StepHint: Multi-level Stepwise Hints Enhance Reinforcement Learning to Reason
by: Zhang, Kaiyi, et al.
Published: (2025)
Self-Evaluating LLMs for Multi-Step Tasks: Stepwise Confidence Estimation for Failure Detection
by: Mavi, Vaibhav, et al.
Published: (2025)
Where Does Toxicity Live? Mechanistic Localization and Targeted Suppression in Language Models
by: Beniwal, Himanshu, et al.
Published: (2026)
Rubric-Guided Process Reward for Stepwise Model Routing
by: Ye, Shenghao, et al.
Published: (2026)
SAT: Balancing Reasoning Accuracy and Efficiency with Stepwise Adaptive Thinking
by: Huang, Weiyang, et al.
Published: (2026)
POPE: Learning to Reason on Hard Problems via Privileged On-Policy Exploration
by: Qu, Yuxiao, et al.
Published: (2026)
Leveraging the Power of Large Language Models in Entity Linking via Adaptive Routing and Targeted Reasoning
by: Li, Yajie, et al.
Published: (2025)
SAFE: Stepwise Atomic Feedback for Error correction in Multi-hop Reasoning
by: Kwon, Daeyong, et al.
Published: (2026)
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL
by: Zhou, Yifei, et al.
Published: (2024)
RLAC: Reinforcement Learning with Adversarial Critic for Free-Form Generation Tasks
by: Wu, Mian, et al.
Published: (2025)
InT: Self-Proposed Interventions Enable Credit Assignment in LLM Reasoning
by: Yang, Matthew Y. R., et al.
Published: (2026)
BEACON: Balancing Convenience and Nutrition in Meals With Long-Term Group Recommendations and Reasoning on Multimodal Recipes
by: Nagpal, Vansh, et al.
Published: (2024)
StepCodeReasoner: Aligning Code Reasoning with Stepwise Execution Traces via Reinforcement Learning
by: Wang, Hao, et al.
Published: (2026)
Dynamic Reasoning Chains through Depth-Specialized Mixture-of-Experts in Transformer Architectures
by: Roy, Sampurna, et al.
Published: (2025)
When Less is Enough: Efficient Inference via Collaborative Reasoning
by: Chen, Yilei, et al.
Published: (2026)
Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models
by: Cui, Yingqian, et al.
Published: (2025)
Mechanistic Interpretability of GPT-like Models on Summarization Tasks
by: Mishra, Anurag
Published: (2025)
ReasonBENCH: Benchmarking the (In)Stability of LLM Reasoning
by: Potamitis, Nearchos, et al.
Published: (2025)
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
by: Li, Zhuoqun, et al.
Published: (2024)
Multilingual Performance Biases of Large Language Models in Education
by: Gupta, Vansh, et al.
Published: (2025)
Stepwise Guided Policy Optimization: Coloring your Incorrect Reasoning in GRPO
by: Chen, Peter, et al.
Published: (2025)
Offline Reinforcement Learning for LLM Multi-Step Reasoning
by: Wang, Huaijie, et al.
Published: (2024)
nvBench 2.0: Resolving Ambiguity in Text-to-Visualization through Stepwise Reasoning
by: Luo, Tianqi, et al.
Published: (2025)
REIC: RAG-Enhanced Intent Classification at Scale
by: Zhang, Ziji, et al.
Published: (2025)
Reinforcement Learning for Diffusion LLMs with Entropy-Guided Step Selection and Stepwise Advantages
by: Kunde, Vishnu Teja, et al.
Published: (2026)
MEDEQUALQA: Evaluating Biases in LLMs with Counterfactual Reasoning
by: Ghosh, Rajarshi, et al.
Published: (2025)
Hierarchical Resolution Transformers: A Wavelet-Inspired Architecture for Multi-Scale Language Understanding
by: Sar, Ayan, et al.
Published: (2025)
Discovering Process-Outcome Credit in Multi-Step LLM Reasoning
by: Wang, Xiangwei, et al.
Published: (2026)
Stepwise Penalization for Length-Efficient Chain-of-Thought Reasoning
by: Li, Xintong, et al.
Published: (2026)
IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding
by: KJ, Sankalp, et al.
Published: (2025)
More Than a Quick Glance: Overcoming the Greedy Bias in KV-Cache Compression
by: Sood, Aryan, et al.
Published: (2026)
COMI-LINGUA: Expert Annotated Large-Scale Dataset for Multitask NLP in Hindi-English Code-Mixing
by: Sheth, Rajvee, et al.
Published: (2025)
PythonSaga: Redefining the Benchmark to Evaluate Code Generating LLMs
by: Yadav, Ankit, et al.
Published: (2024)
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task
by: Yan, Yuchen, et al.
Published: (2025)
Towards Stepwise Domain Knowledge-Driven Reasoning Optimization and Reflection Improvement
by: Liu, Chengyuan, et al.
Published: (2025)