:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Xichen; Wu, Sitong; Tan, Haoru; Yu, Shaozuo; Zhu, Yinghao; He, Ziyi; Jia, Jiaya
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2510.19767

Similar Items

Scaf-GRPO: Scaffolded Group Relative Policy Optimization for Enhancing LLM Reasoning
by: Zhang, Xichen, et al.
Published: (2025)

SearchGym: Bootstrapping Real-World Search Agents via Cost-Effective and High-Fidelity Environment Simulation
by: Zhang, Xichen, et al.
Published: (2026)

TraveLLaMA: A Multimodal Travel Assistant with Large-Scale Dataset and Structured Reasoning
by: Chu, Meng, et al.
Published: (2025)

MOODv2: Masked Image Modeling for Out-of-Distribution Detection
by: Li, Jingyao, et al.
Published: (2024)

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
by: Wang, Yue, et al.
Published: (2025)

VisionDirector: Vision-Language Guided Closed-Loop Refinement for Generative Image Synthesis
by: Chu, Meng, et al.
Published: (2025)

Data Pruning by Information Maximization
by: Tan, Haoru, et al.
Published: (2025)

Watch Wider and Think Deeper: Collaborative Cross-modal Chain-of-Thought for Complex Visual Reasoning
by: Lu, Wenting, et al.
Published: (2026)

DreamOmni2: Multimodal Instruction-based Editing and Generation
by: Xia, Bin, et al.
Published: (2025)

Between Underthinking and Overthinking: An Empirical Study of Reasoning Length and correctness in LLMs
by: Su, Jinyan, et al.
Published: (2025)

Dental-TriageBench: Benchmarking Multimodal Reasoning for Hierarchical Dental Triage
by: He, Ziyi, et al.
Published: (2026)

RoboCoder: Robotic Learning from Basic Skills to General Tasks with Large Language Models
by: Li, Jingyao, et al.
Published: (2024)

Unlocking Exploration in RLVR: Uncertainty-aware Advantage Shaping for Deeper Reasoning
by: Xie, Can, et al.
Published: (2025)

EFRame: Deeper Reasoning via Exploration-Filter-Replay Reinforcement Learning Framework
by: Wang, Chen, et al.
Published: (2025)

Ensemble Quadratic Assignment Network for Graph Matching
by: Tan, Haoru, et al.
Published: (2024)

OptimalThinkingBench: Evaluating Over and Underthinking in LLMs
by: Aggarwal, Pranjal, et al.
Published: (2025)

Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition
by: Zhong, Zhisheng, et al.
Published: (2024)

LLM-Driven Collaborative Model for Untangling Commits via Explicit and Implicit Dependency Reasoning
by: Hou, Bo, et al.
Published: (2025)

Logits-Based Finetuning
by: Li, Jingyao, et al.
Published: (2025)

MedAgentBoard: Benchmarking Multi-Agent Collaboration with Conventional Methods for Diverse Medical Tasks
by: Zhu, Yinghao, et al.
Published: (2025)

Efficient Long-Context Modeling in Diffusion Language Models via Block Approximate Sparse Attention
by: Zhang, Wenhu, et al.
Published: (2026)

VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning
by: Yang, Senqiao, et al.
Published: (2025)

Avoiding Overthinking and Underthinking: Curriculum-Aware Budget Scheduling for LLMs
by: Rahman, Amirul, et al.
Published: (2026)

ThoughtProbe: Classifier-Guided Thought Space Exploration Leveraging LLM Intrinsic Reasoning
by: Wang, Zijian, et al.
Published: (2025)

MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks
by: Li, Jingyao, et al.
Published: (2023)

Understanding Data Influence with Differential Approximation
by: Tan, Haoru, et al.
Published: (2025)

Evolving Deeper LLM Thinking
by: Lee, Kuang-Huei, et al.
Published: (2025)

Deeper Thought, Weaker Aim: Understanding and Mitigating Perceptual Impairment during Reasoning in Multimodal Large Language Models
by: Peng, Ruiying, et al.
Published: (2026)

A Comprehensive Survey of the Lean 4 Theorem Prover: Architecture, Applications, and Advances
by: Tang, Xichen
Published: (2025)

Stratified GRPO: Handling Structural Heterogeneity in Reinforcement Learning of LLM Search Agents
by: Zhu, Mingkang, et al.
Published: (2025)

Repulsor: Accelerating Generative Modeling with a Contrastive Memory Bank
by: Zhang, Shaofeng, et al.
Published: (2025)

From Noisy Traces to Stable Gradients: Bias-Variance Optimized Preference Optimization for Aligning Large Reasoning Models
by: Zhu, Mingkang, et al.
Published: (2025)

GRADEO: Towards Human-Like Evaluation for Text-to-Video Generation via Multi-Step Reasoning
by: Mou, Zhun, et al.
Published: (2025)

Efficient Reasoning via Thought-Training and Thought-Free Inference
by: Wu, Canhui, et al.
Published: (2025)

HealthFlow: A Self-Evolving AI Agent with Meta Planning for Autonomous Healthcare Research
by: Zhu, Yinghao, et al.
Published: (2025)

VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning
by: Liu, Yuqi, et al.
Published: (2025)

Deeper with Riemannian Geometry: Overcoming Oversmoothing and Oversquashing for Graph Foundation Models
by: Sun, Li, et al.
Published: (2025)

Enhancing LLM Knowledge Learning through Generalization
by: Zhu, Mingkang, et al.
Published: (2025)

Refine Thought: A Test-Time Inference Method for Embedding Model Reasoning
by: Wang, Guangzhi, et al.
Published: (2025)

ThoughtProbe: Classifier-Guided LLM Thought Space Exploration via Probing Representations
by: Wang, Zijian, et al.
Published: (2025)