Library Catalog

Bibliographic Details
Main Authors:	Liu, ShaoZhen; Huang, Xinting; Peng, Houwen; Chen, Xin; Song, Xinyang; Li, Qi; Sun, Zhenan
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2601.05616

Similar Items

Self-Evolving Curriculum for LLM Reasoning
by: Chen, Xiaoyin, et al.
Published: (2025)

MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning
by: Chen, Jinhao, et al.
Published: (2025)

Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning
by: Sun, Hai-Long, et al.
Published: (2025)

Empirical Study of Named Entity Recognition Performance Using Distribution-aware Word Embedding
by: Chen, Xin, et al.
Published: (2021)

R-Zero: Self-Evolving Reasoning LLM from Zero Data
by: Huang, Chengsong, et al.
Published: (2025)

Self-Evolved Preference Optimization for Enhancing Mathematical Reasoning in Small Language Models
by: Singh, Joykirat, et al.
Published: (2025)

TRINITY: An Evolved LLM Coordinator
by: Xu, Jinglue, et al.
Published: (2025)

EvolveMem: Self-Evolving Memory Architecture via AutoResearch for LLM Agents
by: Liu, Jiaqi, et al.
Published: (2026)

Self-Consolidation for Self-Evolving Agents
by: Yu, Hongzhuo, et al.
Published: (2026)

Data Trajectory Alignment for LLM Domain Adaptation: A Two-Phase Synthesis Framework for Telecommunications Mathematics
by: Zhou, Zhicheng, et al.
Published: (2025)

SEARL: Joint Optimization of Policy and Tool Graph Memory for Self-Evolving Agents
by: Feng, Xinshun, et al.
Published: (2026)

Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
by: Amiri, Alireza, et al.
Published: (2025)

SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning
by: Chen, Jiaqi, et al.
Published: (2025)

A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends
by: Gui, Jie, et al.
Published: (2023)

Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning
by: Xia, Peng, et al.
Published: (2025)

FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models
by: Yu, Zhouliang, et al.
Published: (2025)

CurveRL: Principled Distribution-Aware Context Reweighting for LLM Reasoning
by: Sun, Ke, et al.
Published: (2026)

Rethinking Expert Trajectory Utilization in LLM Post-training for Mathematical Reasoning
by: Ding, Bowen, et al.
Published: (2025)

Decomposing Representation Space into Interpretable Subspaces with Unsupervised Learning
by: Huang, Xinting, et al.
Published: (2025)

LLM-Enhanced Self-Evolving Reinforcement Learning for Multi-Step E-Commerce Payment Fraud Risk Detection
by: Qu, Bo, et al.
Published: (2025)

DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning
by: Li, Chengpeng, et al.
Published: (2024)

Reinforcing Chain-of-Thought Reasoning with Self-Evolving Rubrics
by: Sheng, Leheng, et al.
Published: (2026)

Preventing Curriculum Collapse in Self-Evolving Reasoning Systems
by: Mishra, Vaibhav
Published: (2026)

Explainable LLM Unlearning Through Reasoning
by: Liao, Junfeng, et al.
Published: (2026)

Diving into Self-Evolving Training for Multimodal Reasoning
by: Liu, Wei, et al.
Published: (2024)

Auto-Evolve: Enhancing Large Language Model's Performance via Self-Reasoning Framework
by: Aswani, Krishna, et al.
Published: (2024)

DSDR: Dual-Scale Diversity Regularization for Exploration in LLM Reasoning
by: Wan, Zhongwei, et al.
Published: (2026)

SELAUR: Self Evolving LLM Agent via Uncertainty-aware Rewards
by: Zhang, Dengjia, et al.
Published: (2026)

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning
by: Yang, Qi, et al.
Published: (2025)

Better LLM Reasoning via Dual-Play
by: Zhang, Zhengxin, et al.
Published: (2025)

Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning
by: Yu, Erxin, et al.
Published: (2025)

APEX: Autonomous Policy Exploration for Self-Evolving LLM Agents
by: Li, Yibo, et al.
Published: (2026)

Self-Evolving Critique Abilities in Large Language Models
by: Tang, Zhengyang, et al.
Published: (2025)

MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design
by: Zheng, Zhen, et al.
Published: (2024)

SenseFlow: A Physics-Informed and Self-Ensembling Iterative Framework for Power Flow Estimation
by: Zhao, Zhen, et al.
Published: (2025)

Temporal Reasoning with Large Language Models Augmented by Evolving Knowledge Graphs
by: Lin, Junhong, et al.
Published: (2025)

Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability
by: Chung, Tsz Ting, et al.
Published: (2024)

Is One Score Enough? Rethinking the Evaluation of Sequentially Evolving LLM Memory
by: Dong, Songwei, et al.
Published: (2026)

MathSight: A Benchmark Exploring Have Vision-Language Models Really Seen in University-Level Mathematical Reasoning?
by: Wang, Yuandong, et al.
Published: (2025)

RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold
by: Setlur, Amrith, et al.
Published: (2024)