:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Zhuohan, Zhu, Ziwei, Li, Ziniu, Chen, Congliang, Han, Yizhou, Lin, Yufeng, Lin, Zhihang, Gu, Angyang, Hu, Xinglin, Sun, Ruoyu, Ding, Tian
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2510.27610

Similar Items

Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation
by: Li, Ziniu, et al.
Published: (2025)

Bridging Formal Language with Chain-of-Thought Reasoning to Geometry Problem Solving
by: Yang, Tianyun, et al.
Published: (2025)

Why Transformers Need Adam: A Hessian Perspective
by: Zhang, Yushun, et al.
Published: (2024)

Preserving Diversity in Supervised Fine-Tuning of Large Language Models
by: Li, Ziniu, et al.
Published: (2024)

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
by: Li, Ziniu, et al.
Published: (2023)

Adam-mini: Use Fewer Learning Rates To Gain More
by: Zhang, Yushun, et al.
Published: (2024)

Rethinking Data Mixture for Large Language Models: A Comprehensive Survey and New Perspectives
by: Liu, Yajiao, et al.
Published: (2025)

MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning
by: Chen, Yupeng, et al.
Published: (2024)

Optimized Multi-Token Joint Decoding with Auxiliary Model for LLM Inference
by: Qin, Zongyue, et al.
Published: (2024)

RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques
by: Tang, Zhengyang, et al.
Published: (2025)

Adam Converges Without Any Modification On Update Rules
by: Zhang, Yushun, et al.
Published: (2026)

Changes of the Primary Cilia in Alzheimer's Disease Pathogenesis
by: Guo, Angyang, et al.
Published: (2025)

A Survey on Graph Neural Network Acceleration: Algorithms, Systems, and Customized Hardware
by: Zhang, Shichang, et al.
Published: (2023)

A Study in Markov Chains, Loop-Erased Random Walk and Loop Soups
by: Gu, Zhuohan
Published: (2024)

Teaching Language Models to Reason with Tools
by: Li, Chengpeng, et al.
Published: (2025)

Self-Evolving Critique Abilities in Large Language Models
by: Tang, Zhengyang, et al.
Published: (2025)

MMInA: Benchmarking Multihop Multimodal Internet Agents
by: Tian, Shulin, et al.
Published: (2024)

QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search
by: Lin, Zongyu, et al.
Published: (2025)

Cross-Modality Program Representation Learning for Electronic Design Automation with High-Level Synthesis
by: Qin, Zongyue, et al.
Published: (2024)

Automated Molecular Concept Generation and Labeling with Large Language Models
by: Zhang, Zimin, et al.
Published: (2024)

CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models
by: Lin, Zhihang, et al.
Published: (2025)

Theoretical and Empirical Insights into the Origins of Degree Bias in Graph Neural Networks
by: Subramonian, Arjun, et al.
Published: (2024)

SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models
by: Wang, Xiaoxuan, et al.
Published: (2023)

Sample Transform Cost-Based Training-Free Hallucination Detector for Large Language Models
by: Ding, Zeyang, et al.
Published: (2026)

RealFin: How Well Do LLMs Reason About Finance When Users Leave Things Unsaid?
by: Dai, Yuyang, et al.
Published: (2026)

CoRT: Code-integrated Reasoning within Thinking
by: Li, Chengpeng, et al.
Published: (2025)

Policy Optimization in RLHF: The Impact of Out-of-preference Data
by: Li, Ziniu, et al.
Published: (2023)

Beyond Progress Measures: Theoretical Insights into the Mechanism of Grokking
by: Gu, Zihan, et al.
Published: (2025)

Exploring and Improving Initialization for Deep Graph Neural Networks: A Signal Propagation Perspective
by: Wang, Senmiao, et al.
Published: (2025)

Batch-Instructed Gradient for Prompt Evolution: Systematic Prompt Optimization for Enhanced Text-to-Image Synthesis
by: Yang, Xinrui, et al.
Published: (2024)

Compressing Sequences in the Latent Embedding Space: $K$-Token Merging for Large Language Models
by: Xu, Zihao, et al.
Published: (2026)

Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference
by: Lin, Zhihang, et al.
Published: (2024)

Off-Policy Value-Based Reinforcement Learning for Large Language Models
by: Wang, Peng-Yuan, et al.
Published: (2026)

Exact Causal Attention with 10% Fewer Operations
by: Rybin, Dmitry, et al.
Published: (2025)

An Extended Space‐Time Network With Explicit Incompatibility Modelling for High‐Speed Railway Timetabling
by: Chen, Angyang, et al.
Published: (2025)

Speculative Decoding Reimagined for Multimodal Large Language Models
by: Lin, Luxi, et al.
Published: (2025)

Physics-Informed Regularization for Domain-Agnostic Dynamical System Modeling
by: Huang, Zijie, et al.
Published: (2024)

Stepwise Guided Policy Optimization: Coloring your Incorrect Reasoning in GRPO
by: Chen, Peter, et al.
Published: (2025)

The Paradox of Outcome Optimization: A Causal Information-Theoretic Bound on Reasoning Shortcuts in LLMs
by: Chen, Zihan, et al.
Published: (2026)

Direct Behavior Optimization: Unlocking the Potential of Lightweight LLMs
by: Yang, Hongming, et al.
Published: (2025)