Library Catalog

Bibliographic Details
Main Authors:	Liu, Liming; Huang, Binxuan; Zhang, Zixuan; Liu, Xin; Yin, Bing; Zhao, Tuo
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2601.06428

Similar Items

A Minimalist Example of Edge-of-Stability and Progressive Sharpening
by: Liu, Liming, et al.
Published: (2025)

Self-Rewarding PPO: Aligning Large Language Models with Demonstrations Only
by: Zhang, Qingru, et al.
Published: (2025)

Diffusion Model for Manifold Data: Score Decomposition, Curvature, and Statistical Complexity
by: Zhang, Zixuan, et al.
Published: (2026)

Look-Back: Implicit Visual Re-focusing in MLLM Reasoning
by: Yang, Shuo, et al.
Published: (2025)

FBQuant: FeedBack Quantization for Large Language Models
by: Liu, Yijiang, et al.
Published: (2025)

Back to the Future: Look-ahead Augmentation and Parallel Self-Refinement for Time Series Forecasting
by: Kim, Sunho, et al.
Published: (2026)

COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs
by: Liu, Liming, et al.
Published: (2025)

Expert Upcycling: Shifting the Compute-Efficient Frontier of Mixture-of-Experts
by: Dwivedi, Chaitanya, et al.
Published: (2026)

RED: Residual Estimation Diffusion for Low-Dose PET Sinogram Reconstruction
by: Ai, Xingyu, et al.
Published: (2024)

To the Noise and Back: Diffusion for Shared Autonomy
by: Yoneda, Takuma, et al.
Published: (2023)

Back-Projection Diffusion: Solving the Wideband Inverse Scattering Problem with Diffusion Models
by: Zhang, Borong, et al.
Published: (2024)

Look Back for More: Harnessing Historical Sequential Updates for Personalized Federated Adapter Tuning
by: Peng, Danni, et al.
Published: (2025)

Distilling Tool Knowledge into Language Models via Back-Translated Traces
by: Huang, Xingyue, et al.
Published: (2025)

Scalable Back-Propagation-Free Training of Optical Physics-Informed Neural Networks
by: Zhao, Yequan, et al.
Published: (2025)

Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient
by: Gao, Yuan, et al.
Published: (2024)

Trees to Flows and Back: Unifying Decision Trees and Diffusion Models
by: Ramachandran, Sai Niranjan, et al.
Published: (2026)

Self-Play Only Evolves When Self-Synthetic Pipeline Ensures Learnable Information Gain
by: Liu, Wei, et al.
Published: (2026)

NorMuon: Making Muon more efficient and scalable
by: Li, Zichong, et al.
Published: (2025)

Corrective Diffusion Language Models
by: Zhang, Shuibai, et al.
Published: (2025)

TurnBack: A Geospatial Route Cognition Benchmark for Large Language Models through Reverse Route
by: Luo, Hongyi, et al.
Published: (2025)

BackSlash: Rate Constrained Optimized Training of Large Language Models
by: Wu, Jun, et al.
Published: (2025)

Look Before Leap: Look-Ahead Planning with Uncertainty in Reinforcement Learning
by: Liu, Yongshuai, et al.
Published: (2025)

Investigating Regularization of Self-Play Language Models
by: Alami, Reda, et al.
Published: (2024)

Understanding Foundation Models: Are We Back in 1924?
by: Smeaton, Alan F.
Published: (2024)

$π$-Play: Multi-Agent Self-Play via Privileged Self-Distillation without External Data
by: Zhang, Yaocheng, et al.
Published: (2026)

Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training
by: Zhuang, Yuchen, et al.
Published: (2025)

Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data
by: Wang, Xinyi, et al.
Published: (2024)

SPACE: Noise Contrastive Estimation Stabilizes Self-Play Fine-Tuning for Large Language Models
by: Wang, Yibo, et al.
Published: (2025)

Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home
by: Moskvoretskii, Viktor, et al.
Published: (2025)

Adapters Strike Back
by: Steitz, Jan-Martin O., et al.
Published: (2024)

D3LM: A Discrete DNA Diffusion Language Model for Bidirectional DNA Understanding and Generation
by: Yang, Zhao, et al.
Published: (2026)

BatCoder: Self-Supervised Bidirectional Code-Documentation Learning via Back-Translation
by: Xu, Jingwen, et al.
Published: (2026)

Self-Rewarding Sequential Monte Carlo for Masked Diffusion Language Models
by: Luo, Ziwei, et al.
Published: (2026)

On the Convergence of Moral Self-Correction in Large Language Models
by: Liu, Guangliang, et al.
Published: (2025)

FishBack: Pullback Fisher Geometry for Optimal Activation Steering in Transformers
by: Wang, Sihan, et al.
Published: (2026)

You Only Look One Step: Accelerating Backpropagation in Diffusion Sampling with Gradient Shortcuts
by: Dou, Hongkun, et al.
Published: (2025)

Bringing Value Models Back: Generative Critics for Value Modeling in LLM Reinforcement Learning
by: Shan, Zikang, et al.
Published: (2026)

HeaPA: Difficulty-Aware Heap Sampling and On-Policy Query Augmentation for LLM Reinforcement Learning
by: Wang, Weiqi, et al.
Published: (2026)

Bidirectional Normalizing Flow: From Data to Noise and Back
by: Lu, Yiyang, et al.
Published: (2025)

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
by: Cheng, Jiale, et al.
Published: (2024)