:: Library Catalog

Bibliographic Details
Main Authors:	Yan, Jun; Huang, Weiquan; Zuo, Jiankai; Mo, Yujian; Fang, Xi; Wu, Chengliang; Wei, Zeming
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2605.26929

Similar Items

Fight Back Against Jailbreaking via Prompt Adversarial Tuning
by: Mo, Yichuan, et al.
Published: (2024)

When and Why Grouping Attention Heads Accelerates Muon Optimization
by: Zhang, Hongtao, et al.
Published: (2026)

AdaMuon: Adaptive Muon Optimizer
by: Si, Chongjie, et al.
Published: (2025)

Identifying and Understanding Cross-Class Features in Adversarial Training
by: Wei, Zeming, et al.
Published: (2025)

DiM-TS: Bridge the Gap between Selective State Space Models and Time Series for Generative Modeling
by: Yao, Zihao, et al.
Published: (2025)

Phases of Muon: When Muon Eclipses SignSGD
by: Paquette, Elliot, et al.
Published: (2026)

A Theoretical Understanding of Self-Correction through In-context Alignment
by: Wang, Yifei, et al.
Published: (2024)

On the Duality Between Sharpness-Aware Minimization and Adversarial Training
by: Zhang, Yihao, et al.
Published: (2024)

LiMuon: Light and Fast Muon Optimizer for Large Models
by: Huang, Feihu, et al.
Published: (2025)

When LLM Agents Meet Graph Optimization: An Automated Data Quality Improvement Approach
by: Zhang, Zhihan, et al.
Published: (2025)

Breaking Symmetry When Training Transformers
by: Zuo, Chunsheng, et al.
Published: (2024)

When RL Meets Adaptive Speculative Training: A Unified Training-Serving System
by: Wang, Junxiong, et al.
Published: (2026)

When Invariant Representation Learning Meets Label Shift: Insufficiency and Theoretical Insights
by: Luo, You-Wei, et al.
Published: (2024)

Adversarial Representation Engineering: A General Model Editing Framework for Large Language Models
by: Zhang, Yihao, et al.
Published: (2024)

NuMuon: Nuclear-Norm-Constrained Muon for Compressible LLM Training
by: Dolatabadi, Hadi Mohaghegh, et al.
Published: (2026)

MiMuon: Mixed Muon Optimizer with Improved Generalization for Large Models
by: Huang, Feihu, et al.
Published: (2026)

Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks: Theoretical and Empirical Evidence
by: Fu, Shaopeng, et al.
Published: (2025)

Enhancing Adversarial Training via Reweighting Optimization Trajectory
by: Huang, Tianjin, et al.
Published: (2023)

Muon is Scalable for LLM Training
by: Liu, Jingyuan, et al.
Published: (2025)

Democratic Training Against Universal Adversarial Perturbations
by: Sun, Bing, et al.
Published: (2025)

Decoding Large Language Diffusion Models with Foreseeing Movement
by: Mo, Yichuan, et al.
Published: (2025)

ReAct Meets ActRe: When Language Agents Enjoy Training Data Autonomy
by: Yang, Zonghan, et al.
Published: (2024)

Lions and Muons: Optimization via Stochastic Frank-Wolfe
by: Sfyraki, Maria-Eleni, et al.
Published: (2025)

Better Representations via Adversarial Training in Pre-Training: A Theoretical Perspective
by: Xing, Yue, et al.
Published: (2024)

When Pattern-by-Pattern Works: Theoretical and Empirical Insights for Logistic Models with Missing Values
by: Muller, Christophe, et al.
Published: (2025)

MONA: Muon Optimizer with Nesterov Acceleration for Scalable Language Model Training
by: Li, Jiacheng, et al.
Published: (2026)

LaMsS: When Large Language Models Meet Self-Skepticism
by: Wu, Yetao, et al.
Published: (2024)

On the Convergence Analysis of Muon
by: Shen, Wei, et al.
Published: (2025)

AdaGrad Meets Muon: Adaptive Stepsizes for Orthogonal Updates
by: Zhang, Minxin, et al.
Published: (2025)

When and Why Adversarial Training Improves PINNs: A Neural Tangent Kernel Perspective
by: Cao, Yuan-dong, et al.
Published: (2026)

Active Learning For Contextual Linear Optimization: A Margin-Based Approach
by: Liu, Mo, et al.
Published: (2023)

SignMuon: Communication-Efficient Distributed Muon Optimization
by: Mishra, Neel, et al.
Published: (2026)

Effective Quantization of Muon Optimizer States
by: Gupta, Aman, et al.
Published: (2025)

MuonQ: Enhancing Low-Bit Muon Quantization via Directional Fidelity Optimization
by: Su, Yupeng, et al.
Published: (2026)

Adversarial Instance Generation and Robust Training for Neural Combinatorial Optimization with Multiple Objectives
by: Liu, Wei, et al.
Published: (2026)

The Newton-Muon Optimizer
by: Du, Zhehang, et al.
Published: (2026)

A Dynamic Stiefel Graph Neural Network for Efficient Spatio-Temporal Time Series Forecasting
by: Zheng, Jiankai, et al.
Published: (2025)

On Mesa-Optimization in Autoregressively Trained Transformers: Emergence and Capability
by: Zheng, Chenyu, et al.
Published: (2024)

MuCon: Clipped Muon Updates for LLM Training
by: Yi, Albert
Published: (2026)

Information Theoretic Adversarial Training of Large Language Models
by: Zhang, Yiwei, et al.
Published: (2026)