:: Library Catalog

Bibliographic Details
Main Authors:	Dang, Sizhe; Shao, Jiaqi; Zheng, Xiaodong; Dai, Guang; Song, Yan; Ye, Haishan
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.08007

Similar Items

FZOO: Fast Zeroth-Order Optimizer for Fine-Tuning Large Language Models towards Adam-Scale Speed
by: Dang, Sizhe, et al.
Published: (2025)

ESSAM: A Novel Competitive Evolution Strategies Approach to Reinforcement Learning for Memory Efficient LLMs Fine-Tuning
by: Sun, Zhishen, et al.
Published: (2026)

Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer
by: Zhao, Yanjun, et al.
Published: (2024)

MSCR: Exploring the Vulnerability of LLMs' Mathematical Reasoning Abilities Using Multi-Source Candidate Replacement
by: Sun, Zhishen, et al.
Published: (2025)

Explicit and Non-asymptotic Query Complexities of Rank-Based Zeroth-order Algorithms on Smooth Functions
by: Ye, Haishan
Published: (2025)

Explicit and Non-asymptotic Query Complexities of Rank-Based Zeroth-order Algorithm on Stochastic Smooth Functions
by: Ye, Haishan
Published: (2025)

Optimal High-Probability Regret for Online Convex Optimization with Two-Point Bandit Feedback
by: Ye, Haishan
Published: (2026)

Breaking the Prompt Wall (I): A Real-World Case Study of Attacking ChatGPT via Lightweight Prompt Injection
by: Chang, Xiangyu, et al.
Published: (2025)

On the Convergence of Single-Loop Stochastic Bilevel Optimization with Approximate Implicit Differentiation
by: Zhou, Yubo, et al.
Published: (2026)

Numerical Sensitivity and Robustness: Exploring the Flaws of Mathematical Reasoning in Large Language Models
by: Sun, Zhishen, et al.
Published: (2025)

Breaking the O(mn)-Time Barrier for Vertex-Weighted Global Minimum Cut
by: Chuzhoy, Julia, et al.
Published: (2025)

Additive One Approximation for Minimum Degree Spanning Tree: Breaking the $O(mn)$ Time Barrier
by: Bhattacharya, Sayan, et al.
Published: (2026)

High-Probability Guarantees for Random Zeroth-Order Gradient Descent on Smooth Functions
by: Ye, Haishan
Published: (2026)

High-Probability Guarantees for Random Zeroth-Order (Stochastic) Gradient Descent
by: Ye, Haishan
Published: (2026)

Riemannian Momentum Tracking: Distributed Optimization with Momentum on Compact Submanifolds
by: Chen, Jun, et al.
Published: (2026)

Zero-Order Sharpness-Aware Minimization
by: Fu, Yao, et al.
Published: (2025)

Near-Optimal Distributed Minimax Optimization under the Second-Order Similarity
by: Zhou, Qihao, et al.
Published: (2024)

Adapprox: Adaptive Approximation in Adam Optimization via Randomized Low-Rank Matrices
by: Zhao, Pengxiang, et al.
Published: (2024)

AB-Training: A Communication-Efficient Approach for Distributed Low-Rank Learning
by: Coquelin, Daniel, et al.
Published: (2024)

Convergence Rate Analysis of the AdamW-Style Shampoo: Unifying One-Sided and Two-Sided Preconditioning
by: Li, Huan, et al.
Published: (2026)

Photocatalytic Radical‐Polar Crossover Enables Modular Access to Bicyclo[2.m.n]Alkane Alcohol Bioisosteres
by: Hui Ran, et al.
Published: (2026)

Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient
by: Di, Hao, et al.
Published: (2024)

FlashSVD: Memory-Efficient Inference with Streaming for Low-Rank Models
by: Shao, Zishan, et al.
Published: (2025)

Low-Communication Resilient Distributed Estimation Algorithm Based on Memory Mechanism
by: Li, Wei, et al.
Published: (2025)

High‐Performance Monolayer 1T‐GeO₂ Transistors with Low‐Resistance Metal Contacts
by: Shuai Lang, et al.
Published: (2025)

Privacy Leaks by Adversaries: Adversarial Iterations for Membership Inference Attack
by: Xue, Jing, et al.
Published: (2025)

From Condensation to Rank Collapse: A Two-Stage Analysis of Transformer Training Dynamics
by: Chen, Zheng-An, et al.
Published: (2025)

Q-LocalAdam: Memory-Efficient Client-Side Adaptive Optimization for Edge Federated Learning
by: Waykole, Vedant, et al.
Published: (2026)

cMPI: Using CXL Memory Sharing for MPI One-Sided and Two-Sided Inter-Node Communications
by: Wang, Xi, et al.
Published: (2025)

Optimal Decentralized Composite Optimization for Convex Functions
by: Ye, Haishan, et al.
Published: (2023)

Why Does Adaptive Zeroth-Order Optimization Work?
by: Ye, Haishan, et al.
Published: (2026)

Stochastic Diagonal Estimation Based on Matrix Quadratic Form Oracles
by: Ye, Haishan, et al.
Published: (2025)

Can a One-Point Feedback Zeroth-order Algorithm Achieve Linear Dimension Dependent Sample Complexity?
by: Ye, Haishan, et al.
Published: (2025)

When Can You Get Away with Low Memory Adam?
by: Kalra, Dayal Singh, et al.
Published: (2025)

Fairness and Efficiency in Two-Sided Matching Markets
by: Jain, Pallavi, et al.
Published: (2025)

AB-Cache: Training-Free Acceleration of Diffusion Models via Adams-Bashforth Cached Feature Reuse
by: Yu, Zichao, et al.
Published: (2025)

Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold
by: Chen, Jun, et al.
Published: (2023)

Stochastic Non-Smooth Non-Convex Optimization with Decision-Dependent Distributions
by: Liu, Chengchang, et al.
Published: (2026)

Investigating Low-Rank Training in Transformer Language Models: Efficiency and Scaling Analysis
by: Wei, Xiuying, et al.
Published: (2024)

Multiomics analyses reveal the mechanisms of the responses of subalpine treeline trees to phenology and winter low‐temperature stress
by: Dongyue Yu, et al.
Published: (2024)