| Main Authors: | Dang, Sizhe, Shao, Jiaqi, Zheng, Xiaodong, Dai, Guang, Song, Yan, Ye, Haishan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.08007 |
Similar Items
FZOO: Fast Zeroth-Order Optimizer for Fine-Tuning Large Language Models towards Adam-Scale Speed
by: Dang, Sizhe, et al.
Published: (2025)
ESSAM: A Novel Competitive Evolution Strategies Approach to Reinforcement Learning for Memory Efficient LLMs Fine-Tuning
by: Sun, Zhishen, et al.
Published: (2026)
Second-Order Fine-Tuning without Pain for LLMs:A Hessian Informed Zeroth-Order Optimizer
by: Zhao, Yanjun, et al.
Published: (2024)
MSCR: Exploring the Vulnerability of LLMs' Mathematical Reasoning Abilities Using Multi-Source Candidate Replacement
by: Sun, Zhishen, et al.
Published: (2025)
Explicit and Non-asymptotic Query Complexities of Rank-Based Zeroth-order Algorithms on Smooth Functions
by: Ye, Haishan
Published: (2025)
Explicit and Non-asymptotic Query Complexities of Rank-Based Zeroth-order Algorithm on Stochastic Smooth Functions
by: Ye, Haishan
Published: (2025)
Optimal High-Probability Regret for Online Convex Optimization with Two-Point Bandit Feedback
by: Ye, Haishan
Published: (2026)
Breaking the Prompt Wall (I): A Real-World Case Study of Attacking ChatGPT via Lightweight Prompt Injection
by: Chang, Xiangyu, et al.
Published: (2025)
On the Convergence of Single-Loop Stochastic Bilevel Optimization with Approximate Implicit Differentiation
by: Zhou, Yubo, et al.
Published: (2026)
Numerical Sensitivity and Robustness: Exploring the Flaws of Mathematical Reasoning in Large Language Models
by: Sun, Zhishen, et al.
Published: (2025)
Breaking the O(mn)-Time Barrier for Vertex-Weighted Global Minimum Cut
by: Chuzhoy, Julia, et al.
Published: (2025)
Additive One Approximation for Minimum Degree Spanning Tree: Breaking the $O(mn)$ Time Barrier
by: Bhattacharya, Sayan, et al.
Published: (2026)
High-Probability Guarantees for Random Zeroth-Order Gradient Descent on Smooth Functions
by: Ye, Haishan
Published: (2026)
High-Probability Guarantees for Random Zeroth-Order (Stochastic) Gradient Descent
by: Ye, Haishan
Published: (2026)
Riemannian Momentum Tracking: Distributed Optimization with Momentum on Compact Submanifolds
by: Chen, Jun, et al.
Published: (2026)
Zero-Order Sharpness-Aware Minimization
by: Fu, Yao, et al.
Published: (2025)
Near-Optimal Distributed Minimax Optimization under the Second-Order Similarity
by: Zhou, Qihao, et al.
Published: (2024)
Adapprox: Adaptive Approximation in Adam Optimization via Randomized Low-Rank Matrices
by: Zhao, Pengxiang, et al.
Published: (2024)
AB-Training: A Communication-Efficient Approach for Distributed Low-Rank Learning
by: Coquelin, Daniel, et al.
Published: (2024)
Convergence Rate Analysis of the AdamW-Style Shampoo: Unifying One-Sided and Two-Sided Preconditioning
by: Li, Huan, et al.
Published: (2026)
Photocatalytic Radical‐Polar Crossover Enables Modular Access to Bicyclo[2.m.n]Alkane Alcohol Bioisosteres
by: Hui Ran, et al.
Published: (2026)
Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient
by: Di, Hao, et al.
Published: (2024)
FlashSVD: Memory-Efficient Inference with Streaming for Low-Rank Models
by: Shao, Zishan, et al.
Published: (2025)
Low-Communication Resilient Distributed Estimation Algorithm Based on Memory Mechanism
by: Li, Wei, et al.
Published: (2025)
High‐Performance Monolayer 1T‐GeO₂ Transistors with Low‐Resistance Metal Contacts
by: Shuai Lang, et al.
Published: (2025)
Privacy Leaks by Adversaries: Adversarial Iterations for Membership Inference Attack
by: Xue, Jing, et al.
Published: (2025)
From Condensation to Rank Collapse: A Two-Stage Analysis of Transformer Training Dynamics
by: Chen, Zheng-An, et al.
Published: (2025)
Q-LocalAdam: Memory-Efficient Client-Side Adaptive Optimization for Edge Federated Learning
by: Waykole, Vedant, et al.
Published: (2026)
cMPI: Using CXL Memory Sharing for MPI One-Sided and Two-Sided Inter-Node Communications
by: Wang, Xi, et al.
Published: (2025)
Optimal Decentralized Composite Optimization for Convex Functions
by: Ye, Haishan, et al.
Published: (2023)
Why Does Adaptive Zeroth-Order Optimization Work?
by: Ye, Haishan, et al.
Published: (2026)
Stochastic Diagonal Estimation Based on Matrix Quadratic Form Oracles
by: Ye, Haishan, et al.
Published: (2025)
Can a One-Point Feedback Zeroth-order Algorithm Achieve Linear Dimension Dependent Sample Complexity?
by: Ye, Haishan, et al.
Published: (2025)
When Can You Get Away with Low Memory Adam?
by: Kalra, Dayal Singh, et al.
Published: (2025)
Fairness and Efficiency in Two-Sided Matching Markets
by: Jain, Pallavi, et al.
Published: (2025)
AB-Cache: Training-Free Acceleration of Diffusion Models via Adams-Bashforth Cached Feature Reuse
by: Yu, Zichao, et al.
Published: (2025)
Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold
by: Chen, Jun, et al.
Published: (2023)
Stochastic Non-Smooth Non-Convex Optimization with Decision-Dependent Distributions
by: Liu, Chengchang, et al.
Published: (2026)
Investigating Low-Rank Training in Transformer Language Models: Efficiency and Scaling Analysis
by: Wei, Xiuying, et al.
Published: (2024)
Multiomics analyses reveal the mechanisms of the responses of subalpine treeline trees to phenology and winter low‐temperature stress
by: Dongyue Yu, et al.
Published: (2024)