| Main Authors: | Yan, Renye, Gan, Yaozhong, Wu, You, Xing, Junliang, Liang, Ling, Zhu, Yeshang, Cai, Yimao |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.04498 |
Similar Items
The Exploration-Exploitation Dilemma Revisited: An Entropy Perspective
by: Yan, Renye, et al.
Published: (2024)
Reflective Policy Optimization
by: Gan, Yaozhong, et al.
Published: (2024)
Transductive Off-policy Proximal Policy Optimization
by: Gan, Yaozhong, et al.
Published: (2024)
MARPO: A Reflective Policy Optimization for Multi Agent Reinforcement Learning
by: Wu, Cuiling, et al.
Published: (2025)
Do Less, Achieve More: Do We Need Every-Step Optimization for RL Fine-tuning of Diffusion Models?
by: Yan, Renye, et al.
Published: (2026)
Memento 2: Learning by Stateful Reflective Memory
by: Wang, Jun
Published: (2025)
A Sanity Check for Multi-In-Domain Face Forgery Detection in the Real World
by: Cheng, Jikang, et al.
Published: (2025)
AdaLomo: Low-memory Optimization with Adaptive Learning Rate
by: Lv, Kai, et al.
Published: (2023)
Synergizing Reinforcement Learning and Genetic Algorithms for Neural Combinatorial Optimization
by: Gu, Shengda, et al.
Published: (2025)
AdaMuon: Adaptive Muon Optimizer
by: Si, Chongjie, et al.
Published: (2025)
Efficient Multi-Task Reinforcement Learning with Cross-Task Policy Guidance
by: He, Jinmin, et al.
Published: (2025)
AdaCubic: An Adaptive Cubic Regularization Optimizer for Deep Learning
by: Tsingalis, Ioannis, et al.
Published: (2026)
AdaFlow: Imitation Learning with Variance-Adaptive Flow-Based Policies
by: Hu, Xixi, et al.
Published: (2024)
AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning
by: Lou, Chenwei, et al.
Published: (2025)
AdaTKG: Adaptive Memory for Temporal Knowledge Graph Reasoning
by: Lee, Seunghan, et al.
Published: (2026)
AdaMuS: Adaptive Multi-view Sparsity Learning for Dimensionally Unbalanced Data
by: Xu, Cai, et al.
Published: (2026)
AdaFRUGAL: Adaptive Memory-Efficient Training with Dynamic Control
by: Bui, Quang-Hung, et al.
Published: (2025)
Memento-Skills: Let Agents Design Agents
by: Zhou, Huichi, et al.
Published: (2026)
BiBLDR: Bidirectional Behavior Learning for Drug Repositioning
by: Zhang, Renye, et al.
Published: (2025)
Inner-Probe: Discovering Copyright-related Data Generation in LLM Architecture
by: Ma, Qichao, et al.
Published: (2024)
Memento: Fine-tuning LLM Agents without Fine-tuning LLMs
by: Zhou, Huichi, et al.
Published: (2025)
AdaCL: Adaptive Continual Learning
by: Yildirim, Elif Ceren Gok, et al.
Published: (2023)
Ada-MSHyper: Adaptive Multi-Scale Hypergraph Transformer for Time Series Forecasting
by: Shang, Zongjiang, et al.
Published: (2024)
AdaKernel: Learning Adaptive Kernel Parameters for Spatiotemporal Graph Neural Networks
by: Zhang, Zhongyue, et al.
Published: (2026)
AdaMeZO: Adam-style Zeroth-Order Optimizer for LLM Fine-tuning Without Maintaining the Moments
by: Cai, Zhijie, et al.
Published: (2026)
AdaCuRL: Adaptive Curriculum Reinforcement Learning with Invalid Sample Mitigation and Historical Revisiting
by: Li, Renda, et al.
Published: (2025)
AdaFair-MARL: Enforcing Adaptive Fairness Constraints in Multi-Agent Reinforcement Learning
by: Ekpo, Promise, et al.
Published: (2025)
Temperature as a Meta-Policy: Adaptive Temperature in LLM Reinforcement Learning
by: Dang, Haoran, et al.
Published: (2026)
BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping
by: Xi, Zhiheng, et al.
Published: (2025)
MementoGUI: Learning Agentic Multimodal Memory Control for Long-Horizon GUI Agents
by: Zeng, Ziyun, et al.
Published: (2026)
Network Topology Optimization via Deep Reinforcement Learning
by: Li, Zhuoran, et al.
Published: (2022)
AdaFisher: Adaptive Second Order Optimization via Fisher Information
by: Gomes, Damien Martins, et al.
Published: (2024)
Belief-Based Offline Reinforcement Learning for Delay-Robust Policy Optimization
by: Zhan, Simon Sinong, et al.
Published: (2025)
MHPO: Modulated Hazard-aware Policy Optimization for Stable Reinforcement Learning
by: Wang, Hongjun, et al.
Published: (2026)
Reinforce-Ada: An Adaptive Sampling Framework under Non-linear RL Objectives
by: Xiong, Wei, et al.
Published: (2025)
AceGRPO: Adaptive Curriculum Enhanced Group Relative Policy Optimization for Autonomous Machine Learning Engineering
by: Cai, Yuzhu, et al.
Published: (2026)
AdaWorld: Learning Adaptable World Models with Latent Actions
by: Gao, Shenyuan, et al.
Published: (2025)
On the Reuse Bias in Off-Policy Reinforcement Learning
by: Ying, Chengyang, et al.
Published: (2022)
AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning
by: Refael, Yehonathan, et al.
Published: (2024)
Adaptive Policy Synchronization for Scalable Reinforcement Learning
by: Lafuente-Mercado, Rodney
Published: (2025)