:: Library Catalog

Bibliographic Details
Main Authors:	Jiang, Yuxuan; Zhou, Ziming; Xu, Boyu; Liu, Beijie; Xu, Runhui; Huang, Peng
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2506.14813

Similar Items

Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning
by: Xu, Yuanda, et al.
Published: (2026)

A Triple-Inertial Accelerated Alternating Optimization Method for Deep Learning Training
by: Yan, Chengcheng, et al.
Published: (2025)

Training Proactive and Personalized LLM Agents
by: Sun, Weiwei, et al.
Published: (2025)

IntentRL: Training Proactive User-intent Agents for Open-ended Deep Research via Reinforcement Learning
by: Luo, Haohao, et al.
Published: (2026)

DiffuSpeech: Silent Thought, Spoken Answer via Unified Speech-Text Diffusion
by: Lou, Yuxuan, et al.
Published: (2026)

Learning from Complexity: Exploring Dynamic Sample Pruning of Spatio-Temporal Training
by: Chen, Wei, et al.
Published: (2026)

Silent Neuron Theory and Plasticity Preservation for Deep Reinforcement Learning in Adaptive Video Streaming
by: He, Zhiqiang, et al.
Published: (2025)

Approximated Likelihood Ratio: A Forward-Only and Parallel Framework for Boosting Neural Network Training
by: Zhang, Zeliang, et al.
Published: (2024)

Segmental Advantage Estimation: Enhancing PPO for Long-Context LLM Training
by: Gong, Xue, et al.
Published: (2026)

Efficient Deep Learning Board: Training Feedback Is Not All You Need
by: Gong, Lina, et al.
Published: (2024)

Android Coach: Improve Online Agentic Training Efficiency with Single State Multiple Actions
by: Gan, Guo, et al.
Published: (2026)

Reinforcement Learning on Pre-Training Data
by: Li, Siheng, et al.
Published: (2025)

Neural Thermodynamic Laws for Large Language Model Training
by: Liu, Ziming, et al.
Published: (2025)

CAdam: Confidence-Based Optimization for Online Learning
by: Wang, Shaowen, et al.
Published: (2024)

Proactive Gradient Conflict Mitigation in Multi-Task Learning: A Sparse Training Perspective
by: Zhang, Zhi, et al.
Published: (2024)

Low-redundancy Distillation for Continual Learning
by: Liu, RuiQi, et al.
Published: (2023)

Tools Fail: Detecting Silent Errors in Faulty Tools
by: Sun, Jimin, et al.
Published: (2024)

Z-Error Loss for Training Neural Networks
by: Godin, Guillaume
Published: (2025)

A Sensitivity-Driven Expert Allocation Method in LoRA-MoE for Efficient Fine-Tuning
by: Xu, Junzhou, et al.
Published: (2025)

Native Fortran Implementation of TensorFlow-Trained Deep and Bayesian Neural Networks
by: Furlong, Aidan, et al.
Published: (2025)

Towards Faster Training of Diffusion Models: An Inspiration of A Consistency Phenomenon
by: Xu, Tianshuo, et al.
Published: (2024)

Beyond Squared Error: Exploring Loss Design for Enhanced Training of Generative Flow Networks
by: Hu, Rui, et al.
Published: (2024)

On the Interplay Between Sparsity and Training in Deep Reinforcement Learning
by: Davelouis, Fatima, et al.
Published: (2025)

Efficient Multi-Task Modeling through Automated Fusion of Trained Models
by: Zhou, Jingxuan, et al.
Published: (2025)

To Train or Not to Train: Balancing Efficiency and Training Cost in Deep Reinforcement Learning for Mobile Edge Computing
by: Boscaro, Maddalena, et al.
Published: (2024)

CodeScaler: Scaling Code LLM Training and Test-Time Inference via Reward Models
by: Zhu, Xiao, et al.
Published: (2026)

Preparing Lessons for Progressive Training on Language Models
by: Pan, Yu, et al.
Published: (2024)

Symmetry-Aware Transformer Training for Automated Planning
by: Fritzsche, Markus, et al.
Published: (2025)

ToolACE-R: Model-aware Iterative Training and Adaptive Refinement for Tool Learning
by: Zeng, Xingshan, et al.
Published: (2025)

TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning
by: Zhang, Junru, et al.
Published: (2025)

ProRL: Effective Reinforcement Learning for Proactive Recommendation via Rectified Policy Gradient Estimation
by: Hou, Hongru, et al.
Published: (2026)

Open-World Test-Time Training: Self-Training with Contrast Learning
by: Su, Houcheng, et al.
Published: (2024)

POINT$^{2}$: A Polymer Informatics Training and Testing Database
by: Xu, Jiaxin, et al.
Published: (2025)

The Perils of Optimizing Learned Reward Functions: Low Training Error Does Not Guarantee Low Regret
by: Fluri, Lukas, et al.
Published: (2024)

Enhancing Deep Learning with Optimized Gradient Descent: Bridging Numerical Methods and Neural Network Training
by: Ma, Yuhan, et al.
Published: (2024)

Outlier Gradient Analysis: Efficiently Identifying Detrimental Training Samples for Deep Learning Models
by: Chhabra, Anshuman, et al.
Published: (2024)

Online Training and Pruning of Deep Reinforcement Learning Networks
by: Guenter, Valentin Frank Ingmar, et al.
Published: (2025)

Exploring Dynamic Properties of Backdoor Training Through Information Bottleneck
by: Liu, Xinyu, et al.
Published: (2025)

Sparse Training for Federated Learning with Regularized Error Correction
by: Greidi, Ran, et al.
Published: (2023)

Training Large Language Models to Reason via EM Policy Gradient
by: Xu, Tianbing
Published: (2025)