| Main Authors: | Jiang, Yuxuan, Zhou, Ziming, Xu, Boyu, Liu, Beijie, Xu, Runhui, Huang, Peng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.14813 |
Similar Items
Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning
by: Xu, Yuanda, et al.
Published: (2026)
A Triple-Inertial Accelerated Alternating Optimization Method for Deep Learning Training
by: Yan, Chengcheng, et al.
Published: (2025)
Training Proactive and Personalized LLM Agents
by: Sun, Weiwei, et al.
Published: (2025)
IntentRL: Training Proactive User-intent Agents for Open-ended Deep Research via Reinforcement Learning
by: Luo, Haohao, et al.
Published: (2026)
DiffuSpeech: Silent Thought, Spoken Answer via Unified Speech-Text Diffusion
by: Lou, Yuxuan, et al.
Published: (2026)
Learning from Complexity: Exploring Dynamic Sample Pruning of Spatio-Temporal Training
by: Chen, Wei, et al.
Published: (2026)
Silent Neuron Theory and Plasticity Preservation for Deep Reinforcement Learning in Adaptive Video Streaming
by: He, Zhiqiang, et al.
Published: (2025)
Approximated Likelihood Ratio: A Forward-Only and Parallel Framework for Boosting Neural Network Training
by: Zhang, Zeliang, et al.
Published: (2024)
Segmental Advantage Estimation: Enhancing PPO for Long-Context LLM Training
by: Gong, Xue, et al.
Published: (2026)
Efficient Deep Learning Board: Training Feedback Is Not All You Need
by: Gong, Lina, et al.
Published: (2024)
Android Coach: Improve Online Agentic Training Efficiency with Single State Multiple Actions
by: Gan, Guo, et al.
Published: (2026)
Reinforcement Learning on Pre-Training Data
by: Li, Siheng, et al.
Published: (2025)
Neural Thermodynamic Laws for Large Language Model Training
by: Liu, Ziming, et al.
Published: (2025)
CAdam: Confidence-Based Optimization for Online Learning
by: Wang, Shaowen, et al.
Published: (2024)
Proactive Gradient Conflict Mitigation in Multi-Task Learning: A Sparse Training Perspective
by: Zhang, Zhi, et al.
Published: (2024)
Low-redundancy Distillation for Continual Learning
by: Liu, RuiQi, et al.
Published: (2023)
Tools Fail: Detecting Silent Errors in Faulty Tools
by: Sun, Jimin, et al.
Published: (2024)
Z-Error Loss for Training Neural Networks
by: Godin, Guillaume
Published: (2025)
A Sensitivity-Driven Expert Allocation Method in LoRA-MoE for Efficient Fine-Tuning
by: Xu, Junzhou, et al.
Published: (2025)
Native Fortran Implementation of TensorFlow-Trained Deep and Bayesian Neural Networks
by: Furlong, Aidan, et al.
Published: (2025)
Towards Faster Training of Diffusion Models: An Inspiration of A Consistency Phenomenon
by: Xu, Tianshuo, et al.
Published: (2024)
Beyond Squared Error: Exploring Loss Design for Enhanced Training of Generative Flow Networks
by: Hu, Rui, et al.
Published: (2024)
On the Interplay Between Sparsity and Training in Deep Reinforcement Learning
by: Davelouis, Fatima, et al.
Published: (2025)
Efficient Multi-Task Modeling through Automated Fusion of Trained Models
by: Zhou, Jingxuan, et al.
Published: (2025)
To Train or Not to Train: Balancing Efficiency and Training Cost in Deep Reinforcement Learning for Mobile Edge Computing
by: Boscaro, Maddalena, et al.
Published: (2024)
CodeScaler: Scaling Code LLM Training and Test-Time Inference via Reward Models
by: Zhu, Xiao, et al.
Published: (2026)
Preparing Lessons for Progressive Training on Language Models
by: Pan, Yu, et al.
Published: (2024)
Symmetry-Aware Transformer Training for Automated Planning
by: Fritzsche, Markus, et al.
Published: (2025)
ToolACE-R: Model-aware Iterative Training and Adaptive Refinement for Tool Learning
by: Zeng, Xingshan, et al.
Published: (2025)
TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning
by: Zhang, Junru, et al.
Published: (2025)
ProRL: Effective Reinforcement Learning for Proactive Recommendation via Rectified Policy Gradient Estimation
by: Hou, Hongru, et al.
Published: (2026)
Open-World Test-Time Training: Self-Training with Contrast Learning
by: Su, Houcheng, et al.
Published: (2024)
POINT$^{2}$: A Polymer Informatics Training and Testing Database
by: Xu, Jiaxin, et al.
Published: (2025)
The Perils of Optimizing Learned Reward Functions: Low Training Error Does Not Guarantee Low Regret
by: Fluri, Lukas, et al.
Published: (2024)
Enhancing Deep Learning with Optimized Gradient Descent: Bridging Numerical Methods and Neural Network Training
by: Ma, Yuhan, et al.
Published: (2024)
Outlier Gradient Analysis: Efficiently Identifying Detrimental Training Samples for Deep Learning Models
by: Chhabra, Anshuman, et al.
Published: (2024)
Online Training and Pruning of Deep Reinforcement Learning Networks
by: Guenter, Valentin Frank Ingmar, et al.
Published: (2025)
Exploring Dynamic Properties of Backdoor Training Through Information Bottleneck
by: Liu, Xinyu, et al.
Published: (2025)
Sparse Training for Federated Learning with Regularized Error Correction
by: Greidi, Ran, et al.
Published: (2023)
Training Large Language Models to Reason via EM Policy Gradient
by: Xu, Tianbing
Published: (2025)