| Main Authors: | Jiao, Siwen, Lv, Tianxiong, Qian, Kangan, Zhao, Chenxu, Zhu, Xiuyuan, Li, Tianlun, Cheng, Xiaolong, Li, Jinyu, Liao, Zhihao, Cai, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.07695 |
Similar Items
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
by: Zhu, Banghua, et al.
Published: (2024)
RLVMR: Reinforcement Learning with Verifiable Meta-Reasoning Rewards for Robust Long-Horizon Agents
by: Zhang, Zijing, et al.
Published: (2025)
Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards
by: He, Haoran, et al.
Published: (2025)
A Novel Hierarchy of Quantum Kernel Networks on Smoothed Particle Hydrodynamics
by: Li, Yudong, et al.
Published: (2026)
DreamSmooth: Improving Model-based Reinforcement Learning via Reward Smoothing
by: Lee, Vint, et al.
Published: (2023)
Protocols for Verifying Smooth Strategies in Bandits and Games
by: Christ, Miranda, et al.
Published: (2025)
DiTAS: Quantizing Diffusion Transformers via Enhanced Activation Smoothing
by: Dong, Zhenyuan, et al.
Published: (2024)
Operator Learning for Smoothing and Forecasting
by: Calvello, Edoardo, et al.
Published: (2026)
TDRM: Smooth Reward Models with Temporal Difference for LLM RL and Inference
by: Zhang, Dan, et al.
Published: (2025)
Actial: Activate Spatial Reasoning Ability of Multimodal Large Language Models
by: Zhan, Xiaoyu, et al.
Published: (2025)
Randomized Smoothing Meets Vision-Language Models
by: Seferis, Emmanouil, et al.
Published: (2025)
Artistic Neural Style Transfer Algorithms with Activation Smoothing
by: Li, Xiangtian, et al.
Published: (2024)
Preconditioning and Reduced-Order Modeling of Navier-Stokes Equations in Complex Porous Microstructures
by: Li, Kangan, et al.
Published: (2025)
SmoothVLA: Aligning Vision-Language-Action Models with Physical Constraints via Intrinsic Smoothness Optimization
by: Li, Jiashun, et al.
Published: (2026)
Smoothness Adaptivity in Constant-Depth Neural Networks: Optimal Rates via Smooth Activations
by: Liu, Yuhao, et al.
Published: (2026)
SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation
by: Zhou, Sashuai, et al.
Published: (2026)
Compiler Bugs Detection in Logic Synthesis Tools via Linear Upper Confidence Bound
by: Zeng, Hui, et al.
Published: (2025)
Pioglitazone Regulates Chondrocyte Metabolism and Attenuates Osteoarthritis by Activating Peroxisome Proliferator‐Activated Receptor Gamma
by: Jiaqi Shi, et al.
Published: (2025)
RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents
by: Wang, Peisong, et al.
Published: (2025)
Unveiling the Compositional Ability Gap in Vision-Language Reasoning Model
by: Li, Tianle, et al.
Published: (2025)
Operator Deep Smoothing for Implied Volatility
by: Wiedemann, Ruben, et al.
Published: (2024)
ASER: Activation Smoothing and Error Reconstruction for Large Language Model Quantization
by: Zhao, Weibo, et al.
Published: (2024)
Video Models Can Reason with Verifiable Rewards
by: Zhu, Tinghui, et al.
Published: (2026)
ABot-OCR Technical Report
by: Jiang, Kaitao, et al.
Published: (2026)
SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models
by: Cheng, An-Chieh, et al.
Published: (2024)
How Cars Move: Analyzing Driving Dynamics for Safer Urban Traffic
by: Qian, Kangan, et al.
Published: (2024)
VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization
by: Cheng, Junhao, et al.
Published: (2026)
Randomness of Shapes and Statistical Inference on Shapes via the Smooth Euler Characteristic Transform
by: Meng, Kun, et al.
Published: (2022)
Regularizing Differentiable Architecture Search with Smooth Activation
by: Zhou, Yanlin, et al.
Published: (2025)
Verifiable Process Rewards for Agentic Reasoning
by: Yuan, Huining, et al.
Published: (2026)
Bridging Smoothness and Approximation: Theoretical Insights into Over-Smoothing in Graph Neural Networks
by: Yang, Guangrui, et al.
Published: (2024)
EvaDrive: Evolutionary Adversarial Policy Optimization for End-to-End Autonomous Driving
by: Jiao, Siwen, et al.
Published: (2025)
Verifying the Smoothness of Graph Signals: A Graph Signal Processing Approach
by: Dabush, Lital, et al.
Published: (2023)
Few-Shot Vision-Language Reasoning for Satellite Imagery via Verifiable Rewards
by: Koksal, Aybora, et al.
Published: (2025)
Partial Smoothness, Subdifferentials and Set-valued Operators
by: Qin, Ziqi, et al.
Published: (2025)
Kernel Smoothing Operators on Thick Open Domains
by: Giannakis, Dimitrios, et al.
Published: (2024)
PSS-BA: LiDAR Bundle Adjustment with Progressive Spatial Smoothing
by: Li, Jianping, et al.
Published: (2024)
Diffeological Smoothness in Hodge Theory
by: Li, Jiayong
Published: (2009)
Smooth Non-Stationary Bandits
by: Jia, Su, et al.
Published: (2023)
Aliasing Reduction in Neural Amp Modeling by Smoothing Activations
by: Sato, Ryota, et al.
Published: (2025)