:: Library Catalog

Bibliographic Details
Main authors:	Wu, Qingyuan; Wang, Yuhui; Zhan, Simon Sinong; Wang, Yixuan; Lin, Chung-Wei; Lv, Chen; Zhu, Qi; Schmidhuber, Jürgen; Huang, Chao
Type:	Preprint
Published:	2025
Subjects:	Machine Learning
Online access:	https://arxiv.org/abs/2505.00546

Similar Items

Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays
by: Wu, Qingyuan, et al.
Published: (2024)

Variational Delayed Policy Optimization
by: Wu, Qingyuan, et al.
Published: (2024)

Belief-Based Offline Reinforcement Learning for Delay-Robust Policy Optimization
by: Zhan, Simon Sinong, et al.
Published: (2025)

Inverse Delayed Reinforcement Learning
by: Zhan, Simon Sinong, et al.
Published: (2024)

Enhancing Inverse Reinforcement Learning through Encoding Dynamic Information in Reward Shaping
by: Zhan, Simon Sinong, et al.
Published: (2024)

Case Study: Runtime Safety Verification of Neural Network Controlled System
by: Yang, Frank, et al.
Published: (2024)

A Unified Framework for Rethinking Policy Divergence Measures in GRPO
by: Wu, Qingyuan, et al.
Published: (2026)

Highway Reinforcement Learning
by: Wang, Yuhui, et al.
Published: (2024)

Highway Value Iteration Networks
by: Wang, Yuhui, et al.
Published: (2024)

Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning
by: Wang, Yuhui, et al.
Published: (2024)

Empowering Autonomous Driving with Large Language Models: A Safety Perspective
by: Wang, Yixuan, et al.
Published: (2023)

Token Buncher: Shielding LLMs from Harmful Reinforcement Learning Fine-Tuning
by: Feng, Weitao, et al.
Published: (2025)

Efficient Morphology-Control Co-Design via Stackelberg Proximal Policy Optimization
by: Dai, Yanning, et al.
Published: (2026)

Learning Online Belief Prediction for Efficient POMDP Planning in Autonomous Driving
by: Huang, Zhiyu, et al.
Published: (2024)

Sequence Compression Speeds Up Credit Assignment in Reinforcement Learning
by: Ramesh, Aditya A., et al.
Published: (2024)

STEMO: Early Spatio-temporal Forecasting with Multi-Objective Reinforcement Learning
by: Shao, Wei, et al.
Published: (2024)

Switching Controller Synthesis for Hybrid Systems Against STL Formulas
by: Su, Han, et al.
Published: (2024)

Noncooperative Game in Multi-controller System under Delayed and Asymmetric Information
by: Li, Xin, et al.
Published: (2026)

Planning to Explore: Curiosity-Driven Planning for LLM Test Generation
by: Amayuelas, Alfonso, et al.
Published: (2026)

Upside Down Reinforcement Learning with Policy Generators
by: Di Ventura, Jacopo, et al.
Published: (2025)

Metalearning Continual Learning Algorithms
by: Irie, Kazuki, et al.
Published: (2023)

Annotated History of Modern AI and Deep Learning
by: Schmidhuber, Juergen
Published: (2022)

CreFlow: Corrective Reflow for Sparse-Reward Embodied Video Diffusion RL
by: Ni, Zhenyang, et al.
Published: (2026)

Deep Learning: Our Miraculous Year 1990-1991
by: Schmidhuber, Juergen
Published: (2020)

Exploring the Promise and Limits of Real-Time Recurrent Learning
by: Irie, Kazuki, et al.
Published: (2023)

Interestingness as an Inductive Heuristic for Future Compression Progress
by: Herrmann, Vincent, et al.
Published: (2026)

Learning Useful Representations of Recurrent Neural Network Weight Matrices
by: Herrmann, Vincent, et al.
Published: (2024)

Self-Organising Neural Discrete Representation Learning à la Kohonen
by: Irie, Kazuki, et al.
Published: (2023)

Fairness Overfitting in Machine Learning: An Information-Theoretic Perspective
by: Laakom, Firas, et al.
Published: (2025)

Learning to Forget: Continual Learning with Adaptive Weight Decay
by: Ramesh, Aditya A., et al.
Published: (2026)

Kinematics-aware Trajectory Generation and Prediction with Latent Stochastic Differential Modeling
by: Jiao, Ruochen, et al.
Published: (2023)

When Self-Belief Misleads: Active Label Acquisition for Reinforcement Learning with Verifiable Rewards
by: Wang, Li, et al.
Published: (2026)

Multi-Agent Pointer Transformer: Seq-to-Seq Reinforcement Learning for Multi-Vehicle Dynamic Pickup-Delivery Problems
by: Zou, Zengyu, et al.
Published: (2025)

Advancing Autonomous VLM Agents via Variational Subgoal-Conditioned Reinforcement Learning
by: Wu, Qingyuan, et al.
Published: (2025)

Multi-Path Collaborative Reasoning via Reinforcement Learning
by: Lv, Jindi, et al.
Published: (2025)

Handling Delay in Real-Time Reinforcement Learning
by: Anokhin, Ivan, et al.
Published: (2025)

Shedding Light on VLN Robustness: A Black-box Framework for Indoor Lighting-based Adversarial Attack
by: Li, Chenyang, et al.
Published: (2025)

On the Convergence and Stability of Upside-Down Reinforcement Learning, Goal-Conditioned Supervised Learning, and Online Decision Transformers
by: Štrupl, Miroslav, et al.
Published: (2025)

RLTP: Reinforcement Learning to Pace for Delayed Impression Modeling in Preloaded Ads
by: Wei, Penghui, et al.
Published: (2023)

Phonetic and Lexical Discovery of a Canine Language using HuBERT
by: Li, Xingyuan, et al.
Published: (2024)