| Main author: | Park, Jongchan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2605.21557 |
Related records
Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning
by: Park, Jongchan, et al.
Published: (2025)
Graph-Assisted Stitching for Offline Hierarchical Reinforcement Learning
by: Baek, Seungho, et al.
Published: (2025)
Scaling Off-Policy Reinforcement Learning with Batch and Weight Normalization
by: Palenicek, Daniel, et al.
Published: (2025)
Adaptive Policy Synchronization for Scalable Reinforcement Learning
by: Lafuente-Mercado, Rodney
Published: (2025)
Iterative Batch Reinforcement Learning via Safe Diversified Model-based Policy Search
by: Najib, Amna, et al.
Published: (2024)
Sample-Efficiency in Multi-Batch Reinforcement Learning: The Need for Dimension-Dependent Adaptivity
by: Johnson, Emmeran, et al.
Published: (2023)
Trust the Batch, On- or Off-Policy: Adaptive Policy Optimization for RL Post-Training
by: Fakoor, Rasool, et al.
Published: (2026)
Adaptive Policy Backbone via Shared Network
by: Park, Bumgeun, et al.
Published: (2025)
Q-Policy: Quantum-Enhanced Policy Evaluation for Scalable Reinforcement Learning
by: Cherukuri, Kalyan, et al.
Published: (2025)
Towards Batch-to-Streaming Deep Reinforcement Learning for Continuous Control
by: De Monte, Riccardo, et al.
Published: (2026)
Offline Policy Evaluation for Reinforcement Learning with Adaptively Collected Data
by: Madhow, Sunil, et al.
Published: (2023)
Constraint-Adaptive Policy Switching for Offline Safe Reinforcement Learning
by: Chemingui, Yassine, et al.
Published: (2024)
Adaptive Replay Buffer for Offline-to-Online Reinforcement Learning
by: Song, Chihyeon, et al.
Published: (2025)
Adaptive Preference Scaling for Reinforcement Learning with Human Feedback
by: Hong, Ilgee, et al.
Published: (2024)
Batched Contextual Reinforcement: A Task-Scaling Law for Efficient Reasoning
by: Yang, Bangji, et al.
Published: (2026)
Rethinking Policy Diversity in Ensemble Policy Gradient in Large-Scale Reinforcement Learning
by: Shitanda, Naoki, et al.
Published: (2026)
Development and Validation of Heparin Dosing Policies Using an Offline Reinforcement Learning Algorithm
by: Lim, Yooseok, et al.
Published: (2024)
BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping
by: Xi, Zhiheng, et al.
Published: (2025)
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
by: Liu, Tenglong, et al.
Published: (2024)
Adaptive Batch Sizes Using Non-Euclidean Gradient Noise Scales for Stochastic Sign and Spectral Descent
by: Naganuma, Hiroki, et al.
Published: (2026)
Reinforcement Learning with Curriculum-inspired Adaptive Direct Policy Guidance for Truck Dispatching
by: Meng, Shi, et al.
Published: (2025)
Explainable Reinforcement Learning via Temporal Policy Decomposition
by: Ruggeri, Franco, et al.
Published: (2025)
Batch Bayesian Active Learning with Partial Batch Label Sampling
by: Hu, Kangping, et al.
Published: (2025)
Scalable Multi-Objective and Meta Reinforcement Learning via Gradient Estimation
by: Zhang, Zhenshuo, et al.
Published: (2025)
Reasoner for Real-World Event Detection: Scaling Reinforcement Learning via Adaptive Perplexity-Aware Sampling Strategy
by: Zhang, Xiaoyun, et al.
Published: (2025)
Emergence of Exploration in Policy Gradient Reinforcement Learning via Retrying
by: Nishimori, Soichiro, et al.
Published: (2026)
HALO: Hierarchical Reinforcement Learning for Large-Scale Adaptive Traffic Signal Control
by: Zhu, Yaqiao, et al.
Published: (2025)
Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity
by: Shrestha, Susav, et al.
Published: (2025)
Imitating Language via Scalable Inverse Reinforcement Learning
by: Wulfmeier, Markus, et al.
Published: (2024)
Demystifying Design Choices of Reinforcement Fine-tuning: A Batched Contextual Bandit Learning Perspective
by: Xie, Hong, et al.
Published: (2026)
Skill Learning via Policy Diversity Yields Identifiable Representations for Reinforcement Learning
by: Reizinger, Patrik, et al.
Published: (2025)
Adaptive Policy Selection and Fine-Tuning under Interaction Budgets for Offline-to-Online Reinforcement Learning
by: Bozkurt, Alper Kamil, et al.
Published: (2026)
Adaptive Batch-Wise Sample Scheduling for Direct Preference Optimization
by: Huang, Zixuan, et al.
Published: (2025)
Deep Reinforcement Learning with Task-Adaptive Retrieval via Hypernetwork
by: Jin, Yonggang, et al.
Published: (2023)
Prism: Policy Reuse via Interpretable Strategy Mapping in Reinforcement Learning
by: Pravetz, Thomas
Published: (2026)
Towards Fast Safe Online Reinforcement Learning via Policy Finetuning
by: Chen, Keru, et al.
Published: (2024)
Efficient On-Policy Reinforcement Learning via Exploration of Sparse Parameter Space
by: Zhang, Xinyu, et al.
Published: (2025)
Overcoming Overfitting in Reinforcement Learning via Gaussian Process Diffusion Policy
by: Horprasert, Amornyos, et al.
Published: (2025)
SPECTra: Scalable Multi-Agent Reinforcement Learning with Permutation-Free Networks
by: Park, Hyunwoo, et al.
Published: (2025)
StableGrad: Backward Scale Control without Batch Normalization
by: Mestre, Jose I., et al.
Published: (2026)