| Main Authors: | McInroe, Trevor; Jelley, Adam; Albrecht, Stefano V.; Storkey, Amos |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2310.05723 |
Similar Items
Efficient Offline Reinforcement Learning: First Imitate, then Improve
by: Jelley, Adam, et al.
Published: (2024)
Object-Centric World Models from Few-Shot Annotations for Sample-Efficient Reinforcement Learning
by: Zhang, Weipu, et al.
Published: (2025)
LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots
by: Han, Dongge, et al.
Published: (2024)
Multi-Horizon Representations with Hierarchical Forward Models for Reinforcement Learning
by: McInroe, Trevor, et al.
Published: (2022)
PixelBrax: Learning Continuous Control from Pixels End-to-End on the GPU
by: McInroe, Trevor, et al.
Published: (2025)
Forgetting is Everywhere
by: Sanati, Ben, et al.
Published: (2025)
Studying the Interplay Between the Actor and Critic Representations in Reinforcement Learning
by: Garcin, Samuel, et al.
Published: (2025)
Enhancing Tactile-based Reinforcement Learning for Robotic Control
by: Miller, Elle, et al.
Published: (2025)
Terra Nova: A Comprehensive Challenge Environment for Intelligent Agents
by: McInroe, Trevor
Published: (2025)
CODA: Coordination via On-Policy Diffusion for Multi-Agent Offline Reinforcement Learning
by: Hedman, Marcel, et al.
Published: (2026)
Assistax: A Hardware-Accelerated Reinforcement Learning Benchmark for Assistive Robotics
by: Hinckeldey, Leonard, et al.
Published: (2025)
Aligning Agents like Large Language Models
by: Jelley, Adam, et al.
Published: (2024)
Chunking: Continual Learning is not just about Distribution Shift
by: Lee, Thomas L., et al.
Published: (2023)
Rationality Measurement and Theory for Reinforcement Learning Agents
by: Qian, Kejiang, et al.
Published: (2026)
Diffusion for World Modeling: Visual Details Matter in Atari
by: Alonso, Eloi, et al.
Published: (2024)
HyperMARL: Adaptive Hypernetworks for Multi-Agent RL
by: Tessera, Kale-ab Abebe, et al.
Published: (2024)
roto 2.0: The Robot Tactile Olympiad
by: Miller, Elle, et al.
Published: (2026)
Label Noise: Correcting the Forward-Correction
by: Toner, William, et al.
Published: (2023)
Noisy Early Stopping for Noisy Labels
by: Toner, William, et al.
Published: (2024)
Approximate Bayesian Class-Conditional Models under Continuous Representation Shift
by: Lee, Thomas L., et al.
Published: (2023)
Adversarial robustness of VAEs through the lens of local geometry
by: Khan, Asif, et al.
Published: (2022)
Using Offline Data to Speed Up Reinforcement Learning in Procedurally Generated Environments
by: Andres, Alain, et al.
Published: (2023)
Adapting Time Series Foundation Models through Data Mixtures
by: Lee, Thomas L., et al.
Published: (2026)
Meta-Adaptive Prompt Distillation for Few-Shot Visual Question Answering
by: Gupta, Akash, et al.
Published: (2025)
Diffusion Policies for Out-of-Distribution Generalization in Offline Reinforcement Learning
by: Ada, Suzan Ece, et al.
Published: (2023)
Few-Shot Learning with Class Imbalance
by: Ochal, Mateusz, et al.
Published: (2021)
Online Pre-Training for Offline-to-Online Reinforcement Learning
by: Shin, Yongjae, et al.
Published: (2025)
Multi-view Disentanglement for Reinforcement Learning with Multiple Cameras
by: Dunion, Mhairi, et al.
Published: (2024)
Hyperparameter Selection in Continual Learning
by: Lee, Thomas L., et al.
Published: (2024)
Information-Directed Offline-to-Online Reinforcement Learning
by: Chen, Keru
Published: (2026)
Model-Free Robust $ϕ$-Divergence Reinforcement Learning Using Both Offline and Online Data
by: Panaganti, Kishan, et al.
Published: (2024)
Beyond Penalization: Diffusion-based Out-of-Distribution Detection and Selective Regularization in Offline Reinforcement Learning
by: Wang, Qingjun, et al.
Published: (2026)
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
by: Hu, Hao, et al.
Published: (2024)
Online Optimization for Offline Safe Reinforcement Learning
by: Chemingui, Yassine, et al.
Published: (2025)
The Three Regimes of Offline-to-Online Reinforcement Learning
by: Li, Lu, et al.
Published: (2025)
Prior-Guided Diffusion Planning for Offline Reinforcement Learning
by: Ki, Donghyeon, et al.
Published: (2025)
Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning
by: Wang, Qi, et al.
Published: (2023)
Adaptive Q-Chunking for Offline-to-Online Reinforcement Learning
by: Gireesh, Nandiraju, et al.
Published: (2026)
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
by: Liu, Xu-Hui, et al.
Published: (2024)
Active Advantage-Aligned Online Reinforcement Learning with Offline Data
by: Liu, Xuefeng, et al.
Published: (2025)