| Main Authors: | Sinha, Amit, Geist, Matthieu, Mahajan, Aditya |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2407.06121 |
Similar Items
Convergence of regularized agent-state-based Q-learning in POMDPs
by: Sinha, Amit, et al.
Published: (2025)
Risk-seeking conservative policy iteration with agent-state based policies for Dec-POMDPs with guaranteed convergence
by: Sinha, Amit, et al.
Published: (2026)
Agent-state based policies in POMDPs: Beyond belief-state MDPs
by: Sinha, Amit, et al.
Published: (2024)
Multi-agent imitation learning with function approximation: Linear Markov games and beyond
by: Viano, Luca, et al.
Published: (2026)
Towards Minimax Optimality of Model-based Robust Reinforcement Learning
by: Clavier, Pierre, et al.
Published: (2023)
Dynamical-VAE-based Hindsight to Learn the Causal Dynamics of Factored-POMDPs
by: Han, Chao, et al.
Published: (2024)
Solving robust MDPs as a sequence of static RL problems
by: Zouitine, Adil, et al.
Published: (2024)
Rate optimal learning of equilibria from data
by: Freihaut, Till, et al.
Published: (2025)
Closing the Gap between TD Learning and Supervised Learning -- A Generalisation Point of View
by: Ghugare, Raj, et al.
Published: (2024)
Soft $Q(λ)$: A multi-step off-policy method for entropy regularised reinforcement learning using eligibility traces
by: Mahajan, Pranav, et al.
Published: (2026)
RRLS : Robust Reinforcement Learning Suite
by: Zouitine, Adil, et al.
Published: (2024)
Time-Constrained Robust MDPs
by: Zouitine, Adil, et al.
Published: (2024)
Learning Equilibria from Data: Provably Efficient Multi-Agent Imitation Learning
by: Freihaut, Till, et al.
Published: (2025)
ShiQ: Bringing back Bellman to LLMs
by: Clavier, Pierre, et al.
Published: (2025)
Bootstrapping Expectiles in Reinforcement Learning
by: Clavier, Pierre, et al.
Published: (2024)
Leveraging Procedural Generation for Learning Autonomous Peg-in-Hole Assembly in Space
by: Orsula, Andrej, et al.
Published: (2024)
Space Robotics Bench: Robot Learning Beyond Earth
by: Orsula, Andrej, et al.
Published: (2025)
Learning Tool-Aware Adaptive Compliant Control for Autonomous Regolith Excavation
by: Orsula, Andrej, et al.
Published: (2025)
Sim2Dust: Mastering Dynamic Waypoint Tracking on Granular Media
by: Orsula, Andrej, et al.
Published: (2025)
A Theoretical Justification for Asymmetric Actor-Critic Algorithms
by: Lambrechts, Gaspard, et al.
Published: (2025)
BanditQ: Fair Bandits with Guaranteed Rewards
by: Sinha, Abhishek
Published: (2023)
A Q-learning Approach for Adherence-Aware Recommendations
by: Faros, Ioannis, et al.
Published: (2023)
Memoryless Policy Iteration for Episodic POMDPs
by: van Zuijlen, Roy, et al.
Published: (2025)
Rethinking Transformers in Solving POMDPs
by: Lu, Chenhao, et al.
Published: (2024)
Self-Improving Robust Preference Optimization
by: Choi, Eugene, et al.
Published: (2024)
Understanding Likelihood Over-optimisation in Direct Alignment Algorithms
by: Shi, Zhengyan, et al.
Published: (2024)
Perception-Based Beliefs for POMDPs with Visual Observations
by: Schäfers, Miriam, et al.
Published: (2026)
Recurrent Natural Policy Gradient for POMDPs
by: Cayci, Semih, et al.
Published: (2024)
Online Planning in POMDPs with State-Requests
by: Avalos, Raphael, et al.
Published: (2024)
Deep Q-Network (DQN) multi-agent reinforcement learning (MARL) for Stock Trading
by: Tidwell, John Christopher, et al.
Published: (2025)
Population-aware Online Mirror Descent for Mean-Field Games with Common Noise by Deep Reinforcement Learning
by: Wu, Zida, et al.
Published: (2025)
Concentration of Cumulative Reward in Markov Decision Processes
by: Sayedana, Borna, et al.
Published: (2024)
Scaling Internal-State Policy-Gradient Methods for POMDPs
by: Aberdeen, Douglas, et al.
Published: (2025)
Bench-MFG: A Benchmark Suite for Learning in Stationary Mean Field Games
by: Magnino, Lorenzo, et al.
Published: (2026)
Periodic Regularized Q-Learning
by: Yang, Hyukjun, et al.
Published: (2026)
Pseudo-rigid body networks: learning interpretable deformable object dynamics from partial observations
by: Mamedov, Shamil, et al.
Published: (2023)
Posterior Sampling-based Online Learning for Episodic POMDPs
by: Tang, Dengwang, et al.
Published: (2023)
Pessimistic Iterative Planning with RNNs for Robust POMDPs
by: Galesloot, Maris F. L., et al.
Published: (2024)
Scalable Policy-Based RL Algorithms for POMDPs
by: Anjarlekar, Ameya, et al.
Published: (2025)
Approximate Control for Continuous-Time POMDPs
by: Eich, Yannick, et al.
Published: (2024)