| Main Authors: | Diekhoff, Jan, Fischer, Jörn |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2404.15822 |
Similar Items
Backward Learning for Goal-Conditioned Policies
by: Höftmann, Marc, et al.
Published: (2023)
LoopQ: Quantization for Recursive Transformers
by: Fang, Rui, et al.
Published: (2026)
Q-MMR: Off-Policy Evaluation via Recursive Reweighting and Moment Matching
by: Li, Xiang, et al.
Published: (2026)
Working Backwards: Learning to Place by Picking
by: Limoyo, Oliver, et al.
Published: (2023)
Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions
by: Jain, Ayush, et al.
Published: (2024)
Thinking Forward and Backward: Effective Backward Planning with Large Language Models
by: Ren, Allen Z., et al.
Published: (2024)
NeoPhysIx: An Ultra Fast 3D Physical Simulator as Development Tool for AI Algorithms
by: Fischer, Jörn, et al.
Published: (2024)
Recursive Deep Inverse Reinforcement Learning
by: Ghanem, Paul, et al.
Published: (2025)
FractalBench: Diagnosing Visual-Mathematical Reasoning Through Recursive Program Synthesis
by: Ondras, Jan, et al.
Published: (2025)
Your Learned Constraint is Secretly a Backward Reachable Tube
by: Qadri, Mohamad, et al.
Published: (2025)
Using Forwards-Backwards Models to Approximate MDP Homomorphisms
by: Mavor-Parker, Augustine N., et al.
Published: (2022)
GPT, But Backwards: Exactly Inverting Language Model Outputs
by: Skapars, Adrians, et al.
Published: (2025)
Adaptive $Q$-Network: On-the-fly Target Selection for Deep Reinforcement Learning
by: Vincent, Théo, et al.
Published: (2024)
Recursive Learning-Based Virtual Buffering for Analytical Global Placement
by: Kahng, Andrew B., et al.
Published: (2025)
Drift Q-Learning
by: Houssaini, Anas, et al.
Published: (2026)
Frictional Q-Learning
by: Kim, Hyunwoo, et al.
Published: (2025)
Flow Q-Learning
by: Park, Seohong, et al.
Published: (2025)
Scaling CrossQ with Weight Normalization
by: Palenicek, Daniel, et al.
Published: (2025)
From Two-Dimensional to Three-Dimensional Environment with Q-Learning: Modeling Autonomous Navigation with Reinforcement Learning and no Libraries
by: Silva, Ergon Cugler de Moraes
Published: (2024)
StableGrad: Backward Scale Control without Batch Normalization
by: Mestre, Jose I., et al.
Published: (2026)
Reinforcement Learning Assisted Recursive QAOA
by: Patel, Yash J., et al.
Published: (2022)
Iterated $Q$-Network: Beyond One-Step Bellman Updates in Deep Reinforcement Learning
by: Vincent, Théo, et al.
Published: (2024)
Eau De $Q$-Network: Adaptive Distillation of Neural Networks in Deep Reinforcement Learning
by: Vincent, Théo, et al.
Published: (2025)
Forward versus Backward: Comparing Reasoning Objectives in Direct Preference Optimization
by: Nikzad, Murtaza, et al.
Published: (2026)
Chunk-Guided Q-Learning
by: Song, Gwanwoo, et al.
Published: (2026)
Periodic Regularized Q-Learning
by: Yang, Hyukjun, et al.
Published: (2026)
Scalable In-Context Q-Learning
by: Liu, Jinmei, et al.
Published: (2025)
Learning Generalized Policies for Fully Observable Non-Deterministic Planning Domains
by: Hofmann, Till, et al.
Published: (2024)
A Recursive Decomposition Framework for Causal Structure Learning in the Presence of Latent Variables
by: Li, Zheng, et al.
Published: (2026)
Expediting Reinforcement Learning by Incorporating Knowledge About Temporal Causality in the Environment
by: Corazza, Jan, et al.
Published: (2025)
Modality-Decoupled Online Recursive Editing
by: Li, Siyuan, et al.
Published: (2026)
Interaction Locality in Hierarchical Recursive Reasoning
by: Miyanishi, Yosuke, et al.
Published: (2026)
Recursive Inference Machines for Neural Reasoning
by: Komisarczyk, Mieszko, et al.
Published: (2026)
Align Forward, Adapt Backward: Closing the Discretization Gap in Logic Gate Networks
by: Kim, Youngsung
Published: (2026)
Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies
by: Lee, Haanvid, et al.
Published: (2024)
Primal: A Unified Deterministic Framework for Quasi-Orthogonal Hashing and Manifold Learning
by: Khasia, Vladimer
Published: (2025)
Graph Q-Learning for Combinatorial Optimization
by: Dax, Victoria M., et al.
Published: (2024)
Boosting Soft Q-Learning by Bounding
by: Adamczyk, Jacob, et al.
Published: (2024)
Spectral Alignment in Forward-Backward Representations via Temporal Abstraction
by: Azad, Seyed Mahdi B., et al.
Published: (2026)
In-Context Compositional Q-Learning for Offline Reinforcement Learning
by: Xu, Qiushui, et al.
Published: (2025)