| Main Authors: | Mower, Christopher E., Wan, Yuhui, Yu, Hongzhan, Grosnit, Antoine, Gonzalez-Billandon, Jonas, Zimmer, Matthieu, Wang, Jinlong, Zhang, Xinyu, Zhao, Yao, Zhai, Anbang, Liu, Puze, Palenicek, Daniel, Tateo, Davide, Cadena, Cesar, Hutter, Marco, Peters, Jan, Tian, Guangjian, Zhuang, Yuzheng, Shao, Kun, Quan, Xingyue, Hao, Jianye, Wang, Jun, Bou-Ammar, Haitham |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.19741 |
Similar Items
Safe Reinforcement Learning on the Constraint Manifold: Theory and Applications
by: Liu, Puze, et al.
Published: (2024)
Many of Your DPOs are Secretly One: Attempting Unification Through Mutual Information
by: Tutunov, Rasul, et al.
Published: (2025)
Al-Khwarizmi: Discovering Physical Laws with Foundation Models
by: Mower, Christopher E., et al.
Published: (2025)
Contextual Causal Bayesian Optimisation
by: Arsenyan, Vahan, et al.
Published: (2023)
Trust Region Inverse Reinforcement Learning: Explicit Dual Ascent using Local Policy Updates
by: Diwan, Anish, et al.
Published: (2026)
Data-driven Interpretable Hybrid Robot Dynamics
by: Mower, Christopher E., et al.
Published: (2025)
Why Can Large Language Models Generate Correct Chain-of-Thoughts?
by: Tutunov, Rasul, et al.
Published: (2023)
ShortCircuit: AlphaZero-Driven Circuit Design
by: Tsaras, Dimitrios, et al.
Published: (2024)
Model-Based and Sample-Efficient AI-Assisted Math Discovery in Sphere Packing
by: Tutunov, Rasul, et al.
Published: (2025)
A call for embodied AI
by: Paolo, Giuseppe, et al.
Published: (2024)
A Pragmatist Robot: Learning to Plan Tasks by Experiencing the Real World
by: Qu, Kaixian, et al.
Published: (2025)
Decoding as Optimisation on the Probability Simplex: From Top-K to Top-P (Nucleus) to Best-of-K Samplers
by: Ji, Xiaotong, et al.
Published: (2026)
Scalable Power Sampling: Unlocking Efficient, Training-Free Reasoning for LLMs via Distribution Sharpening
by: Ji, Xiaotong, et al.
Published: (2026)
Rethinking Large Language Model Distillation: A Constrained Markov Decision Process Perspective
by: Zimmer, Matthieu, et al.
Published: (2025)
Towards Safe Robot Foundation Models
by: Tölle, Maximilian, et al.
Published: (2025)
The Model Knows, the Decoder Finds: Future Value Guided Particle Power Sampling
by: Nguyen, Tu, et al.
Published: (2026)
Mixture of Attentions For Speculative Decoding
by: Zimmer, Matthieu, et al.
Published: (2024)
Risk-Controlled Lean-as-Judge for Natural-Language Mathematical Reasoning
by: Bourigault, Pauline, et al.
Published: (2026)
The $\mathbf{Y}$-Combinator for LLMs: Solving Long-Context Rot with $λ$-Calculus
by: Roy, Amartya, et al.
Published: (2026)
Towards Safe Robot Foundation Models Using Inductive Biases
by: Tölle, Maximilian, et al.
Published: (2025)
Handling Long-Term Safety and Uncertainty in Safe Reinforcement Learning
by: Günster, Jonas, et al.
Published: (2024)
Adaptive Control based Friction Estimation for Tracking Control of Robot Manipulators
by: Huang, Junning, et al.
Published: (2024)
Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving
by: Zimmer, Matthieu, et al.
Published: (2025)
LBR-Stack: ROS 2 and Python Integration of KUKA FRI for Med and IIWA Robots
by: Huber, Martin, et al.
Published: (2023)
On Almost Surely Safe Alignment of Large Language Models at Inference-Time
by: Ji, Xiaotong, et al.
Published: (2025)
Materiobiomodulated ROS Therapy for De Novo Hair Growth
by: Long Bai, et al.
Published: (2024)
Bottlenecked Transformers: Periodic KV Cache Consolidation for Generalised Reasoning
by: Oomerjee, Adnan, et al.
Published: (2025)
Ark: An Open-source Python-based Framework for Robot Learning
by: Dierking, Magnus, et al.
Published: (2025)
Distilling Contact Planning for Fast Trajectory Optimization in Robot Air Hockey
by: Jankowski, Julius, et al.
Published: (2024)
Bridging the gap between Learning-to-plan, Motion Primitives and Safe Reinforcement Learning
by: Kicki, Piotr, et al.
Published: (2024)
SparsePO: Controlling Preference Alignment of LLMs via Sparse Token Masks
by: Christopoulou, Fenia, et al.
Published: (2024)
Why the Brain Consolidates: Predictive Forgetting for Optimal Generalisation
by: Fountas, Zafeirios, et al.
Published: (2026)
Subjective Depth and Timescale Transformers: Learning Where and When to Compute
by: Wieser, Frederico, et al.
Published: (2025)
Multi-Task GRPO: Reliable LLM Reasoning Across Tasks
by: Ramesh, Shyam Sundhar, et al.
Published: (2026)
EVLP: Learning Unified Embodied Vision-Language Planner with Reinforced Supervised Fine-Tuning
by: Cai, Xinyan, et al.
Published: (2025)
daleihao/RDycore_ROS: Codes and data for the RDycore-ROS study
by: Dalei Hao
Published: (2025)
ROS2swarm - A ROS 2 Package for Swarm Robot Behaviors
by: Kaiser, Tanja Katharina, et al.
Published: (2024)
Untangling Component Imbalance in Hybrid Linear Attention Conversion Methods
by: Benfeghoul, Martin, et al.
Published: (2025)
Proxying ROS communications -- enabling containerized ROS deployments in distributed multi-host environments
by: Wendt, Arne, et al.
Published: (2022)
Kolb-Based Experiential Learning for Generalist Agents with Human-Level Kaggle Data Science Performance
by: Grosnit, Antoine, et al.
Published: (2024)