| Main Authors: | Li, Beiming; Rozada, Sergio; Ribeiro, Alejandro |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.22350 |
Similar Items
Matrix Low-Rank Approximation For Policy Gradient Methods
by: Rozada, Sergio, et al.
Published: (2024)
Matrix Low-Rank Trust Region Policy Optimization
by: Rozada, Sergio, et al.
Published: (2024)
Tensor Low-rank Approximation of Finite-horizon Value Functions
by: Rozada, Sergio, et al.
Published: (2024)
Tensor and Matrix Low-Rank Value-Function Approximation in Reinforcement Learning
by: Rozada, Sergio, et al.
Published: (2022)
Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes
by: Kobalczyk, Katarzyna, et al.
Published: (2024)
Inner Speech as Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI coordination
by: Trivedi, Rakshit, et al.
Published: (2026)
Deterministic Policy Gradient Primal-Dual Methods for Continuous-Space Constrained MDPs
by: Rozada, Sergio, et al.
Published: (2024)
Conditional Language Policy: A General Framework for Steerable Multi-Objective Finetuning
by: Wang, Kaiwen, et al.
Published: (2024)
Clifford-Steerable Convolutional Neural Networks
by: Zhdanov, Maksim, et al.
Published: (2024)
Graph-Aware Diffusion for Signal Generation
by: Rozada, Sergio, et al.
Published: (2025)
Conditional Clifford-Steerable CNNs with Complete Kernel Basis for PDE Modeling
by: Szarvas, Bálint László, et al.
Published: (2025)
Counterfactual Reasoning for Steerable Pluralistic Value Alignment of Large Language Models
by: Guo, Hanze, et al.
Published: (2025)
Deep Causal Behavioral Policy Learning: Applications to Healthcare
by: Knecht, Jonas, et al.
Published: (2025)
Diversifying Policy Behaviors with Extrinsic Behavioral Curiosity
by: Wan, Zhenglin, et al.
Published: (2024)
Behavior-Regularized Diffusion Policy Optimization for Offline Reinforcement Learning
by: Gao, Chen-Xiao, et al.
Published: (2025)
Cross-Lingual Prompt Steerability: Towards Accurate and Robust LLM Behavior across Languages
by: Zhang, Lechen, et al.
Published: (2025)
Exploration Behavior of Untrained Policies
by: Adamczyk, Jacob
Published: (2025)
Skill Learning via Policy Diversity Yields Identifiable Representations for Reinforcement Learning
by: Reizinger, Patrik, et al.
Published: (2025)
Fine-tuning Behavioral Cloning Policies with Preference-Based Reinforcement Learning
by: Macuglia, Maël, et al.
Published: (2025)
Unsupervised Discovery of Steerable Factors When Graph Deep Generative Models Are Entangled
by: Liu, Shengchao, et al.
Published: (2024)
Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation
by: Jia, Chengxing, et al.
Published: (2024)
Curricula for Learning Robust Policies with Factored State Representations in Changing Environments
by: Panayiotou, Panayiotis, et al.
Published: (2024)
Learning to Explore: Policy-Guided Outlier Synthesis for Graph Out-of-Distribution Detection
by: Sun, Li, et al.
Published: (2026)
Symmetric Behavior Regularized Policy Optimization
by: Zhu, Lingwei, et al.
Published: (2025)
Cross-Domain Policy Adaptation by Capturing Representation Mismatch
by: Lyu, Jiafei, et al.
Published: (2024)
Multilinear Tensor Low-Rank Approximation for Policy-Gradient Methods in Reinforcement Learning
by: Rozada, Sergio, et al.
Published: (2025)
Temporal Representations for Exploration: Learning Complex Exploratory Behavior without Extrinsic Rewards
by: Mohamed, Faisal, et al.
Published: (2026)
VAMOS: A Hierarchical Vision-Language-Action Model for Capability-Modulated and Steerable Navigation
by: Castro, Mateo Guaman, et al.
Published: (2025)
Trust-Region Behavior Blending for On-Policy Distillation
by: Plyusov, Daniil, et al.
Published: (2026)
Policy Optimization for Personalized Interventions in Behavioral Health
by: Baek, Jackie, et al.
Published: (2023)
Integrating Time Series into LLMs via Multi-layer Steerable Embedding Fusion for Enhanced Forecasting
by: Chen, Zhuomin, et al.
Published: (2025)
Behavior-Invariant Task Representation Learning with Transformer-based World Models for Offline Meta-Reinforcement Learning
by: Qian, Fuyuan, et al.
Published: (2026)
Two-Stage Representation Learning for Analyzing Movement Behavior Dynamics in People Living with Dementia
by: Cui, Jin, et al.
Published: (2025)
Foundation Policies with Hilbert Representations
by: Park, Seohong, et al.
Published: (2024)
Identifiable Object-Centric Representation Learning via Probabilistic Slot Attention
by: Kori, Avinash, et al.
Published: (2024)
Tool Calling is Linearly Readable and Steerable in Language Models
by: Wu, Zekun, et al.
Published: (2026)
ACT-JEPA: Novel Joint-Embedding Predictive Architecture for Efficient Policy Representation Learning
by: Vujinovic, Aleksandar, et al.
Published: (2025)
Unsupervised Behavioral Compression: Learning Low-Dimensional Policy Manifolds through State-Occupancy Matching
by: Fraschini, Andrea, et al.
Published: (2026)
Discovering Behavioral Modes in Deep Reinforcement Learning Policies Using Trajectory Clustering in Latent Space
by: Remman, Sindre Benjamin, et al.
Published: (2024)
Learning a Diffusion Model Policy from Rewards via Q-Score Matching
by: Psenka, Michael, et al.
Published: (2023)