| Main Authors: | Messa, Frederico, Pereira, André Grahl |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2403.19883 |
Similar Items
On-line Policy Improvement using Monte-Carlo Search
by: Tesauro, Gerald, et al.
Published: (2025)
Statistical Analysis of Policy Space Compression Problem
by: Molaei, Majid, et al.
Published: (2024)
From Parameters to Behaviors: Unsupervised Compression of the Policy Space
by: Tenedini, Davide, et al.
Published: (2025)
Hierarchical Task Network Planning with LLM-Generated Heuristics
by: Meneguzzi, Felipe, et al.
Published: (2026)
Less Greedy Equivalence Search
by: Ejaz, Adiba, et al.
Published: (2025)
Searching for Programmatic Policies in Semantic Spaces
by: Moraes, Rubens O., et al.
Published: (2024)
Discovering State Equivalences in UCT Search Trees By Action Pruning
by: Schmöcker, Robin, et al.
Published: (2025)
Combinatorial Optimization with Policy Adaptation using Latent Space Search
by: Chalumeau, Felix, et al.
Published: (2023)
Compression Repair for Feedforward Neural Networks Based on Model Equivalence Evaluation
by: Mo, Zihao, et al.
Published: (2024)
Reasoning Compression with Mixed-Policy Distillation
by: Yang, Han, et al.
Published: (2026)
Data-Efficient Safe Policy Improvement Using Parametric Structure
by: Engelen, Kasper, et al.
Published: (2025)
Safe Explicable Policy Search
by: Hanni, Akkamahadevi, et al.
Published: (2025)
Provable and Practical In-Context Policy Optimization for Self-Improvement
by: Yu, Tianrun, et al.
Published: (2026)
Blending Imitation and Reinforcement Learning for Robust Policy Improvement
by: Liu, Xuefeng, et al.
Published: (2023)
Policy Improvement using Language Feedback Models
by: Zhong, Victor, et al.
Published: (2024)
Accelerating Constrained Decoding with Token Space Compression
by: Sullivan, Michael, et al.
Published: (2026)
Optimization of Latent-Space Compression using Game-Theoretic Techniques for Transformer-Based Vector Search
by: Agrawal, Kushagra, et al.
Published: (2025)
Deep SPI: Safe Policy Improvement via World Models
by: Delgrange, Florent, et al.
Published: (2025)
Active Policy Improvement from Multiple Black-box Oracles
by: Liu, Xuefeng, et al.
Published: (2023)
Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse
by: Queeney, James, et al.
Published: (2022)
Going Beyond Heuristics by Imposing Policy Improvement as a Constraint
by: Lee, Chi-Chang, et al.
Published: (2025)
Transformers Provably Implement In-Context Reinforcement Learning with Policy Improvement
by: Liang, Haodong, et al.
Published: (2026)
Scaling Combinatorial Optimization Neural Improvement Heuristics with Online Search and Adaptation
by: Verdù, Federico Julian Camerota, et al.
Published: (2024)
Compressed Convolutional Attention: Efficient Attention in a Compressed Latent Space
by: Figliolia, Tomas, et al.
Published: (2025)
Reward Weighted Classifier-Free Guidance as Policy Improvement in Autoregressive Models
by: Peysakhovich, Alexander, et al.
Published: (2026)
SIME: Enhancing Policy Self-Improvement with Modal-level Exploration
by: Jin, Yang, et al.
Published: (2025)
Model Space Reasoning as Search in Feedback Space for Planning Domain Generation
by: Oswald, James, et al.
Published: (2026)
Functional Equivalence with NARS
by: Johansson, Robert, et al.
Published: (2024)
Intriguing Equivalence Structures of the Embedding Space of Vision Transformers
by: Salman, Shaeke, et al.
Published: (2024)
Searching Latent Program Spaces
by: Macfarlane, Matthew V, et al.
Published: (2024)
Randomized Antipodal Search Done Right for Data Pareto Improvement of LLM Unlearning
by: Liu, Ziwen, et al.
Published: (2026)
Subgoal-Guided Policy Heuristic Search with Learned Subgoals
by: Tuero, Jake, et al.
Published: (2025)
Agent-Driven Autonomous Reinforcement Learning Research: Iterative Policy Improvement for Quadruped Locomotion
by: Khandelwal, Nimesh, et al.
Published: (2026)
J4R: Learning to Judge with Equivalent Initial State Group Relative Policy Optimization
by: Xu, Austin, et al.
Published: (2025)
Adaptive Compression of the Latent Space in Variational Autoencoders
by: Sejnova, Gabriela, et al.
Published: (2023)
Latent Reasoning in TRMs is Secretly a Policy Improvement Operator
by: Asadulaev, Arip, et al.
Published: (2025)
Extracting Problem Structure with LLMs for Optimized SAT Local Search
by: Schidler, André, et al.
Published: (2025)
Zero-shot Imitation Policy via Search in Demonstration Dataset
by: Malato, Federico, et al.
Published: (2024)
BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search
by: Liu, Shiyu, et al.
Published: (2026)
Study and Improvement of Search Algorithms in Multi-Player Perfect-Information Games
by: Cohen-Solal, Quentin
Published: (2026)