:: Library Catalog

Bibliographic Details
Main Authors:	Messa, Frederico, Pereira, André Grahl
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2403.19883

Similar Items

On-line Policy Improvement using Monte-Carlo Search
by: Tesauro, Gerald, et al.
Published: (2025)

Statistical Analysis of Policy Space Compression Problem
by: Molaei, Majid, et al.
Published: (2024)

From Parameters to Behaviors: Unsupervised Compression of the Policy Space
by: Tenedini, Davide, et al.
Published: (2025)

Hierarchical Task Network Planning with LLM-Generated Heuristics
by: Meneguzzi, Felipe, et al.
Published: (2026)

Less Greedy Equivalence Search
by: Ejaz, Adiba, et al.
Published: (2025)

Searching for Programmatic Policies in Semantic Spaces
by: Moraes, Rubens O., et al.
Published: (2024)

Discovering State Equivalences in UCT Search Trees By Action Pruning
by: Schmöcker, Robin, et al.
Published: (2025)

Combinatorial Optimization with Policy Adaptation using Latent Space Search
by: Chalumeau, Felix, et al.
Published: (2023)

Compression Repair for Feedforward Neural Networks Based on Model Equivalence Evaluation
by: Mo, Zihao, et al.
Published: (2024)

Reasoning Compression with Mixed-Policy Distillation
by: Yang, Han, et al.
Published: (2026)

Data-Efficient Safe Policy Improvement Using Parametric Structure
by: Engelen, Kasper, et al.
Published: (2025)

Safe Explicable Policy Search
by: Hanni, Akkamahadevi, et al.
Published: (2025)

Provable and Practical In-Context Policy Optimization for Self-Improvement
by: Yu, Tianrun, et al.
Published: (2026)

Blending Imitation and Reinforcement Learning for Robust Policy Improvement
by: Liu, Xuefeng, et al.
Published: (2023)

Policy Improvement using Language Feedback Models
by: Zhong, Victor, et al.
Published: (2024)

Accelerating Constrained Decoding with Token Space Compression
by: Sullivan, Michael, et al.
Published: (2026)

Optimization of Latent-Space Compression using Game-Theoretic Techniques for Transformer-Based Vector Search
by: Agrawal, Kushagra, et al.
Published: (2025)

Deep SPI: Safe Policy Improvement via World Models
by: Delgrange, Florent, et al.
Published: (2025)

Active Policy Improvement from Multiple Black-box Oracles
by: Liu, Xuefeng, et al.
Published: (2023)

Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse
by: Queeney, James, et al.
Published: (2022)

Going Beyond Heuristics by Imposing Policy Improvement as a Constraint
by: Lee, Chi-Chang, et al.
Published: (2025)

Transformers Provably Implement In-Context Reinforcement Learning with Policy Improvement
by: Liang, Haodong, et al.
Published: (2026)

Scaling Combinatorial Optimization Neural Improvement Heuristics with Online Search and Adaptation
by: Verdù, Federico Julian Camerota, et al.
Published: (2024)

Compressed Convolutional Attention: Efficient Attention in a Compressed Latent Space
by: Figliolia, Tomas, et al.
Published: (2025)

Reward Weighted Classifier-Free Guidance as Policy Improvement in Autoregressive Models
by: Peysakhovich, Alexander, et al.
Published: (2026)

SIME: Enhancing Policy Self-Improvement with Modal-level Exploration
by: Jin, Yang, et al.
Published: (2025)

Model Space Reasoning as Search in Feedback Space for Planning Domain Generation
by: Oswald, James, et al.
Published: (2026)

Functional Equivalence with NARS
by: Johansson, Robert, et al.
Published: (2024)

Intriguing Equivalence Structures of the Embedding Space of Vision Transformers
by: Salman, Shaeke, et al.
Published: (2024)

Searching Latent Program Spaces
by: Macfarlane, Matthew V, et al.
Published: (2024)

Randomized Antipodal Search Done Right for Data Pareto Improvement of LLM Unlearning
by: Liu, Ziwen, et al.
Published: (2026)

Subgoal-Guided Policy Heuristic Search with Learned Subgoals
by: Tuero, Jake, et al.
Published: (2025)

Agent-Driven Autonomous Reinforcement Learning Research: Iterative Policy Improvement for Quadruped Locomotion
by: Khandelwal, Nimesh, et al.
Published: (2026)

J4R: Learning to Judge with Equivalent Initial State Group Relative Policy Optimization
by: Xu, Austin, et al.
Published: (2025)

Adaptive Compression of the Latent Space in Variational Autoencoders
by: Sejnova, Gabriela, et al.
Published: (2023)

Latent Reasoning in TRMs is Secretly a Policy Improvement Operator
by: Asadulaev, Arip, et al.
Published: (2025)

Extracting Problem Structure with LLMs for Optimized SAT Local Search
by: Schidler, André, et al.
Published: (2025)

Zero-shot Imitation Policy via Search in Demonstration Dataset
by: Malato, Federico, et al.
Published: (2024)

BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search
by: Liu, Shiyu, et al.
Published: (2026)

Study and Improvement of Search Algorithms in Multi-Player Perfect-Information Games
by: Cohen-Solal, Quentin
Published: (2026)