:: Library Catalog

Bibliographic Details
Main Authors:	Oesterheld, Caspar; Riché, Maxime; Sondej, Filip; Clifton, Jesse; Conitzer, Vincent
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2604.04341

Similar Items

Can CDT rationalise the ex ante optimal policy via modified anthropics?
by: Cooper, Emery, et al.
Published: (2024)

Recursive Joint Simulation in Games
by: Kovarik, Vojtech, et al.
Published: (2024)

Characterising Simulation-Based Program Equilibria
by: Cooper, Emery, et al.
Published: (2024)

Observation Interference in Partially Observable Assistance Games
by: Emmons, Scott, et al.
Published: (2024)

Choosing What Game to Play without Selecting Equilibria: Inferring Safe (Pareto) Improvements in Binary Constraint Structures
by: Oesterheld, Caspar, et al.
Published: (2025)

Collapse of Irrelevant Representations (CIR) Ensures Robust and Non-Disruptive LLM Unlearning
by: Sondej, Filip, et al.
Published: (2025)

Maximizing Social Welfare with Side Payments
by: Geffner, Ivan, et al.
Published: (2025)

Game Theory with Simulation of Other Players
by: Kovarik, Vojtech, et al.
Published: (2023)

Computing Game Symmetries and Equilibria That Respect Them
by: Tewolde, Emanuel, et al.
Published: (2025)

Robust LLM Unlearning with MUDMAN: Meta-Unlearning with Disruption Masking And Normalization
by: Sondej, Filip, et al.
Published: (2025)

Why should we ever automate moral decision making?
by: Conitzer, Vincent
Published: (2024)

Shutdown Safety Valves for Advanced AI
by: Conitzer, Vincent
Published: (2026)

Imperfect-Recall Games: Equilibrium Concepts and Their Complexity
by: Tewolde, Emanuel, et al.
Published: (2024)

How Many Votes is a Lie Worth? Measuring Strategyproofness through Resource Augmentation
by: Berker, Ratip Emin, et al.
Published: (2026)

A dataset of questions on decision-theoretic reasoning in Newcomb-like problems
by: Oesterheld, Caspar, et al.
Published: (2024)

Golden Handcuffs make safer AI agents
by: Ebtekar, Aram, et al.
Published: (2026)

Complexity of Scheduling Charging in the Smart Grid
by: de Weerdt, Mathijs, et al.
Published: (2017)

RefAgent: A Multi-agent LLM-based Framework for Automatic Software Refactoring
by: Oueslati, Khouloud, et al.
Published: (2025)

MAGELLAN: Metacognitive predictions of learning progress guide autotelic LLM agents in large goal spaces
by: Gaven, Loris, et al.
Published: (2025)

Object criticality for safer navigation
by: Ceccarelli, Andrea, et al.
Published: (2024)

Computing Optimal Commitments to Strategies and Outcome-Conditional Utility Transfers
by: Sauerberg, Nathaniel, et al.
Published: (2024)

Promises Made, Promises Kept: Safe Pareto Improvements via Ex Post Verifiable Commitments
by: Sauerberg, Nathaniel, et al.
Published: (2025)

Efficiently Solving Turn-Taking Stochastic Games with Extensive-Form Correlation
by: Zhang, Hanrui, et al.
Published: (2024)

An Interpretable Automated Mechanism Design Framework with Large Language Models
by: Liu, Jiayuan, et al.
Published: (2025)

Game Theory with Simulation in the Presence of Unpredictable Randomisation
by: Kovarik, Vojtech, et al.
Published: (2024)

Merging plans with incomplete knowledge about actions and goals through an agent-based reputation system
by: Carbo, Javier, et al.
Published: (2024)

Multi-Agent Security Tax: Trading Off Security and Collaboration Capabilities in Multi-Agent Systems
by: Peigne-Lefebvre, Pierre, et al.
Published: (2025)

An agent design with goal reaching guarantees for enhancement of learning
by: Osinenko, Pavel, et al.
Published: (2024)

Building surrogate models using trajectories of agents trained by Reinforcement Learning
by: Cestero, Julen, et al.
Published: (2025)

The Consensus Trap: Rescuing Multi-Agent LLMs from Adversarial Majorities via Token-Level Collaboration
by: Liu, Jiayuan, et al.
Published: (2026)

Prior preferences in active inference agents: soft, hard, and goal shaping
by: Torresan, Filippo, et al.
Published: (2025)

Inoculation Prompting: Eliciting traits from LLMs during training can suppress them at test-time
by: Tan, Daniel, et al.
Published: (2025)

AI Testing Should Account for Sophisticated Strategic Behaviour
by: Kovarik, Vojtech, et al.
Published: (2025)

No Reliable Evidence of Self-Reported Sentience in Small Large Language Models
by: Kaiser, Caspar, et al.
Published: (2026)

The Rise of Diffusion Models in Time-Series Forecasting
by: Meijer, Caspar, et al.
Published: (2024)

Improving LLM-Generated Code Quality with GRPO
by: Robeyns, Maxime, et al.
Published: (2025)

LinearizeLLM: An Agent-Based Framework for LLM-Driven Exact Linear Reformulation of Nonlinear Optimization Problems
by: Kandora, Paul-Niklas Ken, et al.
Published: (2025)

Learning to Deliberate: Meta-policy Collaboration for Agentic LLMs with Multi-agent Reinforcement Learning
by: Yang, Wei, et al.
Published: (2025)

ResMAS: Resilience Optimization in LLM-based Multi-agent Systems
by: Zhou, Zhilun, et al.
Published: (2026)

Artificial-Intelligence Grading Assistance for Handwritten Components of a Calculus Exam
by: Kortemeyer, Gerd, et al.
Published: (2025)