Library Catalog

Bibliographic Details
Main Authors:	Malloy, Tailia; Klinger, Tim; Liu, Miao; Riemer, Matthew; Tesauro, Gerald; Sims, Chris R.
Format:	Preprint
Published:	2020
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2011.11517

Similar Items

Deep RL With Information Constrained Policies: Generalization in Continuous Control
by: Malloy, Tailia, et al.
Published: (2020)

Learning in Factored Domains with Information-Constrained Visual Representations
by: Malloy, Tailia, et al.
Published: (2023)

Learning to Defend by Attacking (and Vice-Versa): Transfer of Learning in Cybersecurity Games
by: Malloy, Tailia, et al.
Published: (2023)

Assessing Spear-Phishing Website Generation in Large Language Model Coding Agents
by: Malloy, Tailia, et al.
Published: (2026)

On-line Policy Improvement using Monte-Carlo Search
by: Tesauro, Gerald, et al.
Published: (2025)

Beyond Sliding Windows: Learning to Manage Memory in Non-Markovian Environments
by: Tasse, Geraud Nangue, et al.
Published: (2025)

Modeling Attention during Dimensional Shifts with Counterfactual and Delayed Feedback
by: Malloy, Tailia, et al.
Published: (2025)

Zero Knowledge Games
by: Malloy, Ian
Published: (2020)

The Effectiveness of Approximate Regularized Replay for Efficient Supervised Fine-Tuning of Large Language Models
by: Riemer, Matthew, et al.
Published: (2025)

Training Users Against Human and GPT-4 Generated Social Engineering Attacks
by: Malloy, Tailia, et al.
Published: (2025)

Leveraging a Cognitive Model to Measure Subjective Similarity of Human and GPT-4 Written Content
by: Malloy, Tailia, et al.
Published: (2024)

The Ultimate Test of Superintelligent AI Agents: Can an AI Balance Care and Control in Asymmetric Relationships?
by: Bouneffouf, Djallel, et al.
Published: (2025)

Exploring Information Seeking Agent Consolidation
by: Yan, Guochen, et al.
Published: (2026)

Consolidation or Adaptation? PRISM: Disentangling SFT and RL Data via Gradient Concentration
by: Zhao, Yang, et al.
Published: (2026)

SkyRL-Agent: Efficient RL Training for Multi-turn LLM Agent
by: Cao, Shiyi, et al.
Published: (2025)

One Policy, Infinite NPCs: Persona-Traceable Shared RL Policies for Scalable Game Agents
by: Hong, Yoosung
Published: (2026)

Multi-Agent Path Finding via Offline RL and LLM Collaboration
by: Atasever, Merve, et al.
Published: (2025)

Safe Heterogeneous Multi-Agent RL with Communication Regularization for Coordinated Target Acquisition
by: Calzolari, Gabriele, et al.
Published: (2026)

O-Researcher: An Open Ended Deep Research Model via Multi-Agent Distillation and Agentic RL
by: Yao, Yi, et al.
Published: (2026)

The Role of Deep Learning Regularizations on Actors in Offline RL
by: Tarasov, Denis, et al.
Published: (2024)

Position: Theory of Mind Benchmarks are Broken for Large Language Models
by: Riemer, Matthew, et al.
Published: (2024)

ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents
by: Zhang, Hao, et al.
Published: (2026)

JaxMARL: Multi-Agent RL Environments and Algorithms in JAX
by: Rutherford, Alexander, et al.
Published: (2023)

Robust Multi-Agent Reinforcement Learning by Mutual Information Regularization
by: Li, Simin, et al.
Published: (2023)

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL
by: Li, Weizhen, et al.
Published: (2025)

Learning to Beat ByteRL: Exploitability of Collectible Card Game Agents
by: Haluska, Radovan, et al.
Published: (2024)

Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers
by: Xin, Ran, et al.
Published: (2025)

Generalizing Consistency Policy to Visual RL with Prioritized Proximal Experience Regularization
by: Li, Haoran, et al.
Published: (2024)

Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference
by: Riemer, Matthew, et al.
Published: (2024)

A Neuro-Symbolic Approach to Multi-Agent RL for Interpretability and Probabilistic Decision Making
by: Subramanian, Chitra, et al.
Published: (2024)

What makes Models Compositional? A Theoretical View: With Supplement
by: Ram, Parikshit, et al.
Published: (2024)

Cooperative Game-Theoretic Credit Assignment for Multi-Agent Policy Gradients via the Core
by: Ji, Mengda, et al.
Published: (2025)

Style-Preserving Policy Optimization for Game Agents
by: Li, Lingfeng, et al.
Published: (2025)

Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL
by: Huang, Jiawei, et al.
Published: (2024)

A Sentiment Consolidation Framework for Meta-Review Generation
by: Li, Miao, et al.
Published: (2024)

Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality
by: Ged, François, et al.
Published: (2023)

DeepStock: Reinforcement Learning with Policy Regularizations for Inventory Management
by: Xie, Yaqi, et al.
Published: (2026)

When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs
by: Zeng, Yifan, et al.
Published: (2026)

MCPO: Mastery-Consolidated Policy Optimization for Large Reasoning Models
by: Liao, Zhaokang, et al.
Published: (2026)

MarsRL: Advancing Multi-Agent Reasoning System via Reinforcement Learning with Agentic Pipeline Parallelism
by: Liu, Shulin, et al.
Published: (2025)