Table of Contents: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Nguyen, Thong Q., Desai, Shubhang, Anwar, Raja Hasnain, Shaik, Firoz, Suryanarayanan, Vishwas, Chowdhary, Vishal
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2506.02539
Tags:	Add Tag No Tags, Be the first to tag this record!

Table of Contents:

Continual memory augmentation lets computer-using agents (CUAs) learn from prior interactions, but unvetted memories can encode domain-inappropriate or unsafe heuristics--spurious rules that drift from user intent and safety constraints. We introduce VerificAgent, a scalable oversight framework that treats persistent memory as an explicit alignment surface. VerificAgent combines (1) an expert-curated seed of domain knowledge, (2) iterative, trajectory-based memory growth during training, and (3) a post-hoc human fact-checking pass to sanitize accumulated memories before deployment. Evaluated on OSWorld productivity tasks and additional adversarial stress tests, VerificAgent improves task reliability, reduces hallucination-induced failures, and preserves interpretable, auditable guidance--without additional model fine-tuning. By letting humans correct high-impact errors once, the verified memory acts as a frozen safety contract that future agent actions must satisfy. Our results suggest that domain-scoped, human-verified memory offers a scalable oversight mechanism for CUAs, complementing broader alignment strategies by limiting silent policy drift and anchoring agent behavior to the norms and safety constraints of the target domain.

Similar Items