Enregistré dans:
Détails bibliographiques
Auteurs principaux: Li, Wenxuan, Zhang, Zhenfei, Zhang, Mi, Hong, Geng, Wen, Mi, You, Xiaoyu, Yang, Min
Format: Preprint
Publié: 2026
Sujets:
Accès en ligne:https://arxiv.org/abs/2604.13777
Tags: Ajouter un tag
Pas de tags, Soyez le premier à ajouter un tag!
Table des matières:
  • Large language models (LLMs) may memorize sensitive or copyrighted content, raising significant privacy and legal concerns. While machine unlearning has emerged as a potential remedy, prevailing paradigms rely on user-provided forget sets, making unlearning requests difficult to audit and exposing systems to secondary leakage and malicious abuse. We propose MAGE, a Memory-grAph Guided Erasure framework for user-minimized, corpus-free unlearning. Given only a lightweight user anchor that identifies a target entity, MAGE probes the target LLM to recover target-related memorization, organizes it into a weighted local memory graph, and synthesizes scoped supervision for unlearning. MAGE is model-agnostic, can be plugged into standard unlearning methods, and requires no access to the original training corpus. Experiments on two benchmarks, TOFU and RWKU, demonstrate that MAGE's self-generated supervision achieves effective unlearning performance comparable to supervision generated with external reference, while preserving overall utility. These results support a practical and auditable unlearning workflow driven by minimal anchors rather than user-supplied forget corpora.