Bibliographic Details
Main Authors: Yao, Zihan; He, Yu; Qi, Tianyu; Li, Ming
Format: Preprint
Published: 2024
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2404.02699
_version_ 1866914904918720512
author Yao, Zihan
He, Yu
Qi, Tianyu
Li, Ming
author_facet Yao, Zihan
He, Yu
Qi, Tianyu
Li, Ming
contents Addressing the issues of hallucinations and outdated knowledge in large language models is critical for their reliable application. Model Editing presents a promising avenue for mitigating these challenges in a cost-effective manner. However, existing methods often suffer from unsatisfactory generalization and unintended effects on non-edited samples. To overcome these limitations, we introduce a novel approach: Scalable Model Editing via Customized Expert Networks (SCEN), which is a two-stage continuous training paradigm. Specifically, in the first stage, we train lightweight expert networks individually for each piece of knowledge that needs to be updated. Subsequently, we train a corresponding indexing neuron for each expert to control the activation state of that expert. We conducted a series of experiments on the ZsRE and Hallucination benchmarks by tuning the advanced open-source LLM, Llama2, achieving state-of-the-art results compared to current mainstream methods. Our code is available at https://github.com/TAL-auroraX/SCEN.
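The gating idea described in the abstract (one lightweight expert per edited fact, each switched on by its own indexing neuron) can be illustrated with a minimal inference-time sketch. This is one plausible reading of the abstract, not the authors' implementation: the names `GatedExpert` and `route`, the sigmoid activation, the "most activated expert wins" rule, and the 0.5 threshold are all assumptions made for illustration, and both training stages are omitted.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

Vector = Sequence[float]


@dataclass
class GatedExpert:
    """One lightweight expert plus the indexing neuron that gates it (hypothetical)."""
    index_weights: List[float]                # weights of the indexing neuron
    index_bias: float                         # bias of the indexing neuron
    expert: Callable[[Vector], List[float]]   # stand-in for a trained expert network

    def activation(self, hidden: Vector) -> float:
        # A single linear unit followed by a sigmoid (an assumed choice).
        z = sum(w * h for w, h in zip(self.index_weights, hidden)) + self.index_bias
        return 1.0 / (1.0 + math.exp(-z))


def route(
    hidden: Vector,
    experts: Sequence[GatedExpert],
    base_layer: Callable[[Vector], Sequence[float]],
    threshold: float = 0.5,
) -> List[float]:
    """Use the most activated expert if its neuron fires, else the unedited base layer."""
    best: Optional[GatedExpert] = max(
        experts, key=lambda e: e.activation(hidden), default=None
    )
    if best is not None and best.activation(hidden) >= threshold:
        return best.expert(hidden)
    # No indexing neuron fired: the input is treated as a non-edited sample.
    return list(base_layer(hidden))
```

Falling back to the base layer when no neuron fires is what would keep non-edited samples unaffected in this sketch; adding a new edit only appends another `GatedExpert` and leaves the existing ones untouched.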
format Preprint
id arxiv_https___arxiv_org_abs_2404_02699
institution arXiv
publishDate 2024
record_format arxiv
title Scalable Model Editing via Customized Expert Networks
topic Computation and Language
url https://arxiv.org/abs/2404.02699