Staff View: Scalable Model Editing via Customized Expert Networks :: Library Catalog

Bibliographic Details
Main Authors:	Yao, Zihan; He, Yu; Qi, Tianyu; Li, Ming
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2404.02699

_version_	1866914904918720512
author	Yao, Zihan; He, Yu; Qi, Tianyu; Li, Ming
author_facet	Yao, Zihan; He, Yu; Qi, Tianyu; Li, Ming
contents	Addressing the issues of hallucinations and outdated knowledge in large language models is critical for their reliable application. Model Editing presents a promising avenue for mitigating these challenges in a cost-effective manner. However, existing methods often suffer from unsatisfactory generalization and unintended effects on non-edited samples. To overcome these limitations, we introduce a novel approach: Scalable Model Editing via Customized Expert Networks (SCEN), which is a two-stage continuous training paradigm. Specifically, in the first stage, we train lightweight expert networks individually for each piece of knowledge that needs to be updated. Subsequently, we train a corresponding indexing neuron for each expert to control the activation state of that expert. We conducted a series of experiments on the ZsRE and Hallucination benchmarks by tuning the advanced open-source LLM, Llama2, achieving state-of-the-art results compared to current mainstream methods. Our code is available at https://github.com/TAL-auroraX/SCEN.
format	Preprint
id	arxiv_https___arxiv_org_abs_2404_02699
institution	arXiv
publishDate	2024
record_format	arxiv
title	Scalable Model Editing via Customized Expert Networks
topic	Computation and Language
url	https://arxiv.org/abs/2404.02699

Similar Items