Bibliographic Details
Main Authors: Wang, Qitong, Zaki, Mohammed J., Kollias, Georgios, Kalantzis, Vasileios
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2602.22351
_version_ 1866911469499580416
author Wang, Qitong
Zaki, Mohammed J.
Kollias, Georgios
Kalantzis, Vasileios
contents Large language models (LLMs) learn contextual embeddings that capture rich semantic information, yet they often overlook structured lexical knowledge such as word senses and relationships. Prior work has shown that incorporating sense dictionaries can improve knowledge distillation for encoder models, but applying them to decoders used as generative models remains challenging. In this paper, we introduce Decoder-based Sense Knowledge Distillation (DSKD), a framework that integrates lexical resources into the training of decoder-style LLMs without requiring dictionary lookup at inference time. Extensive experiments on diverse benchmarks demonstrate that DSKD significantly enhances knowledge distillation performance for decoders, enabling generative models to inherit structured semantics while maintaining efficient training.
format Preprint
id arxiv_https___arxiv_org_abs_2602_22351
institution arXiv
publishDate 2026
record_format arxiv
title Decoder-based Sense Knowledge Distillation
topic Computation and Language
Artificial Intelligence
url https://arxiv.org/abs/2602.22351