Bibliographic Details
Main Authors: Zhang, Zefeng; Sheng, Jiawei; Zhang, Chuang; Liang, Yunzhi; Zhang, Wenyuan; Wang, Siqi; Liu, Tingwen
Format: Preprint
Published: 2024
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2406.01934
author Zhang, Zefeng
Sheng, Jiawei
Zhang, Chuang
Liang, Yunzhi
Zhang, Wenyuan
Wang, Siqi
Liu, Tingwen
contents Multimodal Entity Linking (MEL) aims to link ambiguous mentions in multimodal contexts to entities in a multimodal knowledge graph. A pivotal challenge is to fully leverage multi-element correlations between mentions and entities to bridge the modality gap and enable fine-grained semantic matching. Existing methods attempt several local correlative mechanisms, relying heavily on automatically learned attention weights, which may over-concentrate on partial correlations. To mitigate this issue, we formulate the correlation assignment problem as an optimal transport (OT) problem, and propose a novel MEL framework, namely OT-MEL, with OT-guided correlation assignment. Thereby, we exploit the correlation between multimodal features to enhance multimodal fusion, and the correlation between mentions and entities to enhance fine-grained matching. To accelerate model prediction, we further leverage knowledge distillation to transfer OT assignment knowledge to the attention mechanism. Experimental results show that our model significantly outperforms previous state-of-the-art baselines and confirm the effectiveness of the OT-guided correlation assignment.
format Preprint
id arxiv_https___arxiv_org_abs_2406_01934
institution arXiv
publishDate 2024
record_format arxiv
title Optimal Transport Guided Correlation Assignment for Multimodal Entity Linking
topic Computation and Language
url https://arxiv.org/abs/2406.01934
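
The abstract in this record frames correlation assignment as an optimal transport problem and contrasts it with learned attention weights. The sketch below illustrates that contrast in general terms only; it is not the OT-MEL implementation. It assumes entropy-regularised OT solved with Sinkhorn iterations (a common solver, not named in the abstract), uniform marginals, a cosine-distance cost, and random stand-in features. All names (`cosine_cost`, `sinkhorn`, `softmax_attention`, `mention_feats`, `entity_feats`) are hypothetical.

```python
import numpy as np


def cosine_cost(x, y):
    """Cost matrix: 1 - cosine similarity between rows of x and rows of y."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    return 1.0 - x @ y.T


def sinkhorn(cost, eps=0.5, n_iters=500):
    """Entropy-regularised OT plan with uniform marginals (assumed setup)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)   # mass on each mention-side element
    b = np.full(m, 1.0 / m)   # mass on each entity-side element
    K = np.exp(-cost / eps)   # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)       # rescale rows towards marginal a
        v = b / (K.T @ u)     # rescale columns towards marginal b
    return u[:, None] * K * v[None, :]


def softmax_attention(cost, eps=0.5):
    """Row-wise softmax over negative cost: only rows are normalised."""
    logits = -cost / eps
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)


# Random stand-in features: 4 mention-side and 5 entity-side elements.
rng = np.random.default_rng(0)
mention_feats = rng.normal(size=(4, 8))
entity_feats = rng.normal(size=(5, 8))

cost = cosine_cost(mention_feats, entity_feats)
plan = sinkhorn(cost)            # rows and columns both constrained
attn = softmax_attention(cost)   # only rows constrained
```

In this sketch the transport plan must spread mass so that every entity-side element receives its share, whereas row-wise softmax places no constraint on columns and can leave some elements with almost no weight. That is one way to read the abstract's remark about attention over-concentrating on partial correlations.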