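Staff View: VMID: A Multimodal Fusion LLM Framework for Detecting and Identifying Misinformation of Short Videos :: Library Catalog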

Bibliographic Details
Main Authors:	Zhong, Weihao; Xiao, Yinhao; Xu, Minghui; Cheng, Xiuzhen
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2411.10032

_version_	1866929592488427520
author	Zhong, Weihao; Xiao, Yinhao; Xu, Minghui; Cheng, Xiuzhen
author_facet	Zhong, Weihao; Xiao, Yinhao; Xu, Minghui; Cheng, Xiuzhen
contents	Short video platforms have become important channels for news dissemination, offering a highly engaging and immediate way for users to access current events and share information. However, these platforms have also emerged as significant conduits for the rapid spread of misinformation, as fake news and rumors can leverage the visual appeal and wide reach of short videos to circulate extensively among audiences. Existing fake news detection methods mainly rely on single-modal information, such as text or images, or apply only basic fusion techniques, limiting their ability to handle the complex, multi-layered information inherent in short videos. To address these limitations, this paper presents a novel fake news detection method based on multimodal information, designed to identify misinformation through a multi-level analysis of video content. This approach effectively utilizes different modal representations to generate a unified textual description, which is then fed into a large language model for comprehensive evaluation. The proposed framework successfully integrates multimodal features within videos, significantly enhancing the accuracy and reliability of fake news detection. Experimental results demonstrate that the proposed approach outperforms existing models in terms of accuracy, robustness, and utilization of multimodal information, achieving an accuracy of 90.93%, which is significantly higher than the best baseline model (SV-FEND) at 81.05%. Furthermore, case studies provide additional evidence of the effectiveness of the approach in accurately distinguishing between fake news, debunking content, and real incidents, highlighting its reliability and robustness in real-world applications.
format	Preprint
id	arxiv_https___arxiv_org_abs_2411_10032
institution	arXiv
publishDate	2024
record_format	arxiv
title	VMID: A Multimodal Fusion LLM Framework for Detecting and Identifying Misinformation of Short Videos
topic	Computer Vision and Pattern Recognition; Artificial Intelligence
url	https://arxiv.org/abs/2411.10032

Similar Items