Internformat: :: Library Catalog

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Or, Tom, Azencot, Omri
Format:	Preprint
Veröffentlicht:	2025
Schlagworte:	Artificial Intelligence
Online-Zugang:	https://arxiv.org/abs/2508.00784
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

_version_	1866916875987845120
author	Or, Tom Azencot, Omri
author_facet	Or, Tom Azencot, Omri
contents	Generative models achieve remarkable results in multiple data domains, including images and texts, among other examples. Unfortunately, malicious users exploit synthetic media for spreading misinformation and disseminating deepfakes. Consequently, the need for robust and stable fake detectors is pressing, especially when new generative models appear everyday. While the majority of existing work train classifiers that discriminate between real and fake information, such tools typically generalize only within the same family of generators and data modalities, yielding poor results on other generative classes and data domains. Towards a universal classifier, we propose the use of large pre-trained multi-modal models for the detection of generative content. Effectively, we show that the latent code of these models naturally captures information discriminating real from fake. Building on this observation, we demonstrate that linear classifiers trained on these features can achieve state-of-the-art results across various modalities, while remaining computationally efficient, fast to train, and effective even in few-shot settings. Our work primarily focuses on fake detection in audio and images, achieving performance that surpasses or matches that of strong baseline methods.
format	Preprint
id	arxiv_https___arxiv_org_abs_2508_00784
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Unraveling Hidden Representations: A Multi-Modal Layer Analysis for Better Synthetic Content Forensics Or, Tom Azencot, Omri Artificial Intelligence Generative models achieve remarkable results in multiple data domains, including images and texts, among other examples. Unfortunately, malicious users exploit synthetic media for spreading misinformation and disseminating deepfakes. Consequently, the need for robust and stable fake detectors is pressing, especially when new generative models appear everyday. While the majority of existing work train classifiers that discriminate between real and fake information, such tools typically generalize only within the same family of generators and data modalities, yielding poor results on other generative classes and data domains. Towards a universal classifier, we propose the use of large pre-trained multi-modal models for the detection of generative content. Effectively, we show that the latent code of these models naturally captures information discriminating real from fake. Building on this observation, we demonstrate that linear classifiers trained on these features can achieve state-of-the-art results across various modalities, while remaining computationally efficient, fast to train, and effective even in few-shot settings. Our work primarily focuses on fake detection in audio and images, achieving performance that surpasses or matches that of strong baseline methods.
title	Unraveling Hidden Representations: A Multi-Modal Layer Analysis for Better Synthetic Content Forensics
topic	Artificial Intelligence
url	https://arxiv.org/abs/2508.00784

Ähnliche Einträge