Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Bazyleva, Valentina, Bonettini, Nicolo, Bharaj, Gaurav
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2505.11753
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866908368732422144
author	Bazyleva, Valentina Bonettini, Nicolo Bharaj, Gaurav
author_facet	Bazyleva, Valentina Bonettini, Nicolo Bharaj, Gaurav
contents	Text-guided diffusion models have significantly advanced image editing, enabling highly realistic and local modifications based on textual prompts. While these developments expand creative possibilities, their malicious use poses substantial challenges for detection of such subtle deepfake edits. To this end, we introduce Explain Edit (X-Edit), a novel method for localizing diffusion-based edits in images. To localize the edits for an image, we invert the image using a pretrained diffusion model, then use these inverted features as input to a segmentation network that explicitly predicts the edited masked regions via channel and spatial attention. Further, we finetune the model using a combined segmentation and relevance loss. The segmentation loss ensures accurate mask prediction by balancing pixel-wise errors and perceptual similarity, while the relevance loss guides the model to focus on low-frequency regions and mitigate high-frequency artifacts, enhancing the localization of subtle edits. To the best of our knowledge, we are the first to address and model the problem of localizing diffusion-based modified regions in images. We additionally contribute a new dataset of paired original and edited images addressing the current lack of resources for this task. Experimental results demonstrate that X-Edit accurately localizes edits in images altered by text-guided diffusion models, outperforming baselines in PSNR and SSIM metrics. This highlights X-Edit's potential as a robust forensic tool for detecting and pinpointing manipulations introduced by advanced image editing techniques.
format	Preprint
id	arxiv_https___arxiv_org_abs_2505_11753
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	X-Edit: Detecting and Localizing Edits in Images Altered by Text-Guided Diffusion Models Bazyleva, Valentina Bonettini, Nicolo Bharaj, Gaurav Computer Vision and Pattern Recognition Text-guided diffusion models have significantly advanced image editing, enabling highly realistic and local modifications based on textual prompts. While these developments expand creative possibilities, their malicious use poses substantial challenges for detection of such subtle deepfake edits. To this end, we introduce Explain Edit (X-Edit), a novel method for localizing diffusion-based edits in images. To localize the edits for an image, we invert the image using a pretrained diffusion model, then use these inverted features as input to a segmentation network that explicitly predicts the edited masked regions via channel and spatial attention. Further, we finetune the model using a combined segmentation and relevance loss. The segmentation loss ensures accurate mask prediction by balancing pixel-wise errors and perceptual similarity, while the relevance loss guides the model to focus on low-frequency regions and mitigate high-frequency artifacts, enhancing the localization of subtle edits. To the best of our knowledge, we are the first to address and model the problem of localizing diffusion-based modified regions in images. We additionally contribute a new dataset of paired original and edited images addressing the current lack of resources for this task. Experimental results demonstrate that X-Edit accurately localizes edits in images altered by text-guided diffusion models, outperforming baselines in PSNR and SSIM metrics. This highlights X-Edit's potential as a robust forensic tool for detecting and pinpointing manipulations introduced by advanced image editing techniques.
title	X-Edit: Detecting and Localizing Edits in Images Altered by Text-Guided Diffusion Models
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2505.11753

Similar Items