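Staff View: FIA-Edit: Frequency-Interactive Attention for Efficient and High-Fidelity Inversion-Free Text-Guided Image Editing :: Library Catalog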

Bibliographic Details
Main Authors:	Yang, Kaixiang; Shen, Boyang; Li, Xin; Dai, Yuchen; Luo, Yuxuan; Ma, Yueran; Fang, Wei; Li, Qiang; Wang, Zhiwei
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2511.12151

_version_	1866911267799695360
author	Yang, Kaixiang; Shen, Boyang; Li, Xin; Dai, Yuchen; Luo, Yuxuan; Ma, Yueran; Fang, Wei; Li, Qiang; Wang, Zhiwei
author_facet	Yang, Kaixiang; Shen, Boyang; Li, Xin; Dai, Yuchen; Luo, Yuxuan; Ma, Yueran; Fang, Wei; Li, Qiang; Wang, Zhiwei
contents	Text-guided image editing has advanced rapidly with the rise of diffusion models. While flow-based inversion-free methods offer high efficiency by avoiding latent inversion, they often fail to integrate source information effectively, leading to poor background preservation, spatial inconsistencies, and over-editing. In this paper, we present FIA-Edit, a novel inversion-free framework that achieves high-fidelity and semantically precise edits through Frequency-Interactive Attention. Specifically, we design two key components: (1) a Frequency Representation Interaction (FRI) module that enhances cross-domain alignment by exchanging frequency components between source and target features within self-attention, and (2) a Feature Injection (FIJ) module that explicitly incorporates source-side queries, keys, values, and text embeddings into the target branch's cross-attention to preserve structure and semantics. Extensive experiments demonstrate that FIA-Edit supports high-fidelity editing at low computational cost (~6 s per 512 × 512 image on an RTX 4090) and consistently outperforms existing methods across diverse tasks in visual quality, background fidelity, and controllability. Furthermore, we are the first to extend text-guided image editing to clinical applications. By synthesizing anatomically coherent hemorrhage variations in surgical images, FIA-Edit opens new opportunities for medical data augmentation and delivers significant gains in downstream bleeding classification. Our project is available at: https://github.com/kk42yy/FIA-Edit.
format	Preprint
id	arxiv_https___arxiv_org_abs_2511_12151
institution	arXiv
publishDate	2025
record_format	arxiv
title	FIA-Edit: Frequency-Interactive Attention for Efficient and High-Fidelity Inversion-Free Text-Guided Image Editing
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2511.12151

Similar Items