:: Library Catalog

Bibliographic Details
Main Authors:	Hu, Yihan; Peng, Jianing; Lin, Yiheng; Liu, Ting; Qu, Xiaochao; Liu, Luoqi; Zhao, Yao; Wei, Yunchao
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2503.16795

Similar Items

Memory Efficient Matting with Adaptive Token Routing
by: Lin, Yiheng, et al.
Published: (2024)

AlignGen: Boosting Personalized Image Generation with Cross-Modality Prior Alignment
by: Lin, Yiheng, et al.
Published: (2025)

On Exact Editing of Flow-Based Diffusion Models
by: Li, Zixiang, et al.
Published: (2025)

GlyphMastero: A Glyph Encoder for High-Fidelity Scene Text Editing
by: Wang, Tong, et al.
Published: (2025)

MiVE: Multiscale Vision-language features for reference-guided video Editing
by: Wang, Tong, et al.
Published: (2026)

Diffusion for Natural Image Matting
by: Hu, Yihan, et al.
Published: (2023)

Self-Prompting Diffusion Transformer for Open-Vocabulary Scene Text Editing via In-Context Learning
by: Li, Hongxi, et al.
Published: (2026)

FlowSeg: Dynamic Semantic Guidance for LLM-Conditioned Segmentation
by: Zhang, Zekang, et al.
Published: (2026)

Rethinking Video Segmentation with Masked Video Consistency: Did the Model Learn as Intended?
by: Liang, Chen, et al.
Published: (2024)

Draw Like an Artist: Complex Scene Generation with Diffusion Model via Composition, Painting, and Retouching
by: Liu, Minghao, et al.
Published: (2024)

MTADiffusion: Mask Text Alignment Diffusion Model for Object Inpainting
by: Huang, Jun, et al.
Published: (2025)

SAM-REF: Introducing Image-Prompt Synergy during Interaction for Detail Enhancement in the Segment Anything Model
by: Yu, Chongkai, et al.
Published: (2024)

EVPGS: Enhanced View Prior Guidance for Splatting-based Extrapolated View Synthesis
by: Li, Jiahe, et al.
Published: (2025)

TextMastero: Mastering High-Quality Scene Text Editing in Diverse Languages and Styles
by: Wang, Tong, et al.
Published: (2024)

CutClaw: Agentic Hours-Long Video Editing via Music Synchronization
by: Zhao, Shifang, et al.
Published: (2026)

Learning Stochastic Bridges for Video Object Removal via Video-to-Video Translation
by: Lou, Zijie, et al.
Published: (2026)

Learning Trimaps via Clicks for Image Matting
by: Zhang, Chenyi, et al.
Published: (2024)

OmniAD: Detect and Understand Industrial Anomaly via Multimodal Reasoning
by: Zhao, Shifang, et al.
Published: (2025)

DCI: Dual-Conditional Inversion for Boosting Diffusion-Based Image Editing
by: Li, Zixiang, et al.
Published: (2025)

PortraitCraft: A Benchmark for Portrait Composition Understanding and Generation
by: Sha, Yuyang, et al.
Published: (2026)

IPSeg: Image Posterior Mitigates Semantic Drift in Class-Incremental Segmentation
by: Yu, Xiao, et al.
Published: (2025)

S$^2$Edit: Text-Guided Image Editing with Precise Semantic and Spatial Control
by: Liu, Xudong, et al.
Published: (2025)

CharaConsist: Fine-Grained Consistent Character Generation
by: Wang, Mengyu, et al.
Published: (2025)

Semantic Segmentation on VSPW Dataset through Masked Video Consistency
by: Liang, Chen, et al.
Published: (2024)

2nd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation
by: Xu, Zhensong, et al.
Published: (2024)

Research on the application of graph data structure and graph neural network in node classification/clustering tasks
by: Wang, Yihan, et al.
Published: (2025)

Can a Second-View Image Be a Language? Geometric and Semantic Cross-Modal Reasoning for X-ray Prohibited Item Detection
by: Peng, Chuang, et al.
Published: (2025)

Region-Adaptive Transform with Segmentation Prior for Image Compression
by: Liu, Yuxi, et al.
Published: (2024)

View-Consistent 3D Scene Editing via Dual-Path Structural Correspondence and Semantic Continuity
by: Li, Pufan, et al.
Published: (2026)

Dual-view X-ray Detection: Can AI Detect Prohibited Items from Dual-view X-ray Images like Humans?
by: Tao, Renshuai, et al.
Published: (2024)

TextSculptor: Training and Benchmarking Scene Text Editing
by: Lin, Yiheng, et al.
Published: (2026)

ROSE: Revolutionizing Open-Set Dense Segmentation with Patch-Wise Perceptual Large Multimodal Model
by: Han, Kunyang, et al.
Published: (2024)

Precision Control of Cell Type‐Specific Behavior via RNA Sensing and Editing
by: Xiao, Lulu, et al.
Published: (2024)

Frozen CLIP: A Strong Backbone for Weakly Supervised Semantic Segmentation
by: Zhang, Bingfeng, et al.
Published: (2024)

VTEdit-Bench: A Comprehensive Benchmark for Multi-Reference Image Editing Models in Virtual Try-On
by: Liang, Xiaoye, et al.
Published: (2026)

Semantic Visual Anomaly Detection and Reasoning in AI-Generated Images
by: Tan, Chuangchuang, et al.
Published: (2025)

A Fully Self-Synchronized Control for Hybrid Series-Parallel Electronized Power Networks
by: Wei, Zexiong, et al.
Published: (2025)

LORE: Latent Optimization for Precise Semantic Control in Rectified Flow-based Image Editing
by: Ouyang, Liangyang, et al.
Published: (2025)

Distributed Conditional Feature Screening via Pearson Partial Correlation with FDR Control
by: Pang, Naiwen, et al.
Published: (2024)

Image Sculpting: Precise Object Editing with 3D Geometry Control
by: Yenphraphai, Jiraphon, et al.
Published: (2024)