:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Ziqian; Alaniz, Stephan
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2604.05180

Similar Items

MIRAGE: A Multi-modal Benchmark for Spatial Perception, Reasoning, and Intelligence
by: Liu, Chonghan, et al.
Published: (2025)

A Large Scale Analysis of Gender Biases in Text-to-Image Generative Models
by: Girrbach, Leander, et al.
Published: (2025)

GRADE: Benchmarking Discipline-Informed Reasoning in Image Editing
by: Liu, Mingxin, et al.
Published: (2026)

PRIMEdit: Probability Redistribution for Instance-aware Multi-object Video Editing with Benchmark Dataset
by: Teodoro, Samuel, et al.
Published: (2024)

MIRAGE: Retrieval and Generation of Multimodal Images and Texts for Medical Education
by: Benito, Miguel Diaz, et al.
Published: (2026)

MIRAGE: Runtime Scheduling for Multi-Vector Image Retrieval with Hierarchical Decomposition
by: Li, Maoliang, et al.
Published: (2025)

Training-Free Multi-Concept Image Editing
by: Foteinopoulou, Niki, et al.
Published: (2026)

Large-scale EM Benchmark for Multi-Organelle Instance Segmentation in the Wild
by: Lu, Yanrui, et al.
Published: (2026)

CAMEO: A Conditional and Quality-Aware Multi-Agent Image Editing Orchestrator
by: Pu, Yuhan, et al.
Published: (2026)

SUB: Benchmarking CBM Generalization via Synthetic Attribute Substitutions
by: Bader, Jessica, et al.
Published: (2025)

GEditBench v2: A Human-Aligned Benchmark for General Image Editing
by: Jiang, Zhangqi, et al.
Published: (2026)

MIRAGE: Towards AI-Generated Image Detection in the Wild
by: Xia, Cheng, et al.
Published: (2025)

Shifting the Breaking Point of Flow Matching for Multi-Instance Editing
by: Zaccagnino, Carmine, et al.
Published: (2026)

SeedEdit: Align Image Re-Generation to Image Editing
by: Shi, Yichun, et al.
Published: (2024)

Human-Aligned MLLM Judges for Fine-Grained Image Editing Evaluation: A Benchmark, Framework, and Analysis
by: Liu, Runzhou, et al.
Published: (2026)

LoFT: LoRA-fused Training Dataset Generation with Few-shot Guidance
by: Kim, Jae Myung, et al.
Published: (2025)

FLAIR: VLM with Fine-grained Language-informed Image Representations
by: Xiao, Rui, et al.
Published: (2024)

TabletopGen: Instance-Level Interactive 3D Tabletop Scene Generation from Text or Single Image
by: Wang, Ziqian, et al.
Published: (2025)

Explaining CLIP Zero-shot Predictions Through Concepts
by: Ozdemir, Onat, et al.
Published: (2026)

Training-free Uncertainty Guidance for Complex Visual Tasks with MLLMs
by: Kim, Sanghwan, et al.
Published: (2025)

VTEdit-Bench: A Comprehensive Benchmark for Multi-Reference Image Editing Models in Virtual Try-On
by: Liang, Xiaoye, et al.
Published: (2026)

Instance-Aligned Captions for Explainable Video Anomaly Detection
by: Song, Inpyo, et al.
Published: (2026)

2D Instance Editing in 3D Space
by: Xie, Yuhuan, et al.
Published: (2025)

VE-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment
by: Sun, Shangkun, et al.
Published: (2024)

Re-Align: Structured Reasoning-guided Alignment for In-Context Image Generation and Editing
by: He, Runze, et al.
Published: (2026)

EditRefiner: A Human-Aligned Agentic Framework for Image Editing Refinement
by: Xu, Zitong, et al.
Published: (2026)

MIRAGE: Assessing Hallucination in Multimodal Reasoning Chains of MLLM
by: Dong, Bowen, et al.
Published: (2025)

MIRAGE: Multimodal foundation model and benchmark for comprehensive retinal OCT image analysis
by: Morano, José, et al.
Published: (2025)

Cluster-Level Sparse Multi-Instance Learning for Whole-Slide Images
by: Zhang, Yuedi, et al.
Published: (2025)

Para-Lane: Multi-Lane Dataset Registering Parallel Scans for Benchmarking Novel View Synthesis
by: Ni, Ziqian, et al.
Published: (2025)

ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing
by: Pan, Yulin, et al.
Published: (2025)

Reasoning to Align: Implicit Reasoning in Diffusion Transformers for Video Editing
by: Li, Yan, et al.
Published: (2026)

MIRAGE: A Benchmark for Multimodal Information-Seeking and Reasoning in Agricultural Expert-Guided Conversations
by: Dongre, Vardhan, et al.
Published: (2025)

M$^{3}$T2IBench: A Large-Scale Multi-Category, Multi-Instance, Multi-Relation Text-to-Image Benchmark
by: Zhang, Huixuan, et al.
Published: (2025)

MDE-Edit: Masked Dual-Editing for Multi-Object Image Editing via Diffusion Models
by: Zhu, Hongyang, et al.
Published: (2025)

Multi-turn Consistent Image Editing
by: Zhou, Zijun, et al.
Published: (2025)

MIRAGE: Model-agnostic Industrial Realistic Anomaly Generation and Evaluation for Visual Anomaly Detection
by: Hu, Jinwei, et al.
Published: (2026)

SISP: A Benchmark Dataset for Fine-grained Ship Instance Segmentation in Panchromatic Satellite Images
by: Feng, Pengming, et al.
Published: (2024)

MIRAGE: Multimodal Identification and Recognition of Annotations in Indian General Prescriptions
by: Mankash, Tavish, et al.
Published: (2024)

FINER: MLLMs Hallucinate under Fine-grained Negative Queries
by: Xiao, Rui, et al.
Published: (2026)