Library Catalog

Bibliographic Details
Main Authors:	Xia, Guoxuan; Hanspal, Harleen; Tudosiu, Petru-Daniel; Zhang, Shifeng; Parisot, Sarah
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2507.15724

Similar Items

Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities
by: Dutt, Raman, et al.
Published: (2025)

Generating Compositional Scenes via Text-to-image RGBA Instance Generation
by: Fontanella, Alessandro, et al.
Published: (2024)

MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation
by: Tudosiu, Petru-Daniel, et al.
Published: (2024)

SceneForge: Structured World Supervision from 3D Interventions
by: Li, Jizhizi, et al.
Published: (2026)

An Extended Evaluation Split for DeepSpaceYoloDataset
by: Parisot, Olivier
Published: (2026)

Dynamic Mixture-of-Experts for Visual Autoregressive Model
by: Vincenti, Jort, et al.
Published: (2025)

Improving Object Detection via Local-global Contrastive Learning
by: Triantafyllidou, Danai, et al.
Published: (2024)

Detecting streaks in smart telescopes images with Deep Learning
by: Parisot, Olivier, et al.
Published: (2025)

Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation
by: Franchi, Gianni, et al.
Published: (2024)

Robustness analysis of Deep Sky Objects detection models on HPC
by: Parisot, Olivier, et al.
Published: (2025)

HemBLIP: A Vision-Language Model for Interpretable Leukemia Cell Morphology Analysis
by: van Logtestijn, Julie, et al.
Published: (2026)

LiT: Delving into a Simple Linear Diffusion Transformer for Image Generation
by: Wang, Jiahao, et al.
Published: (2025)

Enhancing Shape Perception and Segmentation Consistency for Industrial Image Inspection
by: Mao, Guoxuan, et al.
Published: (2025)

Feature Augmentation for Self-supervised Contrastive Learning: A Closer Look
by: Zhang, Yong, et al.
Published: (2024)

Test-time Controllable Image Generation by Explicit Spatial Constraint Enforcement
by: Zhang, Z., et al.
Published: (2025)

Learning Shape-Independent Transformation via Spherical Representations for Category-Level Object Pose Estimation
by: Ren, Huan, et al.
Published: (2025)

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
by: Image Team, et al.
Published: (2025)

RichControl: Structure- and Appearance-Rich Training-Free Spatial Control for Text-to-Image Generation
by: Pang, Lexi, et al.
Published: (2025)

Spatial-Aware Latent Initialization for Controllable Image Generation
by: Sun, Wenqiang, et al.
Published: (2024)

LesionTABE: Equitable AI for Skin Lesion Detection
by: Diaz, Rocio Mexia, et al.
Published: (2026)

PG-ControlNet: A Physics-Guided ControlNet for Generative Spatially Varying Image Deblurring
by: Motorcu, Hakki, et al.
Published: (2025)

Seal2Real: Prompt Prior Learning on Diffusion Model for Unsupervised Document Seal Data Generation and Realisation
by: Yan, Mingfu, et al.
Published: (2023)

Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It
by: Xia, Guoxuan, et al.
Published: (2024)

Practical No-box Adversarial Attacks with Training-free Hybrid Image Transformation
by: Zhang, Qilong, et al.
Published: (2022)

SpatialLock: Precise Spatial Control in Text-to-Image Synthesis
by: Liu, Biao, et al.
Published: (2025)

Component Adaptive Clustering for Generalized Category Discovery
by: Yan, Mingfu, et al.
Published: (2025)

FSATFusion: Frequency-Spatial Attention Transformer for Infrared and Visible Image Fusion
by: Zhang, Tianpei, et al.
Published: (2025)

Intelligent Image Search Algorithms Fusing Visual Large Models
by: Wang, Kehan, et al.
Published: (2025)

StructDiff: A Structure-Preserving and Spatially Controllable Diffusion Model for Single-Image Generation
by: He, Yinxi, et al.
Published: (2026)

Layout-Guided Controllable Pathology Image Generation with In-Context Diffusion Transformers
by: Shou, Yuntao, et al.
Published: (2026)

Step1X-Edit: A Practical Framework for General Image Editing
by: Liu, Shiyu, et al.
Published: (2025)

GenSpace: Benchmarking Spatially-Aware Image Generation
by: Wang, Zehan, et al.
Published: (2025)

GPSToken: Gaussian Parameterized Spatially-adaptive Tokenization for Image Representation and Generation
by: Zhang, Zhengqiang, et al.
Published: (2025)

ControlLight: Towards Controllable, Consistent, and Generalizable Low-Light Enhancement
by: Yang, Yufeng, et al.
Published: (2026)

TIPS: Text-Image Pretraining with Spatial awareness
by: Maninis, Kevis-Kokitsi, et al.
Published: (2024)

Taming Transformer for Emotion-Controllable Talking Face Generation
by: Zhang, Ziqi, et al.
Published: (2025)

Revisiting Audio-Visual Segmentation with Vision-Centric Transformer
by: Huang, Shaofei, et al.
Published: (2025)

Recursive Generalization Transformer for Image Super-Resolution
by: Chen, Zheng, et al.
Published: (2023)

MagicFight: Personalized Martial Arts Combat Video Generation
by: Huang, Jiancheng, et al.
Published: (2026)

GPLQ: A General, Practical, and Lightning QAT Method for Vision Transformers
by: Liang, Guang, et al.
Published: (2025)