:: Library Catalog

Bibliographic Details
Main Authors:	Skaik, Rami; Rossi, Leonardo; Fontanini, Tomaso; Prati, Andrea
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2410.00483

Similar Items

Semantic Image Synthesis via Class-Adaptive Cross-Attention
by: Fontanini, Tomaso, et al.
Published: (2023)

MARS: Paying more attention to visual attributes for text-based person search
by: Ergasti, Alex, et al.
Published: (2024)

Swin2-MoSE: A New Single Image Super-Resolution Model for Remote Sensing
by: Rossi, Leonardo, et al.
Published: (2024)

Memory-augmented Online Video Anomaly Detection
by: Rossi, Leonardo, et al.
Published: (2023)

WaveMAE: Wavelet decomposition Masked Auto-Encoder for Remote Sensing
by: Bernuzzi, Vittorio, et al.
Published: (2025)

Mamba-ST: State Space Model for Efficient Style Transfer
by: Botti, Filippo, et al.
Published: (2024)

Adversarial Identity Injection for Semantic Face Image Synthesis
by: Tarollo, Giuseppe, et al.
Published: (2024)

Controllable Face Synthesis with Semantic Latent Diffusion Models
by: Ergasti, Alex, et al.
Published: (2024)

SISMA: Semantic Face Image Synthesis with Mamba
by: Botti, Filippo, et al.
Published: (2025)

U-Shape Mamba: State Space Model for faster diffusion
by: Ergasti, Alex, et al.
Published: (2025)

Layout-Conditioned Autoregressive Text-to-Image Generation via Structured Masking
by: Zheng, Zirui, et al.
Published: (2025)

$^R$FLAV: Rolling Flow matching for infinite Audio Video generation
by: Ergasti, Alex, et al.
Published: (2025)

Progressive Image Restoration via Text-Conditioned Video Generation
by: Kang, Peng, et al.
Published: (2025)

Heterogeneous Generative Knowledge Distillation with Masked Image Modeling
by: Wang, Ziming, et al.
Published: (2023)

AnyMo: Scaling Any-Modality Conditional Motion Generation with Masked Modeling
by: Li, Yiheng, et al.
Published: (2026)

Self-Balanced R-CNN for Instance Segmentation
by: Rossi, Leonardo, et al.
Published: (2024)

Image Clustering Conditioned on Text Criteria
by: Kwon, Sehyun, et al.
Published: (2023)

VidVec: Unlocking Video MLLM Embeddings for Video-Text Retrieval
by: Tzachor, Issar, et al.
Published: (2026)

UNCAGE: Contrastive Attention Guidance for Masked Generative Transformers in Text-to-Image Generation
by: Kang, Wonjun, et al.
Published: (2025)

LOTS of Fashion! Multi-Conditioning for Image Generation via Sketch-Text Pairing
by: Girella, Federico, et al.
Published: (2025)

Enhancing Text-to-Image Diffusion Transformer via Split-Text Conditioning
by: Zhang, Yu, et al.
Published: (2025)

MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion
by: Wu, Lehong, et al.
Published: (2024)

Generating Satellite Imagery Data for Wildfire Detection through Mask-Conditioned Generative AI
by: Martin, Valeria, et al.
Published: (2026)

MedSegFactory: Text-Guided Generation of Medical Image-Mask Pairs
by: Mao, Jiawei, et al.
Published: (2025)

Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models
by: Kim, Hyungjin, et al.
Published: (2025)

Personalized Reward Modeling for Text-to-Image Generation
by: Lee, Jeongeun, et al.
Published: (2025)

HU-based Foreground Masking for 3D Medical Masked Image Modeling
by: Lee, Jin, et al.
Published: (2025)

Patch-enhanced Mask Encoder Prompt Image Generation
by: Xu, Shusong, et al.
Published: (2024)

Conditional Diffusion Model for Longitudinal Medical Image Generation
by: Dao, Duy-Phuong, et al.
Published: (2024)

CGI: Identifying Conditional Generative Models with Example Images
by: Zhou, Zhi, et al.
Published: (2025)

TypeScore: A Text Fidelity Metric for Text-to-Image Generative Models
by: Sampaio, Georgia Gabriela, et al.
Published: (2024)

HARIVO: Harnessing Text-to-Image Models for Video Generation
by: Kwon, Mingi, et al.
Published: (2024)

On the Fairness, Diversity and Reliability of Text-to-Image Generative Models
by: Vice, Jordan, et al.
Published: (2024)

Interactive Visual Assessment for Text-to-Image Generation Models
by: Mi, Xiaoyue, et al.
Published: (2024)

Synergistic Dual Spatial-aware Generation of Image-to-Text and Text-to-Image
by: Zhao, Yu, et al.
Published: (2024)

Promptable Game Models: Text-Guided Game Simulation via Masked Diffusion Models
by: Menapace, Willi, et al.
Published: (2023)

AEMIM: Adversarial Examples Meet Masked Image Modeling
by: Xiang, Wenzhao, et al.
Published: (2024)

Model-based Cleaning of the QUILT-1M Pathology Dataset for Text-Conditional Image Synthesis
by: Aubreville, Marc, et al.
Published: (2024)

Regeneration Based Training-free Attribution of Fake Images Generated by Text-to-Image Generative Models
by: Li, Meiling, et al.
Published: (2024)

Evaluating Text-to-Image Generative Models: An Empirical Study on Human Image Synthesis
by: Chen, Muxi, et al.
Published: (2024)