:: Library Catalog

Bibliographic Details
Main Authors:	Xu, Shusong; Liu, Peiye
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2405.19085

Similar Items

Patch-wise Auto-Encoder for Visual Anomaly Detection
by: Cui, Yajie, et al.
Published: (2023)

Cluster and Predict Latent Patches for Improved Masked Image Modeling
by: Darcet, Timothée, et al.
Published: (2025)

HiFA: High-fidelity Text-to-3D Generation with Advanced Diffusion Guidance
by: Zhu, Junzhe, et al.
Published: (2023)

General Purpose Image Encoder DINOv2 for Medical Image Registration
by: Song, Xinrui, et al.
Published: (2024)

Concept-Based Masking: A Patch-Agnostic Defense Against Adversarial Patch Attacks
by: Mehrotra, Ayushi, et al.
Published: (2025)

Symmetric masking strategy enhances the performance of Masked Image Modeling
by: Nguyen, Khanh-Binh, et al.
Published: (2024)

CertMask: Certifiable Defense Against Adversarial Patches via Theoretically Optimal Mask Coverage
by: Lyu, Xuntao, et al.
Published: (2025)

PromptDresser: Improving the Quality and Controllability of Virtual Try-On via Generative Textual Prompt and Prompt-aware Mask
by: Kim, Jeongho, et al.
Published: (2024)

Diffusion Model Patching via Mixture-of-Prompts
by: Ham, Seokil, et al.
Published: (2024)

GrabDAE: An Innovative Framework for Unsupervised Domain Adaptation Utilizing Grab-Mask and Denoise Auto-Encoder
by: Chen, Junzhou, et al.
Published: (2024)

Pixel-Aligned Multi-View Generation with Depth Guided Decoder
by: Tang, Zhenggang, et al.
Published: (2024)

Refer to Any Segmentation Mask Group With Vision-Language Prompts
by: Cao, Shengcao, et al.
Published: (2025)

One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation
by: Gao, Yuan, et al.
Published: (2025)

MIMIR: Masked Image Modeling for Mutual Information-based Adversarial Robustness
by: Xu, Xiaoyun, et al.
Published: (2023)

MCGM: Mask Conditional Text-to-Image Generative Model
by: Skaik, Rami, et al.
Published: (2024)

Heterogeneous Generative Knowledge Distillation with Masked Image Modeling
by: Wang, Ziming, et al.
Published: (2023)

Layout-Conditioned Autoregressive Text-to-Image Generation via Structured Masking
by: Zheng, Zirui, et al.
Published: (2025)

Car Damage Detection and Patch-to-Patch Self-supervised Image Alignment
by: Chen, Hanxiao
Published: (2024)

SelfSwapper: Self-Supervised Face Swapping via Shape Agnostic Masked AutoEncoder
by: Lee, Jaeseong, et al.
Published: (2024)

IPDN: Image-enhanced Prompt Decoding Network for 3D Referring Expression Segmentation
by: Chen, Qi, et al.
Published: (2025)

Vanishing Depth: A Depth Adapter with Positional Depth Encoding for Generalized Image Encoders
by: Koch, Paul, et al.
Published: (2025)

Dynamic Prompt Optimizing for Text-to-Image Generation
by: Mo, Wenyi, et al.
Published: (2024)

Near, far: Patch-ordering enhances vision foundation models' scene understanding
by: Pariza, Valentinos, et al.
Published: (2024)

Dual form Complementary Masking for Domain-Adaptive Image Segmentation
by: Wang, Jiawen, et al.
Published: (2025)

HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation
by: Kumbong, Hermann, et al.
Published: (2025)

AEMIM: Adversarial Examples Meet Masked Image Modeling
by: Xiang, Wenzhao, et al.
Published: (2024)

Detecting AI-Generated Images via Contextual Anomaly Estimation in Masked AutoEncoders
by: Jang, Minsuk, et al.
Published: (2025)

Patch Progression Masked Autoencoder with Fusion CNN Network for Classifying Evolution Between Two Pairs of 2D OCT Slices
by: Zhang, Philippe, et al.
Published: (2025)

Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation
by: Yariv, Guy, et al.
Published: (2025)

MarDini: Masked Autoregressive Diffusion for Video Generation at Scale
by: Liu, Haozhe, et al.
Published: (2024)

HU-based Foreground Masking for 3D Medical Masked Image Modeling
by: Lee, Jin, et al.
Published: (2025)

Learning Image Priors through Patch-based Diffusion Models for Solving Inverse Problems
by: Hu, Jason, et al.
Published: (2024)

Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection
by: Ye, Wei, et al.
Published: (2024)

TIER: Text-Image Encoder-based Regression for AIGC Image Quality Assessment
by: Yuan, Jiquan, et al.
Published: (2024)

PhyPrompt: RL-based Prompt Refinement for Physically Plausible Text-to-Video Generation
by: Wu, Shang, et al.
Published: (2026)

MaskAnyNet: Rethinking Masked Image Regions as Valuable Information in Supervised Learning
by: Hong, Jingshan, et al.
Published: (2025)

TRIPS: Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection
by: Jiang, Chaoya, et al.
Published: (2023)

DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer
by: Wu, Yecheng, et al.
Published: (2025)

Reverse Prompt: Cracking the Recipe Inside Text-to-Image Generation
by: Ren, Zhiyao, et al.
Published: (2025)

Long-Text-to-Image Generation via Compositional Prompt Decomposition
by: Huang, Jen-Yuan, et al.
Published: (2026)