| Main Authors: | Xu, Shusong, Liu, Peiye |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2405.19085 |
Similar Items
Patch-wise Auto-Encoder for Visual Anomaly Detection
by: Cui, Yajie, et al.
Published: (2023)
Cluster and Predict Latent Patches for Improved Masked Image Modeling
by: Darcet, Timothée, et al.
Published: (2025)
HiFA: High-fidelity Text-to-3D Generation with Advanced Diffusion Guidance
by: Zhu, Junzhe, et al.
Published: (2023)
General Purpose Image Encoder DINOv2 for Medical Image Registration
by: Song, Xinrui, et al.
Published: (2024)
Concept-Based Masking: A Patch-Agnostic Defense Against Adversarial Patch Attacks
by: Mehrotra, Ayushi, et al.
Published: (2025)
Symmetric masking strategy enhances the performance of Masked Image Modeling
by: Nguyen, Khanh-Binh, et al.
Published: (2024)
CertMask: Certifiable Defense Against Adversarial Patches via Theoretically Optimal Mask Coverage
by: Lyu, Xuntao, et al.
Published: (2025)
PromptDresser: Improving the Quality and Controllability of Virtual Try-On via Generative Textual Prompt and Prompt-aware Mask
by: Kim, Jeongho, et al.
Published: (2024)
Diffusion Model Patching via Mixture-of-Prompts
by: Ham, Seokil, et al.
Published: (2024)
GrabDAE: An Innovative Framework for Unsupervised Domain Adaptation Utilizing Grab-Mask and Denoise Auto-Encoder
by: Chen, Junzhou, et al.
Published: (2024)
Pixel-Aligned Multi-View Generation with Depth Guided Decoder
by: Tang, Zhenggang, et al.
Published: (2024)
Refer to Any Segmentation Mask Group With Vision-Language Prompts
by: Cao, Shengcao, et al.
Published: (2025)
One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation
by: Gao, Yuan, et al.
Published: (2025)
MIMIR: Masked Image Modeling for Mutual Information-based Adversarial Robustness
by: Xu, Xiaoyun, et al.
Published: (2023)
MCGM: Mask Conditional Text-to-Image Generative Model
by: Skaik, Rami, et al.
Published: (2024)
Heterogeneous Generative Knowledge Distillation with Masked Image Modeling
by: Wang, Ziming, et al.
Published: (2023)
Layout-Conditioned Autoregressive Text-to-Image Generation via Structured Masking
by: Zheng, Zirui, et al.
Published: (2025)
Car Damage Detection and Patch-to-Patch Self-supervised Image Alignment
by: Chen, Hanxiao
Published: (2024)
SelfSwapper: Self-Supervised Face Swapping via Shape Agnostic Masked AutoEncoder
by: Lee, Jaeseong, et al.
Published: (2024)
IPDN: Image-enhanced Prompt Decoding Network for 3D Referring Expression Segmentation
by: Chen, Qi, et al.
Published: (2025)
Vanishing Depth: A Depth Adapter with Positional Depth Encoding for Generalized Image Encoders
by: Koch, Paul, et al.
Published: (2025)
Dynamic Prompt Optimizing for Text-to-Image Generation
by: Mo, Wenyi, et al.
Published: (2024)
Near, far: Patch-ordering enhances vision foundation models' scene understanding
by: Pariza, Valentinos, et al.
Published: (2024)
Dual form Complementary Masking for Domain-Adaptive Image Segmentation
by: Wang, Jiawen, et al.
Published: (2025)
HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation
by: Kumbong, Hermann, et al.
Published: (2025)
AEMIM: Adversarial Examples Meet Masked Image Modeling
by: Xiang, Wenzhao, et al.
Published: (2024)
Detecting AI-Generated Images via Contextual Anomaly Estimation in Masked AutoEncoders
by: Jang, Minsuk, et al.
Published: (2025)
Patch Progression Masked Autoencoder with Fusion CNN Network for Classifying Evolution Between Two Pairs of 2D OCT Slices
by: Zhang, Philippe, et al.
Published: (2025)
Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation
by: Yariv, Guy, et al.
Published: (2025)
MarDini: Masked Autoregressive Diffusion for Video Generation at Scale
by: Liu, Haozhe, et al.
Published: (2024)
HU-based Foreground Masking for 3D Medical Masked Image Modeling
by: Lee, Jin, et al.
Published: (2025)
Learning Image Priors through Patch-based Diffusion Models for Solving Inverse Problems
by: Hu, Jason, et al.
Published: (2024)
Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection
by: Ye, Wei, et al.
Published: (2024)
TIER: Text-Image Encoder-based Regression for AIGC Image Quality Assessment
by: Yuan, Jiquan, et al.
Published: (2024)
PhyPrompt: RL-based Prompt Refinement for Physically Plausible Text-to-Video Generation
by: Wu, Shang, et al.
Published: (2026)
MaskAnyNet: Rethinking Masked Image Regions as Valuable Information in Supervised Learning
by: Hong, Jingshan, et al.
Published: (2025)
TRIPS: Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection
by: Jiang, Chaoya, et al.
Published: (2023)
DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer
by: Wu, Yecheng, et al.
Published: (2025)
Reverse Prompt: Cracking the Recipe Inside Text-to-Image Generation
by: Ren, Zhiyao, et al.
Published: (2025)
Long-Text-to-Image Generation via Compositional Prompt Decomposition
by: Huang, Jen-Yuan, et al.
Published: (2026)