| Main Authors: | Dai, Chengjie, Song, Tiantian, Tang, Hui, Chen, Fangdong, Yang, Bowei, Song, Guanghua |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.12923 |
Similar Items
DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer
by: Wu, Yecheng, et al.
Published: (2025)
Deep Lossless Image Compression via Masked Sampling and Coarse-to-Fine Auto-Regression
by: Li, Tiantian, et al.
Published: (2025)
Efficient Progressive Image Compression with Variance-aware Masking
by: Presta, Alberto, et al.
Published: (2024)
Linear Attention Modeling for Learned Image Compression
by: Feng, Donghui, et al.
Published: (2025)
MSCViT: A Small-size ViT architecture with Multi-Scale Self-Attention Mechanism for Tiny Datasets
by: Zhang, Bowei, et al.
Published: (2025)
Efficient Masked Autoencoders with Self-Consistency
by: Li, Zhaowen, et al.
Published: (2023)
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression
by: Mao, Weian, et al.
Published: (2026)
Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks
by: Zou, Siyu, et al.
Published: (2024)
Reinforcement Learning Meets Masked Generative Models: Mask-GRPO for Text-to-Image Generation
by: Luo, Yifu, et al.
Published: (2025)
SIDME: Self-supervised Image Demoiréing via Masked Encoder-Decoder Reconstruction
by: Wang, Xia, et al.
Published: (2025)
Exploring the Coordination of Frequency and Attention in Masked Image Modeling
by: Gui, Jie, et al.
Published: (2022)
MedVKAN: Efficient Feature Extraction with Mamba and KAN for Medical Image Segmentation
by: Zhu, Hancan, et al.
Published: (2025)
FairViT: Fair Vision Transformer via Adaptive Masking
by: Tian, Bowei, et al.
Published: (2024)
GenCAMO: Scene-Graph Contextual Decoupling for Environment-aware and Mask-free Camouflage Image-Dense Annotation Generation
by: Chen, Chenglizhao, et al.
Published: (2026)
SyncMask: Synchronized Attentional Masking for Fashion-centric Vision-Language Pretraining
by: Song, Chull Hwan, et al.
Published: (2024)
CM-MaskSD: Cross-Modality Masked Self-Distillation for Referring Image Segmentation
by: Wang, Wenxuan, et al.
Published: (2023)
Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention at Vision Transformer Inference
by: You, Haoran, et al.
Published: (2022)
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
by: Bai, Jinbin, et al.
Published: (2024)
Next-Frame Decoding for Ultra-Low-Bitrate Image Compression with Video Diffusion Priors
by: Chen, Yunuo, et al.
Published: (2026)
Polyline Path Masked Attention for Vision Transformer
by: Zhao, Zhongchen, et al.
Published: (2025)
Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models
by: Chen, Junyu, et al.
Published: (2024)
Timestep-Aware Block Masking for Efficient Diffusion Model Inference
by: He, Haodong, et al.
Published: (2026)
Extremely low-bitrate Image Compression Semantically Disentangled by LMMs from a Human Perception Perspective
by: Song, Juan, et al.
Published: (2025)
BA-SAM: Scalable Bias-Mode Attention Mask for Segment Anything Model
by: Song, Yiran, et al.
Published: (2024)
Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction
by: Chen, Jun, et al.
Published: (2022)
Compress to Focus: Efficient Coordinate Compression for Policy Optimization in Multi-Turn GUI Agents
by: Song, Yurun, et al.
Published: (2026)
DCText: Scheduled Attention Masking for Visual Text Generation via Divide-and-Conquer Strategy
by: Song, Jaewoo, et al.
Published: (2025)
AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking
by: Li, Yuheng, et al.
Published: (2024)
Mask What Matters: Controllable Text-Guided Masking for Self-Supervised Medical Image Analysis
by: Wang, Ruilang, et al.
Published: (2025)
GaussianImage++: Boosted Image Representation and Compression with 2D Gaussian Splatting
by: Li, Tiantian, et al.
Published: (2025)
TCSAFormer: Efficient Vision Transformer with Token Compression and Sparse Attention for Medical Image Segmentation
by: Xia, Zunhui, et al.
Published: (2025)
PositionIC: Unified Position and Identity Consistency for Image Customization
by: Hu, Junjie, et al.
Published: (2025)
EpiMask: Leveraging Epipolar Distance Based Masks in Cross-Attention for Satellite Image Matching
by: Deshmukh, Rahul, et al.
Published: (2026)
From Text to Mask: Localizing Entities Using the Attention of Text-to-Image Diffusion Models
by: Xiao, Changming, et al.
Published: (2023)
Seg-Wild: Interactive Segmentation based on 3D Gaussian Splatting for Unconstrained Image Collections
by: Bao, Yongtang, et al.
Published: (2025)
MMCLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training
by: Wu, Biao, et al.
Published: (2024)
Segmenting and Understanding: Region-aware Semantic Attention for Fine-grained Image Quality Assessment with Large Language Models
by: Song, Chenyue, et al.
Published: (2025)
MAKIMA: Tuning-free Multi-Attribute Open-domain Video Editing via Mask-Guided Attention Modulation
by: Zheng, Haoyu, et al.
Published: (2024)
$Δ$-AttnMask: Attention-Guided Masked Hidden States for Efficient Data Selection and Augmentation
by: Hu, Jucheng, et al.
Published: (2025)
MiM: Mask in Mask Self-Supervised Pre-Training for 3D Medical Image Analysis
by: Zhuang, Jiaxin, et al.
Published: (2024)