:: Library Catalog

Bibliographic Details
Main Authors:	Liang, Guotao, Zhang, Baoquan, Wen, Zhiyuan, Han, Zihao, Ye, Yunming
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2511.12032

Similar Items

Head-Aware Key-Value Compression for Efficient Autoregressive Image Generation
by: Liang, Guotao, et al.
Published: (2026)

Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text
by: Liang, Guotao, et al.
Published: (2025)

AsyncDSB: Schedule-Asynchronous Diffusion Schrödinger Bridge for Image Inpainting
by: Han, Zihao, et al.
Published: (2024)

SJD-PV: Speculative Jacobi Decoding with Phrase Verification for Autoregressive Image Generation
by: Yu, Zhehao, et al.
Published: (2026)

Codebook Transfer with Part-of-Speech for Vector-Quantized Image Modeling
by: Zhang, Baoquan, et al.
Published: (2024)

LG-VQ: Language-Guided Codebook Learning
by: Liang, Guotao, et al.
Published: (2024)

SJD-VP: Speculative Jacobi Decoding with Verification Prediction for Autoregressive Image Generation
by: Shan, Bingqi, et al.
Published: (2026)

HPCR: Holistic Proxy-based Contrastive Replay for Online Continual Learning
by: Lin, Huiwei, et al.
Published: (2023)

Prototype Optimization with Neural ODE for Few-Shot Learning
by: Zhang, Baoquan, et al.
Published: (2024)

MCSDNet: Mesoscale Convective System Detection Network via Multi-scale Spatiotemporal Information
by: Liang, Jiajun, et al.
Published: (2024)

S2FT: Parameter-Efficient Fine-Tuning in Sparse Spectrum Domain
by: Zhang, Baoquan, et al.
Published: (2026)

DeepSound-V1: Start to Think Step-by-Step in the Audio Generation from Videos
by: Liang, Yunming, et al.
Published: (2025)

DiffCast: A Unified Framework via Residual Diffusion for Precipitation Nowcasting
by: Yu, Demin, et al.
Published: (2023)

Improving Flexible Image Tokenizers for Autoregressive Image Generation
by: Fu, Zixuan, et al.
Published: (2026)

StyleMark: A Robust Watermarking Method for Art Style Images Against Black-Box Arbitrary Style Transfer
by: Zhang, Yunming, et al.
Published: (2024)

Morphing Tokens Draw Strong Masked Image Models
by: Kim, Taekyung, et al.
Published: (2023)

SeiT++: Masked Token Modeling Improves Storage-efficient Training
by: Lee, Minhyun, et al.
Published: (2023)

Trinity Detector: text-assisted and attention mechanisms based spectral fusion for diffusion generation image detection
by: Song, Jiawei, et al.
Published: (2024)

AVT2-DWF: Improving Deepfake Detection with Audio-Visual Fusion and Dynamic Weighting Strategies
by: Wang, Rui, et al.
Published: (2024)

Heterogeneous Generative Knowledge Distillation with Masked Image Modeling
by: Wang, Ziming, et al.
Published: (2023)

DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer
by: Wu, Yecheng, et al.
Published: (2025)

MaskBit: Embedding-free Image Generation via Bit Tokens
by: Weber, Mark, et al.
Published: (2024)

Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens
by: Kim, Dongwon, et al.
Published: (2025)

Improving Image De-raining Using Reference-Guided Transformers
by: Ye, Zihao, et al.
Published: (2024)

ReMoMask: Retrieval-Augmented Masked Motion Generation
by: Li, Zhengdao, et al.
Published: (2025)

The Less You Depend, The More You Learn: Synthesizing Novel Views from Sparse, Unposed Images with Minimal 3D Knowledge
by: Wang, Haoru, et al.
Published: (2025)

Vision Foundation Models as Generalist Tokenizers for Image Generation
by: Zheng, Anlin, et al.
Published: (2026)

GPSToken: Gaussian Parameterized Spatially-adaptive Tokenization for Image Representation and Generation
by: Zhang, Zhengqiang, et al.
Published: (2025)

Computational Tradeoffs in Image Synthesis: Diffusion, Masked-Token, and Next-Token Prediction
by: Kilian, Maciej, et al.
Published: (2024)

Autoregressive Image Generation with Masked Bit Modeling
by: Yu, Qihang, et al.
Published: (2026)

Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction
by: Guo, Ziyao, et al.
Published: (2025)

Empowering LLMs to Understand and Generate Complex Vector Graphics
by: Xing, Ximing, et al.
Published: (2024)

ErasableMask: A Robust and Erasable Privacy Protection Scheme against Black-box Face Recognition Models
by: Shen, Sipeng, et al.
Published: (2024)

ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation
by: Wang, Lingfeng, et al.
Published: (2025)

MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation
by: Lee, Minhyun, et al.
Published: (2024)

Mask Image Watermarking
by: Hu, Runyi, et al.
Published: (2025)

Token Painter: Training-Free Text-Guided Image Inpainting via Mask Autoregressive Models
by: Jiang, Longtao, et al.
Published: (2025)

NativeTok: Native Visual Tokenization for Improved Image Generation
by: Wu, Bin, et al.
Published: (2026)

Masked Generative Extractor for Synergistic Representation and 3D Generation of Point Clouds
by: Zeng, Hongliang, et al.
Published: (2024)

MLIP: Medical Language-Image Pre-training with Masked Local Representation Learning
by: Liu, Jiarun, et al.
Published: (2024)