Library Catalog

Bibliographic Details
Main Authors:	Zhang, Li; Agarwal, Shruti; Collomosse, John; Xie, Pengtao; Asnani, Vishal
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.19019

Similar Items

ProMark: Proactive Diffusion Watermarking for Causal Attribution
by: Asnani, Vishal, et al.
Published: (2024)

On the Coexistence and Ensembling of Watermarks
by: Petrov, Aleksandar, et al.
Published: (2025)

Your Text Encoder Can Be An Object-Level Watermarking Controller
by: Devulapally, Naresh Kumar, et al.
Published: (2025)

MultiNeRF: Multiple Watermark Embedding for Neural Radiance Fields
by: Kulthe, Yash, et al.
Published: (2025)

Tracing Hyperparameter Dependencies for Model Parsing via Learnable Graph Pooling Network
by: Guo, Xiao, et al.
Published: (2023)

TokenPure: Watermark Removal through Tokenized Appearance and Structural Guidance
by: Yang, Pei, et al.
Published: (2025)

Proactive Schemes: A Survey of Adversarial Attacks for Social Good
by: Asnani, Vishal, et al.
Published: (2024)

SIGMA: Selective-Interleaved Generation with Multi-Attribute Tokens
by: Zhang, Xiaoyan, et al.
Published: (2026)

FlashVLM: Text-Guided Visual Token Selection for Large Multimodal Models
by: Cai, Kaitong, et al.
Published: (2025)

TokenLight: Precise Lighting Control in Images using Attribute Tokens
by: Chaturvedi, Sumit, et al.
Published: (2026)

TokenDial: Continuous Attribute Control in Text-to-Video via Spatiotemporal Token Offsets
by: Liu, Zhixuan, et al.
Published: (2026)

TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation
by: Dwivedi, Sai Kumar, et al.
Published: (2024)

BLO-Inst: Bi-Level Optimization Based Alignment of YOLO and SAM for Robust Instance Segmentation
by: Zhang, Li, et al.
Published: (2026)

V2C-CBM: Building Concept Bottlenecks with Vision-to-Concept Tokenizer
by: He, Hangzhou, et al.
Published: (2025)

MergeTok: Unified Continuous and Discrete Visual Tokenization via Token Merging
by: Zhang, Luyuan, et al.
Published: (2026)

Layton: Latent Consistency Tokenizer for 1024-pixel Image Reconstruction and Generation by 256 Tokens
by: Xie, Qingsong, et al.
Published: (2025)

Video-KTR: Reinforcing Video Reasoning via Key Token Attribution
by: Wang, Ziyue, et al.
Published: (2026)

Not All Tokens Need 40 Steps: Heterogeneous Step Allocation in Diffusion Transformers for Efficient Video Generation
by: Chu, Ernie, et al.
Published: (2026)

ConceptSplit: Decoupled Multi-Concept Personalization of Diffusion Models via Token-wise Adaptation and Attention Disentanglement
by: Lim, Habin, et al.
Published: (2025)

Do Vision Language Models Need to Process Image Tokens?
by: Ghosh, Sambit, et al.
Published: (2026)

Enhancing Multi-Image Understanding through Delimiter Token Scaling
by: Lee, Minyoung, et al.
Published: (2026)

TokenMotion: Motion-Guided Vision Transformer for Video Camouflaged Object Detection Via Learnable Token Selection
by: Yu, Zifan, et al.
Published: (2023)

ClusterMark: Towards Robust Watermarking for Autoregressive Image Generators with Visual Token Clustering
by: Lukovnikov, Denis, et al.
Published: (2025)

TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space
by: Garibi, Daniel, et al.
Published: (2025)

ViConEx-Med: Visual Concept Explainability via Multi-Concept Token Transformer for Medical Image Analysis
by: Patrício, Cristiano, et al.
Published: (2025)

ConceptPrism: Concept Disentanglement in Personalized Diffusion Models via Residual Token Optimization
by: Kim, Minseo, et al.
Published: (2026)

TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization
by: Pan, Liang, et al.
Published: (2025)

UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens
by: An, Ruichuan, et al.
Published: (2025)

EventPrune: Cascaded Event-Assisted Token Pruning for Efficient First-Person Dynamic Spatial Reasoning
by: Ma, Pengtao, et al.
Published: (2026)

Token Entropy Regularization for Multi-modal Antenna Affiliation Identification
by: Chen, Dong, et al.
Published: (2026)

PARASOL: Parametric Style Control for Diffusion Image Synthesis
by: Tarrés, Gemma Canet, et al.
Published: (2023)

Multitwine: Multi-Object Compositing with Text and Layout Control
by: Tarrés, Gemma Canet, et al.
Published: (2025)

Recoverable Compression: A Multimodal Vision Token Recovery Mechanism Guided by Text Information
by: Chen, Yi, et al.
Published: (2024)

VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization
by: Liu, Tao, et al.
Published: (2024)

Multi-Token Enhancing for Vision Representation Learning
by: Li, Zhong-Yu, et al.
Published: (2024)

ToDRE: Effective Visual Token Pruning via Token Diversity and Task Relevance
by: Li, Duo, et al.
Published: (2025)

Token Bottleneck: One Token to Remember Dynamics
by: Kim, Taekyung, et al.
Published: (2025)

Visual-Word Tokenizer: Beyond Fixed Sets of Tokens in Vision Transformers
by: Gee, Leonidas, et al.
Published: (2024)

ScribeTokens: Fixed-Vocabulary Tokenization of Digital Ink
by: Wang, Douglass
Published: (2026)

TrimTokenator: Towards Adaptive Visual Token Pruning for Large Multimodal Models
by: Zhang, Hao, et al.
Published: (2025)