| Main Authors: | Gutiérrez, Juan, Gutiérrez-García, Victor, Blanco-Murillo, José Luis |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.06912 |
Similar Items
AttAnchor: Guiding Cross-Modal Token Alignment in VLMs with Attention Anchors
by: Zhang, Junyang, et al.
Published: (2025)
AnchorDiff: Training-Free Concept Grounding for MM-DiTs via Anchor-Based Graph Propagation
by: Zhang, Jian, et al.
Published: (2026)
History Anchors: How Prior Behavior Steers LLM Decisions Toward Unsafe Actions
by: Salgado, Alberto G. Rodríguez
Published: (2026)
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
by: Wang, Hanyu, et al.
Published: (2024)
Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors
by: Kuang, Zhengfei, et al.
Published: (2024)
What "Not" to Detect: Negation-Aware VLMs via Structured Reasoning and Token Merging
by: Kang, Inha, et al.
Published: (2025)
GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation
by: Xu, Xuwei, et al.
Published: (2023)
Multi-level Matching Network for Multimodal Entity Linking
by: Hu, Zhiwei, et al.
Published: (2024)
PMG: Progressive Motion Generation via Sparse Anchor Postures Curriculum Learning
by: Xi, Yingjie, et al.
Published: (2025)
Frequency-Aware Token Reduction for Efficient Vision Transformer
by: Lee, Dong-Jae, et al.
Published: (2025)
LaneDiffusion: Improving Centerline Graph Learning via Prior Injected BEV Feature Generation
by: Wang, Zijie, et al.
Published: (2025)
Auto-Regressive Surface Cutting
by: Li, Yang, et al.
Published: (2025)
VISA: Group-wise Visual Token Selection and Aggregation via Graph Summarization for Efficient MLLMs Inference
by: Jiang, Pengfei, et al.
Published: (2025)
EAvatar: Expression-Aware Head Avatar Reconstruction with Generative Geometry Priors
by: Zhang, Shikun, et al.
Published: (2025)
Token Pruning using a Lightweight Background Aware Vision Transformer
by: Sah, Sudhakar, et al.
Published: (2024)
Bi-Anchor Interpolation Solver for Accelerating Generative Modeling
by: Chen, Hongxu, et al.
Published: (2026)
Cut to the Chase: Training-free Multimodal Summarization via Chain-of-Events
by: You, Xiaoxing, et al.
Published: (2026)
Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior
by: Li, Yulin, et al.
Published: (2025)
Cut2Next: Generating Next Shot via In-Context Tuning
by: He, Jingwen, et al.
Published: (2025)
Integrating Prior Observations for Incremental 3D Scene Graph Prediction
by: Renz, Marian, et al.
Published: (2025)
MoCA-Video: Motion-Aware Concept Alignment for Consistent Video Editing
by: Zhang, Tong, et al.
Published: (2025)
QAPruner: Quantization-Aware Vision Token Pruning for Multimodal Large Language Models
by: Wang, Xinhao, et al.
Published: (2026)
Panoramic Distortion-Aware Tokenization for Person Detection and Localization in Overhead Fisheye Images
by: Wakai, Nobuhiko, et al.
Published: (2025)
ZSPAPrune: Zero-Shot Prompt-Aware Token Pruning for Vision-Language Models
by: Zhang, Pu, et al.
Published: (2025)
PAR: Prompt-Aware Token Reduction Method for Efficient Large Multimodal Models
by: Liu, Yingen, et al.
Published: (2024)
SweetTok: Semantic-Aware Spatial-Temporal Tokenizer for Compact Video Discretization
by: Tan, Zhentao, et al.
Published: (2024)
VisPCO: Visual Token Pruning Configuration Optimization via Budget-Aware Pareto-Frontier Learning for Vision-Language Models
by: Ji, Huawei, et al.
Published: (2026)
Points-to-3D: Structure-Aware 3D Generation with Point Cloud Priors
by: Xia, Jiatong, et al.
Published: (2026)
PASTA: Part-Aware Sketch-to-3D Shape Generation with Text-Aligned Prior
by: Lee, Seunggwan, et al.
Published: (2025)
TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation
by: Li, Ruineng, et al.
Published: (2025)
Monocular Normal Estimation via Shading Sequence Estimation
by: Li, Zongrui, et al.
Published: (2026)
Controllable Video Object Insertion via Multiview Priors
by: Qi, Xia, et al.
Published: (2026)
AdaTok: Adaptive Token Compression with Object-Aware Representations for Efficient Multimodal LLMs
by: Zhang, Xinliang, et al.
Published: (2025)
Beyond Fixed Anchors: Precisely Erasing Concepts with Sibling Exclusive Counterparts
by: Zhang, Tong, et al.
Published: (2025)
See Before You Code: Learning Visual Priors for Spatially Aware Educational Animation Generation
by: Li, Yuejia, et al.
Published: (2026)
UnfoldLDM: Degradation-Aware Unfolding with Iterative Latent Diffusion Priors for Blind Image Restoration
by: He, Chunming, et al.
Published: (2025)
DAP-LED: Learning Degradation-Aware Priors with CLIP for Joint Low-light Enhancement and Deblurring
by: Wang, Ling, et al.
Published: (2024)
Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models
by: Zhang, Yue, et al.
Published: (2024)
Augmented Structure Preserving Neural Networks for cell biomechanics
by: Olalla-Pombo, Juan, et al.
Published: (2025)
VDInstruct: Zero-Shot Key Information Extraction via Content-Aware Vision Tokenization
by: Nguyen, Son, et al.
Published: (2025)