:: Library Catalog

Bibliographic Details
Main Authors:	Snyder, Thomas; Yang, H. Lexie; Schnake, Stefan; Schotthöfer, Steffen
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2601.08882

Similar Items

Dynamical Low-Rank Compression of Neural Networks with Robustness under Adversarial Attacks
by: Schotthöfer, Steffen, et al.
Published: (2025)

Global Context Compression with Interleaved Vision-Text Transformation
by: Jiao, Dian, et al.
Published: (2026)

Proximal Vision Transformer: Enhancing Feature Representation through Two-Stage Manifold Geometry
by: Yun, Haoyu, et al.
Published: (2025)

ButterflyViT: 354× Expert Compression for Edge Vision Transformers
by: Karmore, Aryan
Published: (2026)

Towards Difficulty-Agnostic Efficient Transfer Learning for Vision-Language Models
by: Yang, Yongjin, et al.
Published: (2023)

Lossy Neural Compression for Geospatial Analytics: A Review
by: Gomes, Carlos, et al.
Published: (2025)

Deep Extrinsic Manifold Representation for Vision Tasks
by: Zhang, Tongtong, et al.
Published: (2024)

Patch Rebirth: Toward Fast and Transferable Model Inversion of Vision Transformers
by: Heo, Seongsoo, et al.
Published: (2025)

EFTViT: Efficient Federated Training of Vision Transformers with Masked Images on Resource-Constrained Clients
by: Wu, Meihan, et al.
Published: (2024)

MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning
by: Nedungadi, Vishal, et al.
Published: (2024)

Learning to Transform Dynamically for Better Adversarial Transferability
by: Zhu, Rongyi, et al.
Published: (2024)

Downstream Transfer Attack: Adversarial Attacks on Downstream Models with Pre-trained Vision Transformers
by: Zheng, Weijie, et al.
Published: (2024)

End-to-End Optimized Image Compression with the Frequency-Oriented Transform
by: Zhang, Yuefeng, et al.
Published: (2024)

Vision Bridge Transformer at Scale
by: Tan, Zhenxiong, et al.
Published: (2025)

FlattenGPT: Depth Compression for Transformer with Layer Flattening
by: Xu, Ruihan, et al.
Published: (2026)

Attention Retention for Continual Learning with Vision Transformers
by: Lu, Yue, et al.
Published: (2026)

Smartflow: Enabling Scalable Spatiotemporal Geospatial Research
by: McVicar, David, et al.
Published: (2025)

Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation
by: Dong, Wei, et al.
Published: (2024)

OReole-FM: successes and challenges toward billion-parameter foundation models for high-resolution satellite imagery
by: Dias, Philipe, et al.
Published: (2024)

Bi-Orthogonal Factor Decomposition for Vision Transformers
by: Doshi, Fenil R., et al.
Published: (2026)

Unlocking Feature Visualization for Deeper Networks with MAgnitude Constrained Optimization
by: Fel, Thomas, et al.
Published: (2023)

Spiking Vision Transformer with Saccadic Attention
by: Wang, Shuai, et al.
Published: (2025)

Vision Transformers for Zero-Shot Clustering of Animal Images: A Comparative Benchmarking Study
by: Markoff, Hugo, et al.
Published: (2026)

ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance
by: Yang, Yang, et al.
Published: (2026)

Manifold-Aware Exploration for Reinforcement Learning in Video Generation
by: Zheng, Mingzhe, et al.
Published: (2026)

Multi-Context Fusion Transformer for Pedestrian Crossing Intention Prediction in Urban Environments
by: Li, Yuanzhe, et al.
Published: (2025)

Towards Lossless Ultimate Vision Token Compression for VLMs
by: Zheng, Dehua, et al.
Published: (2025)

Memory-Efficient Vision Transformers: An Activation-Aware Mixed-Rank Compression Strategy
by: Azizi, Seyedarmin, et al.
Published: (2024)

TVE: Learning Meta-attribution for Transferable Vision Explainer
by: Wang, Guanchu, et al.
Published: (2023)

Vision without Images: End-to-End Computer Vision from Single Compressive Measurements
by: Pan, Fengpu, et al.
Published: (2025)

Boosting Adversarial Transferability across Model Genus by Deformation-Constrained Warping
by: Lin, Qinliang, et al.
Published: (2024)

Geometrically Constrained and Token-Based Probabilistic Spatial Transformers
by: Schmidt, Johann, et al.
Published: (2025)

Transfer Learning Applied to Computer Vision Problems: Survey on Current Progress, Limitations, and Opportunities
by: Panda, Aaryan, et al.
Published: (2024)

Understanding the Transfer Limits of Vision Foundation Models
by: Huang, Shiqi, et al.
Published: (2026)

Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs
by: Roberts, Jonathan, et al.
Published: (2023)

Seg the HAB: Language-Guided Geospatial Algae Bloom Reasoning and Segmentation
by: Hsieh, Patterson, et al.
Published: (2025)

Towards Scalable Foundation Model for Multi-modal and Hyperspectral Geospatial Data
by: Si, Haozhe, et al.
Published: (2025)

Learning to Merge Tokens via Decoupled Embedding for Efficient Vision Transformers
by: Lee, Dong Hoon, et al.
Published: (2024)

Forensic License Plate Recognition with Compression-Informed Transformers
by: Moussa, Denise, et al.
Published: (2022)

PQV-Mobile: A Combined Pruning and Quantization Toolkit to Optimize Vision Transformers for Mobile Applications
by: Bhardwaj, Kshitij
Published: (2024)