| Main Authors: | Snyder, Thomas; Yang, H. Lexie; Schnake, Stefan; Schotthöfer, Steffen |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.08882 |
Similar Items
Dynamical Low-Rank Compression of Neural Networks with Robustness under Adversarial Attacks
by: Schotthöfer, Steffen, et al.
Published: (2025)
Global Context Compression with Interleaved Vision-Text Transformation
by: Jiao, Dian, et al.
Published: (2026)
Proximal Vision Transformer: Enhancing Feature Representation through Two-Stage Manifold Geometry
by: Yun, Haoyu, et al.
Published: (2025)
ButterflyViT: 354$\times$ Expert Compression for Edge Vision Transformers
by: Karmore, Aryan
Published: (2026)
Towards Difficulty-Agnostic Efficient Transfer Learning for Vision-Language Models
by: Yang, Yongjin, et al.
Published: (2023)
Lossy Neural Compression for Geospatial Analytics: A Review
by: Gomes, Carlos, et al.
Published: (2025)
Deep Extrinsic Manifold Representation for Vision Tasks
by: Zhang, Tongtong, et al.
Published: (2024)
Patch Rebirth: Toward Fast and Transferable Model Inversion of Vision Transformers
by: Heo, Seongsoo, et al.
Published: (2025)
EFTViT: Efficient Federated Training of Vision Transformers with Masked Images on Resource-Constrained Clients
by: Wu, Meihan, et al.
Published: (2024)
MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning
by: Nedungadi, Vishal, et al.
Published: (2024)
Learning to Transform Dynamically for Better Adversarial Transferability
by: Zhu, Rongyi, et al.
Published: (2024)
Downstream Transfer Attack: Adversarial Attacks on Downstream Models with Pre-trained Vision Transformers
by: Zheng, Weijie, et al.
Published: (2024)
End-to-End Optimized Image Compression with the Frequency-Oriented Transform
by: Zhang, Yuefeng, et al.
Published: (2024)
Vision Bridge Transformer at Scale
by: Tan, Zhenxiong, et al.
Published: (2025)
FlattenGPT: Depth Compression for Transformer with Layer Flattening
by: Xu, Ruihan, et al.
Published: (2026)
Attention Retention for Continual Learning with Vision Transformers
by: Lu, Yue, et al.
Published: (2026)
Smartflow: Enabling Scalable Spatiotemporal Geospatial Research
by: McVicar, David, et al.
Published: (2025)
Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation
by: Dong, Wei, et al.
Published: (2024)
OReole-FM: successes and challenges toward billion-parameter foundation models for high-resolution satellite imagery
by: Dias, Philipe, et al.
Published: (2024)
Bi-Orthogonal Factor Decomposition for Vision Transformers
by: Doshi, Fenil R., et al.
Published: (2026)
Unlocking Feature Visualization for Deeper Networks with MAgnitude Constrained Optimization
by: Fel, Thomas, et al.
Published: (2023)
Spiking Vision Transformer with Saccadic Attention
by: Wang, Shuai, et al.
Published: (2025)
Vision Transformers for Zero-Shot Clustering of Animal Images: A Comparative Benchmarking Study
by: Markoff, Hugo, et al.
Published: (2026)
ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance
by: Yang, Yang, et al.
Published: (2026)
Manifold-Aware Exploration for Reinforcement Learning in Video Generation
by: Zheng, Mingzhe, et al.
Published: (2026)
Multi-Context Fusion Transformer for Pedestrian Crossing Intention Prediction in Urban Environments
by: Li, Yuanzhe, et al.
Published: (2025)
Towards Lossless Ultimate Vision Token Compression for VLMs
by: Zheng, Dehua, et al.
Published: (2025)
Memory-Efficient Vision Transformers: An Activation-Aware Mixed-Rank Compression Strategy
by: Azizi, Seyedarmin, et al.
Published: (2024)
TVE: Learning Meta-attribution for Transferable Vision Explainer
by: Wang, Guanchu, et al.
Published: (2023)
Vision without Images: End-to-End Computer Vision from Single Compressive Measurements
by: Pan, Fengpu, et al.
Published: (2025)
Boosting Adversarial Transferability across Model Genus by Deformation-Constrained Warping
by: Lin, Qinliang, et al.
Published: (2024)
Geometrically Constrained and Token-Based Probabilistic Spatial Transformers
by: Schmidt, Johann, et al.
Published: (2025)
Transfer Learning Applied to Computer Vision Problems: Survey on Current Progress, Limitations, and Opportunities
by: Panda, Aaryan, et al.
Published: (2024)
Understanding the Transfer Limits of Vision Foundation Models
by: Huang, Shiqi, et al.
Published: (2026)
Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs
by: Roberts, Jonathan, et al.
Published: (2023)
Seg the HAB: Language-Guided Geospatial Algae Bloom Reasoning and Segmentation
by: Hsieh, Patterson, et al.
Published: (2025)
Towards Scalable Foundation Model for Multi-modal and Hyperspectral Geospatial Data
by: Si, Haozhe, et al.
Published: (2025)
Learning to Merge Tokens via Decoupled Embedding for Efficient Vision Transformers
by: Lee, Dong Hoon, et al.
Published: (2024)
Forensic License Plate Recognition with Compression-Informed Transformers
by: Moussa, Denise, et al.
Published: (2022)
PQV-Mobile: A Combined Pruning and Quantization Toolkit to Optimize Vision Transformers for Mobile Applications
by: Bhardwaj, Kshitij
Published: (2024)