:: Library Catalog

Bibliographic Details
Main Authors:	Fu, Guanyiman; Li, Jingtao; Cheng, Zihang; Li, Zhuanfeng; Chen, Diqi; Xu, Yan; Liu, Xiangyu; Xiong, Fengchao; Lu, Jianfeng; Chen, Chengrong; Zhou, Jun
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2605.17286

Similar Items

SSUMamba: Spatial-Spectral Selective State Space Model for Hyperspectral Image Denoising
by: Fu, Guanyiman, et al.
Published: (2024)

Hyperspectral Image Denoising via Spatial-Spectral Recurrent Transformer
by: Fu, Guanyiman, et al.
Published: (2023)

Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation
by: Zhang, Dingwen, et al.
Published: (2024)

VisionTS++: Cross-Modal Time Series Foundation Model with Continual Pre-trained Vision Backbones
by: Shen, Lefei, et al.
Published: (2025)

HSLiNets: Evaluating Band Ordering Strategies in Hyperspectral and LiDAR Fusion
by: Yang, Judy X, et al.
Published: (2025)

WS-Net: Weak-Signal Representation Learning and Gated Abundance Reconstruction for Hyperspectral Unmixing via State-Space and Weak Signal Attention Fusion
by: Long, Zekun, et al.
Published: (2026)

Iterative Low-rank Network for Hyperspectral Image Denoising
by: Ye, Jin, et al.
Published: (2025)

HyperFree: A Channel-adaptive and Tuning-free Foundation Model for Hyperspectral Remote Sensing Imagery
by: Li, Jingtao, et al.
Published: (2025)

Deep Equilibrium Convolutional Sparse Coding for Hyperspectral Image Denoising
by: Ye, Jin, et al.
Published: (2025)

Low-Rank Adaptation of Pre-trained Vision Backbones for Energy-Efficient Image Coding for Machine
by: Zhang, Yichi, et al.
Published: (2025)

MVNet: Hyperspectral Remote Sensing Image Classification Based on Hybrid Mamba-Transformer Vision Backbone Architecture
by: Li, Guandong, et al.
Published: (2025)

ECAMP: Entity-centered Context-aware Medical Vision Language Pre-training
by: Wang, Rongsheng, et al.
Published: (2023)

SUIT: Spatial-Spectral Union-Intersection Interaction Network for Hyperspectral Object Tracking
by: Xiong, Fengchao, et al.
Published: (2025)

Superpixel Semantics Representation and Pre-training for Vision-Language Task
by: Zhang, Siyu, et al.
Published: (2023)

UNIFORM: Unifying Knowledge from Large-scale and Diverse Pre-trained Models
by: Wang, Yimu, et al.
Published: (2025)

VLP: A Survey on Vision-Language Pre-training
by: Chen, Feilong, et al.
Published: (2022)

On the Limits of Token Reduction for Efficient Unified Vision Language Training
by: Chen, Siyi, et al.
Published: (2026)

HyperCap: Hyperspectral Land Cover Captioning Dataset for Vision Language Models
by: Das, Aryan, et al.
Published: (2025)

Continual Forgetting for Pre-trained Vision Models
by: Zhao, Hongbo, et al.
Published: (2024)

Split Adaptation for Pre-trained Vision Transformers
by: Wang, Lixu, et al.
Published: (2025)

COALA: A Practical and Vision-Centric Federated Learning Platform
by: Zhuang, Weiming, et al.
Published: (2024)

Boosting Vision Semantic Density with Anatomy Normality Modeling for Medical Vision-language Pre-training
by: Cao, Weiwei, et al.
Published: (2025)

Pre-Trained Vision Models as Perception Backbones for Safety Filters in Autonomous Driving
by: Yang, Yuxuan, et al.
Published: (2024)

Revisiting Continual Semantic Segmentation with Pre-trained Vision Models
by: Zhang, Duzhen, et al.
Published: (2025)

MulVuln: Enhancing Pre-trained LMs with Shared and Language-Specific Knowledge for Multilingual Vulnerability Detection
by: Nguyen, Van, et al.
Published: (2025)

Vision-LSTM: xLSTM as Generic Vision Backbone
by: Alkin, Benedikt, et al.
Published: (2024)

AffordanceLLM: Grounding Affordance from Vision Language Models
by: Qian, Shengyi, et al.
Published: (2024)

VisionPAD: A Vision-Centric Pre-training Paradigm for Autonomous Driving
by: Zhang, Haiming, et al.
Published: (2024)

Centroid-centered Modeling for Efficient Vision Transformer Pre-training
by: Yan, Xin, et al.
Published: (2023)

Grounded Knowledge-Enhanced Medical Vision-Language Pre-training for Chest X-Ray
by: Deng, Qiao, et al.
Published: (2024)

VIP: Vision Instructed Pre-training for Robotic Manipulation
by: Li, Zhuoling, et al.
Published: (2024)

Multi-modal Vision Pre-training for Medical Image Analysis
by: Rui, Shaohao, et al.
Published: (2024)

MambaVision: A Hybrid Mamba-Transformer Vision Backbone
by: Hatamizadeh, Ali, et al.
Published: (2024)

Are Large Pre-trained Vision Language Models Effective Construction Safety Inspectors?
by: Chen, Xuezheng, et al.
Published: (2025)

Improving Adversarial Transferability of Vision-Language Pre-training Models through Collaborative Multimodal Interaction
by: Fu, Jiyuan, et al.
Published: (2024)

Unlocking Pre-trained Image Backbones for Semantic Image Synthesis
by: Berrada, Tariq, et al.
Published: (2023)

UniCompress: Token Compression for Unified Vision-Language Understanding and Generation
by: Wang, Ziyao, et al.
Published: (2026)

HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model
by: Wang, Di, et al.
Published: (2024)

Pre-trained Vision-Language Models Learn Discoverable Visual Concepts
by: Zang, Yuan, et al.
Published: (2024)

Do Pre-trained Vision-Language Models Encode Object States?
by: Newman, Kaleb, et al.
Published: (2024)