| Main Authors: | Hamid, Kaiser, Cui, Can, Akbar, Khandakar Ashrafi, Wang, Ziran, Liang, Nade |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.12708 |
Similar Items
Interpretable Modeling of Driver Attention Shifts with a Vision--Language Model
by: Hamid, Kaiser, et al.
Published: (2025)
ICR-Drive: Instruction Counterfactual Robustness for End-to-End Language-Driven Autonomous Driving
by: Hamid, Kaiser, et al.
Published: (2026)
Efficient and Explainable End-to-End Autonomous Driving via Masked Vision-Language-Action Diffusion
by: Zhang, Jiaru, et al.
Published: (2026)
ViLaD: A Large Vision Language Diffusion Framework for End-to-End Autonomous Driving
by: Cui, Can, et al.
Published: (2025)
ContextVLM: Zero-Shot and Few-Shot Context Understanding for Autonomous Driving using Vision Language Models
by: Sural, Shounak, et al.
Published: (2024)
Few-Shot Image Quality Assessment via Adaptation of Vision-Language Models
by: Li, Xudong, et al.
Published: (2024)
Semi-Supervised Few-Shot Adaptation of Vision-Language Models
by: Silva-Rodríguez, Julio, et al.
Published: (2026)
Low-Rank Few-Shot Adaptation of Vision-Language Models
by: Zanella, Maxime, et al.
Published: (2024)
Revisiting Few-Shot Object Detection with Vision-Language Models
by: Madan, Anish, et al.
Published: (2023)
Calibrated Cache Model for Few-Shot Vision-Language Model Adaptation
by: Ding, Kun, et al.
Published: (2024)
NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models
by: Park, Sung-Yeon, et al.
Published: (2025)
Cross-Domain Few-Shot Learning via Multi-View Collaborative Optimization with Vision-Language Models
by: Chen, Dexia, et al.
Published: (2025)
Few-Shot Adaptation Benchmark for Remote Sensing Vision-Language Models
by: Khoury, Karim El, et al.
Published: (2025)
Auxiliary Descriptive Knowledge for Few-Shot Adaptation of Vision-Language Model
by: Lee, SuBeen, et al.
Published: (2025)
Mechanistic Finetuning of Vision-Language-Action Models via Few-Shot Demonstrations
by: Mitra, Chancharik, et al.
Published: (2025)
Few-Shot Learning from Gigapixel Images via Hierarchical Vision-Language Alignment and Modeling
by: Wong, Bryan, et al.
Published: (2025)
Few-Shot Vision-Language Reasoning for Satellite Imagery via Verifiable Rewards
by: Koksal, Aybora, et al.
Published: (2025)
MindDrive: A Vision-Language-Action Model for Autonomous Driving via Online Reinforcement Learning
by: Fu, Haoyu, et al.
Published: (2025)
Intra-task Mutual Attention based Vision Transformer for Few-Shot Learning
by: Jiang, Weihao, et al.
Published: (2024)
Complementary Subspace Low-Rank Adaptation of Vision-Language Models for Few-Shot Classification
by: Wang, Zhongqi, et al.
Published: (2025)
A Closer Look at the Few-Shot Adaptation of Large Vision-Language Models
by: Silva-Rodríguez, Julio, et al.
Published: (2023)
Vision-Language In-Context Learning Driven Few-Shot Visual Inspection Model
by: Ueno, Shiryu, et al.
Published: (2025)
Efficient Few-Shot Continual Learning in Vision-Language Models
by: Panos, Aristeidis, et al.
Published: (2025)
Preserve and Sculpt: Manifold-Aligned Fine-tuning of Vision-Language Models for Few-Shot Learning
by: Chen, Dexia, et al.
Published: (2025)
ViT-DD: Multi-Task Vision Transformer for Semi-Supervised Driver Distraction Detection
by: Ma, Yunsheng, et al.
Published: (2022)
Cluster-Aware Prompt Ensemble Learning for Few-Shot Vision-Language Model Adaptation
by: Chen, Zhi, et al.
Published: (2025)
Towards Fine-Grained Vision-Language Alignment for Few-Shot Anomaly Detection
by: Fan, Yuanting, et al.
Published: (2025)
Enhancing Vision-Language Few-Shot Adaptation with Negative Learning
by: Zhang, Ce, et al.
Published: (2024)
Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners
by: Park, Keon-Hee, et al.
Published: (2024)
Noise-Tolerant Few-Shot Unsupervised Adapter for Vision-Language Models
by: Ali, Eman, et al.
Published: (2023)
Video-to-Task Learning via Motion-Guided Attention for Few-Shot Action Recognition
by: Guo, Hanyu, et al.
Published: (2024)
IKOD: Mitigating Visual Attention Degradation in Large Vision-Language Models
by: Yang, Jiabing, et al.
Published: (2025)
SIMSplat: Predictive Driving Scene Editing with Language-aligned 4D Gaussian Splatting
by: Park, Sung-Yeon, et al.
Published: (2025)
Few-Shot Image Classification and Segmentation as Visual Question Answering Using Vision-Language Models
by: Meng, Tian, et al.
Published: (2024)
Few-Shot Generative Model Adaption via Identity Injection and Preservation
by: He, Yeqi, et al.
Published: (2026)
Quantifying Uncertainty in Motion Prediction with Variational Bayesian Mixture
by: Lu, Juanwu, et al.
Published: (2024)
Efficient Few-Shot Medical Image Analysis via Hierarchical Contrastive Vision-Language Learning
by: Fuller, Harrison, et al.
Published: (2025)
DriveRX: A Vision-Language Reasoning Model for Cross-Task Autonomous Driving
by: Diao, Muxi, et al.
Published: (2025)
SpatialFormer: Semantic and Target Aware Attentions for Few-Shot Learning
by: Lai, Jinxiang, et al.
Published: (2023)
Proto-CLIP: Vision-Language Prototypical Network for Few-Shot Learning
by: P, Jishnu Jaykumar, et al.
Published: (2023)