| Main Authors: | Liu, Ting, Hu, Yue, Wu, Wansen, Wang, Youkai, Xu, Kai, Yin, Quanjun |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2311.17812 |
Similar Items
DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding
by: Liu, Ting, et al.
Published: (2024)
Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model
by: Liu, Ting, et al.
Published: (2024)
DAP: Doppler-aware Point Network for Heterogeneous mmWave Action Recognition
by: Lin, Jiaying, et al.
Published: (2026)
SwimVG: Step-wise Multimodal Fusion and Adaption for Visual Grounding
by: Shi, Liangtao, et al.
Published: (2025)
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension
by: Liu, Ting, et al.
Published: (2024)
AwareVLN: Reasoning with Self-awareness for Vision-Language Navigation
by: Guo, Wenxuan, et al.
Published: (2026)
DAP-MAE: Domain-Adaptive Point Cloud Masked Autoencoder for Effective Cross-Domain Learning
by: Gao, Ziqi, et al.
Published: (2025)
Vision-aware Multimodal Prompt Tuning for Uploadable Multi-source Few-shot Domain Adaptation
by: Liu, Kuanghong, et al.
Published: (2025)
FedDAP: Domain-Aware Prototype Learning for Federated Learning under Domain Shift
by: Le, Huy Q., et al.
Published: (2026)
Transitive Vision-Language Prompt Learning for Domain Generalization
by: Wang, Liyuan, et al.
Published: (2024)
M2IST: Multi-Modal Interactive Side-Tuning for Efficient Referring Expression Comprehension
by: Liu, Xuyang, et al.
Published: (2024)
Turning Adaptation into Assets: Cross-Domain Bridging for Online Vision-Language Navigation
by: Hu, Zixuan, et al.
Published: (2026)
CityCube: Benchmarking Cross-view Spatial Reasoning on Vision-Language Models in Urban Environments
by: Xu, Haotian, et al.
Published: (2026)
Plug-and-play Class-aware Knowledge Injection for Prompt Learning with Visual-Language Model
by: Yin, Junhui, et al.
Published: (2026)
Constrained Prompt Enhancement for Improving Zero-Shot Generalization of Vision-Language Models
by: Yin, Xiaojie, et al.
Published: (2025)
Open-Vocabulary HOI Detection with Interaction-aware Prompt and Concept Calibration
by: Lei, Ting, et al.
Published: (2025)
Semantics-aware Motion Retargeting with Vision-Language Models
by: Zhang, Haodong, et al.
Published: (2023)
PanoGen++: Domain-Adapted Text-Guided Panoramic Environment Generation for Vision-and-Language Navigation
by: Wang, Sen, et al.
Published: (2025)
LG-Gaze: Learning Geometry-aware Continuous Prompts for Language-Guided Gaze Estimation
by: Yin, Pengwei, et al.
Published: (2024)
Why Only Text: Empowering Vision-and-Language Navigation with Multi-modal Prompts
by: Hong, Haodong, et al.
Published: (2024)
DAP-LED: Learning Degradation-Aware Priors with CLIP for Joint Low-light Enhancement and Deblurring
by: Wang, Ling, et al.
Published: (2024)
In-context Prompt Learning for Test-time Vision Recognition with Frozen Vision-language Model
by: Yin, Junhui, et al.
Published: (2024)
Modeling Variants of Prompts for Vision-Language Models
by: Li, Ao, et al.
Published: (2025)
Integrated Structural Prompt Learning for Vision-Language Models
by: Wang, Jiahui, et al.
Published: (2025)
Domain-Invariant Prompt Learning for Vision-Language Models
by: Khoee, Arsham Gholamzadeh, et al.
Published: (2026)
TagaVLM: Topology-Aware Global Action Reasoning for Vision-Language Navigation
by: Liu, Jiaxing, et al.
Published: (2026)
Dynamic Topology Awareness: Breaking the Granularity Rigidity in Vision-Language Navigation
by: Peng, Jiankun, et al.
Published: (2026)
Multi-modal Attribute Prompting for Vision-Language Models
by: Liu, Xin, et al.
Published: (2024)
VaMP: Variational Multi-Modal Prompt Learning for Vision-Language Models
by: Cheng, Silin, et al.
Published: (2025)
AVION: Aerial Vision-Language Instruction from Offline Teacher to Prompt-Tuned Network
by: Hu, Yu, et al.
Published: (2026)
Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments
by: Hong, Haodong, et al.
Published: (2024)
UAOR: Uncertainty-aware Observation Reinjection for Vision-Language-Action Models
by: Yang, Jiabing, et al.
Published: (2026)
DAP: Diffusion-based Affordance Prediction for Multi-modality Storage
by: Chang, Haonan, et al.
Published: (2024)
Closed-Loop Bidirectional Prompting for Adversarial Robustness of Vision Language Models
by: Liu, Xiao, et al.
Published: (2026)
Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology
by: Wang, Xiangyu, et al.
Published: (2024)
AeroDuo: Aerial Duo for UAV-based Vision and Language Navigation
by: Wu, Ruipu, et al.
Published: (2025)
Physical Prompt Injection Attacks on Large Vision-Language Models
by: Ling, Chen, et al.
Published: (2026)
Volumetric Environment Representation for Vision-Language Navigation
by: Liu, Rui, et al.
Published: (2024)
Vision-Language Navigation with Energy-Based Policy
by: Liu, Rui, et al.
Published: (2024)
LCGNav: Local Candidate-Aware Geometric Enhancement for General Topological Planning in Vision-Language Navigation
by: Peng, Jiankun, et al.
Published: (2026)