| Main Authors: | Ren, Jie, Chen, Kangrui, Chen, Chen, Sehwag, Vikash, Xing, Yue, Tang, Jiliang, Lyu, Lingjuan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.13088 |
Similar Items
SoMeLVLM: A Large Vision Language Model for Social Media Processing
by: Zhang, Xinnong, et al.
Published: (2024)
Multimodal Large Language Models for Medicine: A Comprehensive Survey
by: Ye, Jiarui, et al.
Published: (2025)
VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
by: Liu, Jing, et al.
Published: (2023)
Unified Hallucination Detection for Multimodal Large Language Models
by: Chen, Xiang, et al.
Published: (2024)
ChemDFM-X: Towards Large Multimodal Model for Chemistry
by: Zhao, Zihan, et al.
Published: (2024)
Seeing is Believing: Mitigating Hallucination in Large Vision-Language Models via CLIP-Guided Decoding
by: Deng, Ailin, et al.
Published: (2024)
LLaSO: A Foundational Framework for Reproducible Research in Large Language and Speech Model
by: Sun, Yirong, et al.
Published: (2025)
Words or Vision: Do Vision-Language Models Have Blind Faith in Text?
by: Deng, Ailin, et al.
Published: (2025)
ChartAdapter: Large Vision-Language Model for Chart Summarization
by: Xu, Peixin, et al.
Published: (2024)
K-pop Lyric Translation: Dataset, Analysis, and Neural-Modelling
by: Kim, Haven, et al.
Published: (2023)
Can We Edit Multimodal Large Language Models?
by: Cheng, Siyuan, et al.
Published: (2023)
Beyond the Leaderboard: Rethinking Medical Benchmarks for Large Language Models
by: Chen, Wenting, et al.
Published: (2025)
Large Language Models for Computer-Aided Design: A Survey
by: Zhang, Licheng, et al.
Published: (2025)
Language Models as Black-Box Optimizers for Vision-Language Models
by: Liu, Shihong, et al.
Published: (2023)
ModalImmune: Immunity Driven Unlearning via Self Destructive Training
by: Fu, Rong, et al.
Published: (2026)
To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models
by: Tian, Bozhong, et al.
Published: (2024)
Seeing Sarcasm Through Different Eyes: Analyzing Multimodal Sarcasm Perception in Large Vision-Language Models
by: Chen, Junjie, et al.
Published: (2025)
Argus Inspection: Do Multimodal Large Language Models Possess the Eye of Panoptes?
by: Yao, Yang, et al.
Published: (2025)
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE
by: Chen, Junyi, et al.
Published: (2023)
MUDI: A Multimodal Biomedical Dataset for Understanding Pharmacodynamic Drug-Drug Interactions
by: Ngo, Tung-Lam, et al.
Published: (2025)
Doctor Sun: A Bilingual Multimodal Large Language Model for Biomedical AI
by: Xue, Dong, et al.
Published: (2025)
DLF: Disentangled-Language-Focused Multimodal Sentiment Analysis
by: Wang, Pan, et al.
Published: (2024)
MLANet: Multi-Level Attention Network with Sub-instruction for Continuous Vision-and-Language Navigation
by: He, Zongtao, et al.
Published: (2023)
Multimodal Misinformation Detection using Large Vision-Language Models
by: Tahmasebi, Sahar, et al.
Published: (2024)
Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models
by: Wang, Qingni, et al.
Published: (2024)
RAP: Retrieval-Augmented Personalization for Multimodal Large Language Models
by: Hao, Haoran, et al.
Published: (2024)
ChronusOmni: Improving Time Awareness of Omni Large Language Models
by: Chen, Yijing, et al.
Published: (2025)
Where Do We Go from Here? Multi-scale Allocentric Relational Inference from Natural Spatial Descriptions
by: Paz-Argaman, Tzuf, et al.
Published: (2024)
Multimodal Multi-turn Conversation Stance Detection: A Challenge Dataset and Effective Model
by: Niu, Fuqiang, et al.
Published: (2024)
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing
by: Zhang, Zilun, et al.
Published: (2023)
MIntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations
by: Zhang, Hanlei, et al.
Published: (2024)
EEG2TEXT-CN: An Exploratory Study of Open-Vocabulary Chinese Text-EEG Alignment via Large Language Model and Contrastive Learning on ChineseEEG
by: Lu, Jacky Tai-Yu, et al.
Published: (2025)
Unleashing the Power of Imbalanced Modality Information for Multi-modal Knowledge Graph Completion
by: Zhang, Yichi, et al.
Published: (2024)
Automating Steering for Safe Multimodal Large Language Models
by: Wu, Lyucheng, et al.
Published: (2025)
POLYCHARTQA: Benchmarking Large Vision-Language Models with Multilingual Chart Question Answering
by: Xu, Yichen, et al.
Published: (2025)
SEA: Low-Resource Safety Alignment for Multimodal Large Language Models via Synthetic Embeddings
by: Lu, Weikai, et al.
Published: (2025)
Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model
by: Ma, Ziyang, et al.
Published: (2025)
ADS-Edit: A Multimodal Knowledge Editing Dataset for Autonomous Driving Systems
by: Wang, Chenxi, et al.
Published: (2025)
KMMMU: Evaluation of Massive Multi-discipline Multimodal Understanding in Korean Language and Context
by: Lee, Nahyun, et al.
Published: (2026)
Beyond Single-Audio: Advancing Multi-Audio Processing in Audio Large Language Models
by: Chen, Yiming, et al.
Published: (2024)