| Main Authors: | Guo, Yangyang, Xu, Ziwei, Xu, Xilie, Wong, YongKang, Nie, Liqiang, Kankanhalli, Mohan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2412.15614 |
Similar Items
The VLLM Safety Paradox: Dual Ease in Jailbreak Attack and Defense
by: Guo, Yangyang, et al.
Published: (2024)
ELIP: Efficient Discriminative Language-Image Pre-training with Fewer Vision Tokens
by: Guo, Yangyang, et al.
Published: (2023)
LLMs Can Unlearn Refusal with Only 1,000 Benign Samples
by: Guo, Yangyang, et al.
Published: (2026)
Fair Deepfake Detectors Can Generalize
by: Cheng, Harry, et al.
Published: (2025)
Strong Preferences Affect the Robustness of Preference Models and Value Alignment
by: Xu, Ziwei, et al.
Published: (2024)
Diffusion Facial Forgery Detection
by: Cheng, Harry, et al.
Published: (2024)
UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models
by: Guo, Yangyang, et al.
Published: (2023)
Involuntary Jailbreak: On Self-Prompting Attacks
by: Guo, Yangyang, et al.
Published: (2025)
Do Vision-Language Transformers Exhibit Visual Commonsense? An Empirical Study of VCR
by: Li, Zhenyang, et al.
Published: (2024)
Reasoning LLMs are Wandering Solution Explorers
by: Lu, Jiahao, et al.
Published: (2025)
Hallucination is Inevitable: An Innate Limitation of Large Language Models
by: Xu, Ziwei, et al.
Published: (2024)
Bullying the Machine: How Personas Increase LLM Vulnerability
by: Xu, Ziwei, et al.
Published: (2025)
SCAN: Bootstrapping Contrastive Pre-training for Data Efficiency
by: Guo, Yangyang, et al.
Published: (2024)
The Wolf Within: Covert Injection of Malice into MLLM Societies via an MLLM Operative
by: Tan, Zhen, et al.
Published: (2024)
Exo2Ego: Exocentric Knowledge Guided MLLM for Egocentric Video Understanding
by: Zhang, Haoyu, et al.
Published: (2025)
CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM
by: Xu, Jingwei, et al.
Published: (2024)
Cluster-based Graph Collaborative Filtering
by: Liu, Fan, et al.
Published: (2024)
Learning to Predict Gradients for Semi-Supervised Continual Learning
by: Luo, Yan, et al.
Published: (2022)
Too Easily Fooled? Prompt Injection Breaks LLMs on Frustratingly Simple Multiple-Choice Questions
by: Guo, Xuyang, et al.
Published: (2025)
FedMLLM: Federated Fine-tuning MLLM on Multimodal Heterogeneity Data
by: Xu, Binqian, et al.
Published: (2024)
Joint Vision-Language Social Bias Removal for CLIP
by: Zhang, Haoyu, et al.
Published: (2024)
Mitigating Visual Knowledge Forgetting in MLLM Instruction-tuning via Modality-decoupled Gradient Descent
by: Wu, Junda, et al.
Published: (2025)
Do Prompts Guarantee Safety? Mitigating Toxicity from LLM Generations through Subspace Intervention
by: Singh, Himanshu, et al.
Published: (2026)
OSGNet with MLLM Reranking @ Ego4D Episodic Memory Challenge 2026
by: Feng, Yisen, et al.
Published: (2026)
Enhancing HOI Detection with Contextual Cues from Large Vision-Language Models
by: Zhan, Yu-Wei, et al.
Published: (2023)
EPD: Long-term Memory Extraction, Context-awared Planning and Multi-iteration Decision @ EgoPlan Challenge ICML 2024
by: Shi, Letian, et al.
Published: (2024)
VidHal: Benchmarking Temporal Hallucinations in Vision LLMs
by: Choong, Wey Yeh, et al.
Published: (2024)
Attribute-driven Disentangled Representation Learning for Multimodal Recommendation
by: Li, Zhenyang, et al.
Published: (2023)
Understanding Before Recommendation: Semantic Aspect-Aware Review Exploitation via Large Language Models
by: Liu, Fan, et al.
Published: (2023)
MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance
by: Pi, Renjie, et al.
Published: (2024)
The Evolution of Video Anomaly Detection: A Unified Framework from DNN to MLLM
by: Gao, Shibo, et al.
Published: (2025)
Enhanced MLLM Black-Box Jailbreaking Attacks and Defenses
by: Zhong, Xingwei, et al.
Published: (2025)
Detecting Deepfakes via Hamiltonian Dynamics
by: Cheng, Harry, et al.
Published: (2026)
RADAR: Revealing Asymmetric Development of Abilities in MLLM Pre-training
by: Nie, Yunshuang, et al.
Published: (2026)
MimicParts: Part-aware Style Injection for Speech-Driven 3D Motion Generation
by: Liu, Lianlian, et al.
Published: (2025)
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
by: Wu, Diankun, et al.
Published: (2025)
AIC MLLM: Autonomous Interactive Correction MLLM for Robust Robotic Manipulation
by: Xiong, Chuyan, et al.
Published: (2024)
Survey on AI-Generated Media Detection: From Non-MLLM to MLLM
by: Zou, Yueying, et al.
Published: (2025)
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
by: Liu, Zuyan, et al.
Published: (2024)
Trained Mamba Emulates Online Gradient Descent in In-Context Linear Regression
by: Jiang, Jiarui, et al.
Published: (2025)