| Main Authors: | Liu, Zhendong, Nie, Yuanbi, Tan, Yingshui, Liu, Jiaheng, Yue, Xiangyu, Cui, Qiushi, Wang, Chongjun, Zhu, Xiaoyong, Zheng, Bo |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.11543 |
Similar Items
Safety Alignment for Vision Language Models
by: Liu, Zhendong, et al.
Published: (2024)
MSR-Align: Policy-Grounded Multimodal Alignment for Safety-Aware Reasoning in Vision-Language Models
by: Xia, Yinan, et al.
Published: (2025)
Equilibrate RLHF: Towards Balancing Helpfulness-Safety Trade-off in Large Language Models
by: Tan, Yingshui, et al.
Published: (2025)
ConceptGuard: Proactive Safety in Text-and-Image-to-Video Generation through Multimodal Risk Detection
by: Ma, Ruize, et al.
Published: (2025)
HiddenDetect: Detecting Jailbreak Attacks against Large Vision-Language Models via Monitoring Hidden States
by: Jiang, Yilei, et al.
Published: (2025)
QuadSentinel: Sequent Safety for Machine-Checkable Control in Multi-agent Systems
by: Yang, Yiliu, et al.
Published: (2025)
RapGuard: Safeguarding Multimodal Large Language Models via Rationale-aware Defensive Prompting
by: Jiang, Yilei, et al.
Published: (2024)
USB: A Comprehensive and Unified Safety Evaluation Benchmark for Multimodal Large Language Models
by: Zheng, Baolin, et al.
Published: (2025)
Beyond Safe Answers: A Benchmark for Evaluating True Risk Awareness in Large Reasoning Models
by: Zheng, Baihui, et al.
Published: (2025)
DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models
by: Liu, Jianyu, et al.
Published: (2025)
Improving Concept Alignment in Vision-Language Concept Bottleneck Models
by: Selvaraj, Nithish Muthuchamy, et al.
Published: (2024)
Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models
by: Tan, Yingshui, et al.
Published: (2024)
VLM-Guard: Safeguarding Vision-Language Models via Fulfilling Safety Alignment Gap
by: Liu, Qin, et al.
Published: (2025)
Concepts Worth Having: Refining VLM-Guided Concept Bottleneck Models with Minimal Annotations
by: Debole, Nicola, et al.
Published: (2026)
LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety
by: Yang, Junxiao, et al.
Published: (2026)
Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts
by: Tan, Andong, et al.
Published: (2024)
SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation
by: Du, Hao, et al.
Published: (2025)
Downlink and Uplink NOMA-ISAC with Signal Alignment
by: Zhao, Boqun, et al.
Published: (2023)
DSCA: Dynamic Subspace Concept Alignment for Lifelong VLM Editing
by: Das, Gyanendra, et al.
Published: (2026)
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
by: Zhou, Zhanhui, et al.
Published: (2024)
"See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models
by: Gu, Jihao, et al.
Published: (2025)
Enhancing Concept Localization in CLIP-based Concept Bottleneck Models
by: Kazmierczak, Rémi, et al.
Published: (2025)
ScVLM: Enhancing Vision-Language Model for Safety-Critical Event Understanding
by: Shi, Liang, et al.
Published: (2024)
V2C-CBM: Building Concept Bottlenecks with Vision-to-Concept Tokenizer
by: He, Hangzhou, et al.
Published: (2025)
FloorplanVLM: A Vision-Language Model for Floorplan Vectorization
by: Liu, Yuanqing, et al.
Published: (2026)
RE-VLM: Event-Augmented Vision-Language Model for Scene Understanding
by: Liu, Hanqing, et al.
Published: (2026)
Bayesian Concept Bottleneck Models with LLM Priors
by: Feng, Jean, et al.
Published: (2024)
Discovering Fine-Grained Visual-Concept Relations by Disentangled Optimal Transport Concept Bottleneck Models
by: Xie, Yan, et al.
Published: (2025)
Bodhi VLM: Privacy-Alignment Modeling for Hierarchical Visual Representations in Vision Backbones and VLM Encoders via Bottom-Up and Top-Down Feature Search
by: Ma, Bo, et al.
Published: (2026)
Debugging Concept Bottleneck Models through Removal and Retraining
by: Enouen, Eric, et al.
Published: (2025)
Guarding the Gate: ConceptGuard Battles Concept-Level Backdoors in Concept Bottleneck Models
by: Lai, Songning, et al.
Published: (2024)
CAT: Concept-level backdoor ATtacks for Concept Bottleneck Models
by: Lai, Songning, et al.
Published: (2024)
Adaptive Segment-level Reward: Bridging the Gap Between Action and Reward Space in Alignment
by: Li, Yanshi, et al.
Published: (2024)
There Was Never a Bottleneck in Concept Bottleneck Models
by: Almudévar, Antonio, et al.
Published: (2025)
The Alignment Bottleneck
by: Cao, Wenjun
Published: (2025)
On the Concept Trustworthiness in Concept Bottleneck Models
by: Huang, Qihan, et al.
Published: (2024)
Uncertainty-Aware Concept Bottleneck Models with Enhanced Interpretability
by: Zhang, Haifei, et al.
Published: (2025)
Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models
by: He, Yancheng, et al.
Published: (2024)
GraphVLM: Benchmarking Vision Language Models for Multimodal Graph Learning
by: Liu, Jiajin, et al.
Published: (2026)
PRISM: Robust VLM Alignment with Principled Reasoning for Integrated Safety in Multimodality
by: Li, Nanxi, et al.
Published: (2025)