| Main Authors: | Zheng, Baolin, Chen, Guanlin, Zhong, Hongqiong, Teng, Qingyang, Tan, Yingshui, Liu, Zhendong, Wang, Weixun, Liu, Jiaheng, Yang, Jian, Jing, Huiyun, Wei, Jincheng, Su, Wenbo, Zhu, Xiaoyong, Zheng, Bo, Zhang, Kaifu |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.23793 |
Similar Items
Beyond Safe Answers: A Benchmark for Evaluating True Risk Awareness in Large Reasoning Models
by: Zheng, Baihui, et al.
Published: (2025)
Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models
by: Tan, Yingshui, et al.
Published: (2024)
Equilibrate RLHF: Towards Balancing Helpfulness-Safety Trade-off in Large Language Models
by: Tan, Yingshui, et al.
Published: (2025)
PSA-VLM: Enhancing Vision-Language Model Safety through Progressive Concept-Bottleneck-Driven Alignment
by: Liu, Zhendong, et al.
Published: (2024)
Safety Alignment for Vision Language Models
by: Liu, Zhendong, et al.
Published: (2024)
MSR-Align: Policy-Grounded Multimodal Alignment for Safety-Aware Reasoning in Vision-Language Models
by: Xia, Yinan, et al.
Published: (2025)
ProgCo: Program Helps Self-Correction of Large Language Models
by: Song, Xiaoshuai, et al.
Published: (2025)
Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models
by: He, Yancheng, et al.
Published: (2024)
ConceptGuard: Proactive Safety in Text-and-Image-to-Video Generation through Multimodal Risk Detection
by: Ma, Ruize, et al.
Published: (2025)
Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation
by: Luo, Yijia, et al.
Published: (2025)
QuadSentinel: Sequent Safety for Machine-Checkable Control in Multi-agent Systems
by: Yang, Yiliu, et al.
Published: (2025)
Think-J: Learning to Think for Generative LLM-as-a-Judge
by: Huang, Hui, et al.
Published: (2025)
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
by: He, Yancheng, et al.
Published: (2025)
2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision
by: Li, Shilong, et al.
Published: (2024)
CE-RM: A Pointwise Generative Reward Model Optimized via Two-Stage Rollout and Unified Criteria
by: Hu, Xinyu, et al.
Published: (2026)
HiddenDetect: Detecting Jailbreak Attacks against Large Vision-Language Models via Monitoring Hidden States
by: Jiang, Yilei, et al.
Published: (2025)
RapGuard: Safeguarding Multimodal Large Language Models via Rationale-aware Defensive Prompting
by: Jiang, Yilei, et al.
Published: (2024)
CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models
by: Zhang, Alexander, et al.
Published: (2025)
USB: Unified Synthetic Brain Framework for Bidirectional Pathology-Healthy Generation and Editing
by: Wang, Jun, et al.
Published: (2025)
DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models
by: Liu, Jianyu, et al.
Published: (2025)
"See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models
by: Gu, Jihao, et al.
Published: (2025)
Adaptive Segment-level Reward: Bridging the Gap Between Action and Reward Space in Alignment
by: Li, Yanshi, et al.
Published: (2024)
EmergentBridge: Improving Zero-Shot Cross-Modal Transfer in Unified Multimodal Embedding Models
by: Xie, Jincheng, et al.
Published: (2026)
UniSAFE: A Comprehensive Benchmark for Safety Evaluation of Unified Multimodal Models
by: Lee, Segyu, et al.
Published: (2026)
Comprehensive Review on Oleogels Structured by Lipid‐Based Compounds: From Structure Mechanisms to Nutritional Functionalities and Applications in Food Industry
by: Qianyu Le, et al.
Published: (2025)
One Sample to Rule Them All: Extreme Data Efficiency in Multidiscipline Reasoning with Reinforcement Learning
by: Li, Yiyuan, et al.
Published: (2026)
MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
by: Xie, Wulin, et al.
Published: (2025)
Molecular Epidemiology and Antimicrobial Resistance of Klebsiella pneumoniae Strains Isolated From Dairy Cows in Xinjiang, China
by: Kuojun Cai, et al.
Published: (2024)
AIR: Complex Instruction Generation via Automatic Iterative Refinement
by: Liu, Wei, et al.
Published: (2025)
MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
by: Bai, Ge, et al.
Published: (2024)
High-Fidelity Mural Restoration via a Unified Hybrid Mask-Aware Transformer
by: Jiang, Jincheng, et al.
Published: (2026)
Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark
by: Zou, Kai, et al.
Published: (2025)
El Ágora USB
Published: (2009)
iTryOn: Mastering Interactive Video Virtual Try-On with Spatial-Semantic Guidance
by: Zheng, Jun, et al.
Published: (2026)
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning
by: Liu, Zihe, et al.
Published: (2025)
MULTIBENCH++: A Unified and Comprehensive Multimodal Fusion Benchmarking Across Specialized Domains
by: Xue, Leyan, et al.
Published: (2025)
Unified Linear Parametric Map Modeling and Perception-aware Trajectory Planning for Mobile Robotics
by: Nie, Hongyu, et al.
Published: (2025)
MERBench: A Unified Evaluation Benchmark for Multimodal Emotion Recognition
by: Lian, Zheng, et al.
Published: (2024)
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models
by: Wu, Yanan, et al.
Published: (2024)
Complementary Reinforcement Learning
by: Muhtar, Dilxat, et al.
Published: (2026)