| Main Authors: | Song, Jiayang, Huang, Yuheng, Zhou, Zhehua, Ma, Lei |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2407.07342 |
Similar Items
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward
by: Xie, Xuan, et al.
Published: (2024)
TRUSTVIS: A Multi-Dimensional Trustworthiness Evaluation Framework for Large Language Models
by: Sun, Ruoyu, et al.
Published: (2025)
LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation
by: Wang, Zhijie, et al.
Published: (2024)
VLATest: Testing and Evaluating Vision-Language-Action Models for Robotic Manipulation
by: Wang, Zhijie, et al.
Published: (2024)
Evaluating LLMs on Sequential API Call Through Automated Test Generation
by: Huang, Yuheng, et al.
Published: (2025)
AcTracer: Active Testing of Large Language Model via Multi-Stage Sampling
by: Huang, Yuheng, et al.
Published: (2024)
Look Before You Leap: An Exploratory Study of Uncertainty Measurement for Large Language Models
by: Huang, Yuheng, et al.
Published: (2023)
LeCov: Multi-level Testing Criteria for Large Language Models
by: Xie, Xuan, et al.
Published: (2024)
LUNA: A Model-Based Universal Analysis Framework for Large Language Models
by: Song, Da, et al.
Published: (2023)
Align Once, Benefit Multilingually: Enforcing Multilingual Consistency for LLM Safety Alignment
by: Bu, Yuyan, et al.
Published: (2026)
GenSafe: A Generalizable Safety Enhancer for Safe Reinforcement Learning Algorithms Based on Reduced Order Markov Decision Process Model
by: Zhou, Zhehua, et al.
Published: (2024)
SafetyBench: Evaluating the Safety of Large Language Models
by: Zhang, Zhexin, et al.
Published: (2023)
Evaluating Implicit Regulatory Compliance in LLM Tool Invocation via Logic-Guided Synthesis
by: Song, Da, et al.
Published: (2026)
Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment
by: Banerjee, Somnath, et al.
Published: (2025)
Improving LLM Safety Alignment with Dual-Objective Optimization
by: Zhao, Xuandong, et al.
Published: (2025)
Bridging the Multilingual Safety Divide: Efficient, Culturally-Aware Alignment for Global South Languages
by: Banerjee, Somnath, et al.
Published: (2026)
Agent-SafetyBench: Evaluating the Safety of LLM Agents
by: Zhang, Zhexin, et al.
Published: (2024)
Evaluation and Improvement of Fault Detection for Large Language Models
by: Hu, Qiang, et al.
Published: (2024)
SLAM: Towards Efficient Multilingual Reasoning via Selective Language Alignment
by: Fan, Yuchun, et al.
Published: (2025)
MPO: Multilingual Safety Alignment via Reward Gap Optimization
by: Zhao, Weixiang, et al.
Published: (2025)
MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing
by: Zhou, Hao, et al.
Published: (2024)
Code-Switching Red-Teaming: LLM Evaluation for Safety and Multilingual Understanding
by: Yoo, Haneul, et al.
Published: (2024)
The State of Multilingual LLM Safety Research: From Measuring the Language Gap to Mitigating It
by: Yong, Zheng-Xin, et al.
Published: (2025)
Multilingual Safety Alignment via Self-Distillation
by: Qin, Ruiyang, et al.
Published: (2026)
Could Thinking Multilingually Empower LLM Reasoning?
by: Gao, Changjiang, et al.
Published: (2025)
BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models
by: Huang, Xu, et al.
Published: (2025)
IndicSafe: A Benchmark for Evaluating Multilingual LLM Safety in South Asia
by: Pattnayak, Priyaranjan, et al.
Published: (2026)
Language Model Alignment in Multilingual Trolley Problems
by: Jin, Zhijing, et al.
Published: (2024)
Towards Multilingual LLM Evaluation for European Languages
by: Thellmann, Klaudia, et al.
Published: (2024)
All Languages Matter: On the Multilingual Safety of Large Language Models
by: Wang, Wenxuan, et al.
Published: (2023)
EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation
by: Tonja, Atnafu Lambebo, et al.
Published: (2024)
Multimodal Cultural Safety: Evaluation Framework and Alignment Strategies
by: Qiu, Haoyi, et al.
Published: (2025)
The Perfect Blend: Redefining RLHF with Mixture of Judges
by: Xu, Tengyu, et al.
Published: (2024)
AlignX: Advancing Multilingual Large Language Models with Multilingual Representation Alignment
by: Bu, Mengyu, et al.
Published: (2025)
CM-Align: Consistency-based Multilingual Alignment for Large Language Models
by: Zhang, Xue, et al.
Published: (2025)
Collab: Controlled Decoding using Mixture of Agents for LLM Alignment
by: Chakraborty, Souradip, et al.
Published: (2025)
LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety
by: Yang, Junxiao, et al.
Published: (2026)
Multilingual != Multicultural: Evaluating Gaps Between Multilingual Capabilities and Cultural Alignment in LLMs
by: Rystrøm, Jonathan, et al.
Published: (2025)
MAPO: Advancing Multilingual Reasoning through Multilingual Alignment-as-Preference Optimization
by: She, Shuaijie, et al.
Published: (2024)
Safety Is Not Universal: The Selective Safety Trap in LLM Alignment
by: Brito, Iago Alves, et al.
Published: (2026)