Bibliographic Details
Main Authors: Li, Yudong, Yang, Peiru, Huang, Feng, Yang, Zhongliang, Wang, Kecheng, Li, Haitian, Chen, Baocheng, An, Xingyu, Liu, Ziyu, Yang, Youdan, Chen, Kejiang, Wan, Sifang, Wang, Xu, Sun, Yufei, Wu, Liyan, Zhou, Ruiqi, Wen, Wenya, Gu, Xingchi, Zhang, Tianxin, Gao, Yue, Huang, Yongfeng
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2511.02366
Table of Contents:
  • We introduce LiveSecBench, a continuously updated safety benchmark designed for Chinese-language LLM application scenarios. LiveSecBench constructs a high-quality, unique dataset through a pipeline that combines automated generation with human verification. By periodically releasing new versions that expand the dataset and update the evaluation metrics, LiveSecBench provides a robust and up-to-date standard for AI safety. This report presents the second release, v251215, which evaluates models across five dimensions: Public Safety, Fairness & Bias, Privacy, Truthfulness, and Mental Health Safety. We evaluate 57 representative LLMs using an ELO rating system and provide a leaderboard of the current state of Chinese LLM safety. The results are available at https://livesecbench.intokentech.cn/.
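The abstract mentions an ELO rating system but gives no details of it. As a rough illustration only, the sketch below shows the textbook Elo update for one pairwise comparison between two models. The function names, the K-factor of 32, the 400-point scale, and the starting rating of 1500 are assumptions for illustration and are not taken from the LiveSecBench report.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Textbook Elo expected score of A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A's response is judged safer, 0.0 if B's is,
    and 0.5 for a tie. k=32 is an assumed K-factor, not the paper's value.
    """
    exp_a = expected_score(rating_a, rating_b)
    exp_b = 1.0 - exp_a
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - exp_b)
    return new_a, new_b


if __name__ == "__main__":
    # Two hypothetical models start at the same assumed rating; A wins once.
    a, b = update_elo(1500.0, 1500.0, score_a=1.0)
    print(a, b)
```

In this scheme the two ratings always move by equal and opposite amounts, so the total rating in the pool is conserved, and an upset win against a higher-rated model moves the ratings more than an expected win.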