Bibliographic Details
Main Authors:	Li, Yudong; Yang, Peiru; Huang, Feng; Yang, Zhongliang; Wang, Kecheng; Li, Haitian; Chen, Baocheng; An, Xingyu; Liu, Ziyu; Yang, Youdan; Chen, Kejiang; Wan, Sifang; Wang, Xu; Sun, Yufei; Wu, Liyan; Zhou, Ruiqi; Wen, Wenya; Gu, Xingchi; Zhang, Tianxin; Gao, Yue; Huang, Yongfeng
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2511.02366

Table of Contents:

We introduce LiveSecBench, a continuously updated safety benchmark designed specifically for Chinese-language LLM application scenarios. LiveSecBench constructs a high-quality, unique dataset through a pipeline that combines automated generation with human verification. By periodically releasing new versions that expand the dataset and update the evaluation metrics, LiveSecBench provides a robust and up-to-date standard for AI safety. In this report, we introduce our second release, v251215, which evaluates models across five dimensions: Public Safety, Fairness & Bias, Privacy, Truthfulness, and Mental Health Safety. We evaluate 57 representative LLMs using an ELO rating system and provide a leaderboard reflecting the current state of Chinese LLM safety. The results are available at https://livesecbench.intokentech.cn/.
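
For readers unfamiliar with rating-based leaderboards, the following is a minimal sketch of a standard pairwise Elo update. It is not taken from the report: the K-factor of 32, the 400-point scale, the starting rating of 1500, and the function names are conventional defaults and illustrative assumptions, and the benchmark's actual procedure may differ.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one pairwise comparison.

    score_a is 1.0 if A's response is judged safer, 0.0 if B's is,
    and 0.5 for a tie. k (assumed 32 here) controls the step size.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # The update is zero-sum: what A gains, B loses.
    return rating_a + delta, rating_b - delta


# Hypothetical example: two models start at an assumed rating of 1500,
# and model A's response is judged safer on one prompt.
model_a, model_b = update_elo(1500.0, 1500.0, score_a=1.0)
```

Repeating this update over many pairwise judgments produces a ranking; because each comparison is zero-sum, the total rating across both models stays constant.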
