Vista Equipo: :: Library Catalog

Guardado en:

Detalles Bibliográficos
Autores principales:	Liu, Feiyan, Zhao, Siyan, Zhuo, Chenxun, Liu, Tianming, Ge, Bao
Formato:	Preprint
Publicado:	2026
Materias:	Computation and Language
Acceso en línea:	https://arxiv.org/abs/2601.02830
Etiquetas:	Agregar Etiqueta Sin Etiquetas, Sea el primero en etiquetar este registro!

_version_	1866918274962292736
author	Liu, Feiyan Zhao, Siyan Zhuo, Chenxun Liu, Tianming Ge, Bao
author_facet	Liu, Feiyan Zhao, Siyan Zhuo, Chenxun Liu, Tianming Ge, Bao
contents	Cultural backgrounds shape individuals' perspectives and approaches to problem-solving. Since the emergence of GPT-1 in 2018, large language models (LLMs) have undergone rapid development. To date, the world's ten leading LLM developers are primarily based in China and the United States. To examine whether LLMs released by Chinese and U.S. developers exhibit cultural differences in Chinese-language settings, we evaluate their performance on questions about Chinese culture. This study adopts a direct-questioning paradigm to evaluate models such as GPT-5.1, DeepSeek-V3.2, Qwen3-Max, and Gemini2.5Pro. We assess their understanding of traditional Chinese culture, including history, literature, poetry, and related domains. Comparative analyses between LLMs developed in China and the U.S. indicate that Chinese models generally outperform their U.S. counterparts on these tasks. Among U.S.-developed models, Gemini 2.5Pro and GPT-5.1 achieve relatively higher accuracy. The observed performance differences may potentially arise from variations in training data distribution, localization strategies, and the degree of emphasis on Chinese cultural content during model development.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_02830
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	The performances of the Chinese and U.S. Large Language Models on the Topic of Chinese Culture Liu, Feiyan Zhao, Siyan Zhuo, Chenxun Liu, Tianming Ge, Bao Computation and Language Cultural backgrounds shape individuals' perspectives and approaches to problem-solving. Since the emergence of GPT-1 in 2018, large language models (LLMs) have undergone rapid development. To date, the world's ten leading LLM developers are primarily based in China and the United States. To examine whether LLMs released by Chinese and U.S. developers exhibit cultural differences in Chinese-language settings, we evaluate their performance on questions about Chinese culture. This study adopts a direct-questioning paradigm to evaluate models such as GPT-5.1, DeepSeek-V3.2, Qwen3-Max, and Gemini2.5Pro. We assess their understanding of traditional Chinese culture, including history, literature, poetry, and related domains. Comparative analyses between LLMs developed in China and the U.S. indicate that Chinese models generally outperform their U.S. counterparts on these tasks. Among U.S.-developed models, Gemini 2.5Pro and GPT-5.1 achieve relatively higher accuracy. The observed performance differences may potentially arise from variations in training data distribution, localization strategies, and the degree of emphasis on Chinese cultural content during model development.
title	The performances of the Chinese and U.S. Large Language Models on the Topic of Chinese Culture
topic	Computation and Language
url	https://arxiv.org/abs/2601.02830

Ejemplares similares