:: Library Catalog

Bibliographic Details
Main Authors:	Yao, Jing; Yi, Xiaoyuan; Duan, Shitong; Wang, Jindong; Bai, Yuzhuo; Huang, Muhua; Zhang, Peng; Lu, Tun; Dou, Zhicheng; Sun, Maosong; Xie, Xing
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2501.07071

Similar Items

AdAEM: An Adaptively and Automated Extensible Measurement of LLMs' Value Difference
by: Yao, Jing, et al.
Published: (2025)

IROTE: Human-like Traits Elicitation of Large Language Model via In-Context Self-Reflective Optimization
by: Bai, Yuzhuo, et al.
Published: (2025)

Denevil: Towards Deciphering and Navigating the Ethical Values of Large Language Models via Instruction Learning
by: Duan, Shitong, et al.
Published: (2023)

CAReDiO: Cultural Alignment via Representativeness and Distinctiveness Guided Data Optimization
by: Yao, Jing, et al.
Published: (2025)

On the Dynamics of Multi-Agent LLM Communities Driven by Value Diversity
by: Huang, Muhua, et al.
Published: (2025)

CLAVE: An Adaptive Framework for Evaluating Values of LLM Generated Responses
by: Yao, Jing, et al.
Published: (2024)

The Road to Artificial SuperIntelligence: A Comprehensive Survey of Superalignment
by: Kim, HyunJin, et al.
Published: (2024)

Beyond Human Norms: Unveiling Unique Values of Large Language Models through Interdisciplinary Approaches
by: Biedma, Pablo, et al.
Published: (2024)

Unintended Harms of Value-Aligned LLMs: Psychological and Empirical Insights
by: Choi, Sooyung, et al.
Published: (2025)

Negating Negatives: Alignment with Human Negative Samples via Distributional Dispreference Optimization
by: Duan, Shitong, et al.
Published: (2024)

MoHoBench: Assessing Honesty of Multimodal Large Language Models via Unanswerable Visual Questions
by: Zhu, Yanxu, et al.
Published: (2025)

On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models
by: Wang, Xinpeng, et al.
Published: (2024)

Distributional Open-Ended Evaluation of LLM Cultural Value Alignment Based on Value Codebook
by: Lee, Jaehyeok, et al.
Published: (2026)

Counterfactual Reasoning for Steerable Pluralistic Value Alignment of Large Language Models
by: Guo, Hanze, et al.
Published: (2025)

PICACO: Pluralistic In-Context Value Alignment of LLMs via Total Correlation Optimization
by: Jiang, Han, et al.
Published: (2025)

Leveraging Implicit Sentiments: Enhancing Reliability and Validity in Psychological Trait Evaluation of LLMs
by: Ma, Huanhuan, et al.
Published: (2025)

Can Persona-Prompted LLMs Emulate Subgroup Values? An Empirical Analysis of Generalisability and Fairness in Cultural Alignment
by: Tan, Bryan Chen Zhengyu, et al.
Published: (2026)

Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization
by: Wang, Xingqi, et al.
Published: (2024)

MoVa: Towards Generalizable Classification of Human Morals and Values
by: Chen, Ziyu, et al.
Published: (2025)

Research Superalignment Should Advance Now with Alternating Competence and Conformity Optimization
by: Kim, HyunJin, et al.
Published: (2025)

ValueCompass: A Framework for Measuring Contextual Value Alignment Between Human and LLMs
by: Shen, Hua, et al.
Published: (2024)

DailyQA: A Benchmark to Evaluate Web Retrieval Augmented LLMs Based on Capturing Real-World Changes
by: Cheng, Jiehan, et al.
Published: (2025)

LLM-GLOBE: A Benchmark Evaluating the Cultural Values Embedded in LLM Output
by: Karinshak, Elise, et al.
Published: (2024)

Human Values Matter: Investigating How Misalignment Shapes Collective Behaviors in LLM Agent Communities
by: Zhang, Xiangxu, et al.
Published: (2026)

AI Evaluation Should Require Standardized Item-Level Data Releases
by: Jiang, Han, et al.
Published: (2026)

Computational Multi-Agents Society Experiments: Social Modeling Framework Based on Generative Agents
by: Zhang, Hanzhong, et al.
Published: (2025)

Raising the Bar: Investigating the Values of Large Language Models via Generative Evolving Testing
by: Jiang, Han, et al.
Published: (2024)

Safety Instincts: LLMs Learn to Trust Their Internal Compass for Self-Defense
by: Shen, Guobin, et al.
Published: (2025)

CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution
by: Cao, Maosong, et al.
Published: (2024)

OpenCompass: A Universal Evaluation Platform for Large Language Models
by: Cao, Maosong, et al.
Published: (2026)

Spatio-Temporal Autoregressions for High Dimensional Matrix-Valued Time Series
by: Dou, Baojun, et al.
Published: (2025)

The Transmission Value of Energy Storage and Fundamental Limitations
by: Zhang, Qian, et al.
Published: (2024)

SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization
by: Hu, Xixu, et al.
Published: (2024)

InFi-Check: Interpretable and Fine-Grained Fact-Checking of LLMs
by: Bai, Yuzhuo, et al.
Published: (2026)

Beyond Benchmark: LLMs Evaluation with an Anthropomorphic and Value-oriented Roadmap
by: Wang, Jun, et al.
Published: (2025)

Flames: Benchmarking Value Alignment of LLMs in Chinese
by: Huang, Kexin, et al.
Published: (2023)

STEP: A Unified Spiking Transformer Evaluation Platform for Fair and Reproducible Benchmarking
by: Shen, Sicheng, et al.
Published: (2025)

Edit-Compass & EditReward-Compass: A Unified Benchmark for Image Editing and Reward Modeling
by: Bai, Xuehai, et al.
Published: (2026)

Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values
by: Dong, Haonan, et al.
Published: (2026)

Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
by: Röttger, Paul, et al.
Published: (2024)