| Main Authors: | Yao, Jing, Yi, Xiaoyuan, Duan, Shitong, Wang, Jindong, Bai, Yuzhuo, Huang, Muhua, Zhang, Peng, Lu, Tun, Dou, Zhicheng, Sun, Maosong, Xie, Xing |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.07071 |
Similar Items
AdAEM: An Adaptively and Automated Extensible Measurement of LLMs' Value Difference
by: Yao, Jing, et al.
Published: (2025)
IROTE: Human-like Traits Elicitation of Large Language Model via In-Context Self-Reflective Optimization
by: Bai, Yuzhuo, et al.
Published: (2025)
Denevil: Towards Deciphering and Navigating the Ethical Values of Large Language Models via Instruction Learning
by: Duan, Shitong, et al.
Published: (2023)
CAReDiO: Cultural Alignment via Representativeness and Distinctiveness Guided Data Optimization
by: Yao, Jing, et al.
Published: (2025)
On the Dynamics of Multi-Agent LLM Communities Driven by Value Diversity
by: Huang, Muhua, et al.
Published: (2025)
CLAVE: An Adaptive Framework for Evaluating Values of LLM Generated Responses
by: Yao, Jing, et al.
Published: (2024)
The Road to Artificial SuperIntelligence: A Comprehensive Survey of Superalignment
by: Kim, HyunJin, et al.
Published: (2024)
Beyond Human Norms: Unveiling Unique Values of Large Language Models through Interdisciplinary Approaches
by: Biedma, Pablo, et al.
Published: (2024)
Unintended Harms of Value-Aligned LLMs: Psychological and Empirical Insights
by: Choi, Sooyung, et al.
Published: (2025)
Negating Negatives: Alignment with Human Negative Samples via Distributional Dispreference Optimization
by: Duan, Shitong, et al.
Published: (2024)
MoHoBench: Assessing Honesty of Multimodal Large Language Models via Unanswerable Visual Questions
by: Zhu, Yanxu, et al.
Published: (2025)
On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models
by: Wang, Xinpeng, et al.
Published: (2024)
Distributional Open-Ended Evaluation of LLM Cultural Value Alignment Based on Value Codebook
by: Lee, Jaehyeok, et al.
Published: (2026)
Counterfactual Reasoning for Steerable Pluralistic Value Alignment of Large Language Models
by: Guo, Hanze, et al.
Published: (2025)
PICACO: Pluralistic In-Context Value Alignment of LLMs via Total Correlation Optimization
by: Jiang, Han, et al.
Published: (2025)
Leveraging Implicit Sentiments: Enhancing Reliability and Validity in Psychological Trait Evaluation of LLMs
by: Ma, Huanhuan, et al.
Published: (2025)
Can Persona-Prompted LLMs Emulate Subgroup Values? An Empirical Analysis of Generalisability and Fairness in Cultural Alignment
by: Tan, Bryan Chen Zhengyu, et al.
Published: (2026)
Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization
by: Wang, Xingqi, et al.
Published: (2024)
MoVa: Towards Generalizable Classification of Human Morals and Values
by: Chen, Ziyu, et al.
Published: (2025)
Research Superalignment Should Advance Now with Alternating Competence and Conformity Optimization
by: Kim, HyunJin, et al.
Published: (2025)
ValueCompass: A Framework for Measuring Contextual Value Alignment Between Human and LLMs
by: Shen, Hua, et al.
Published: (2024)
DailyQA: A Benchmark to Evaluate Web Retrieval Augmented LLMs Based on Capturing Real-World Changes
by: Cheng, Jiehan, et al.
Published: (2025)
LLM-GLOBE: A Benchmark Evaluating the Cultural Values Embedded in LLM Output
by: Karinshak, Elise, et al.
Published: (2024)
Human Values Matter: Investigating How Misalignment Shapes Collective Behaviors in LLM Agent Communities
by: Zhang, Xiangxu, et al.
Published: (2026)
AI Evaluation Should Require Standardized Item-Level Data Releases
by: Jiang, Han, et al.
Published: (2026)
Computational Multi-Agents Society Experiments: Social Modeling Framework Based on Generative Agents
by: Zhang, Hanzhong, et al.
Published: (2025)
Raising the Bar: Investigating the Values of Large Language Models via Generative Evolving Testing
by: Jiang, Han, et al.
Published: (2024)
Safety Instincts: LLMs Learn to Trust Their Internal Compass for Self-Defense
by: Shen, Guobin, et al.
Published: (2025)
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution
by: Cao, Maosong, et al.
Published: (2024)
OpenCompass: A Universal Evaluation Platform for Large Language Models
by: Cao, Maosong, et al.
Published: (2026)
Spatio-Temporal Autoregressions for High Dimensional Matrix-Valued Time Series
by: Dou, Baojun, et al.
Published: (2025)
The Transmission Value of Energy Storage and Fundamental Limitations
by: Zhang, Qian, et al.
Published: (2024)
SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization
by: Hu, Xixu, et al.
Published: (2024)
InFi-Check: Interpretable and Fine-Grained Fact-Checking of LLMs
by: Bai, Yuzhuo, et al.
Published: (2026)
Beyond Benchmark: LLMs Evaluation with an Anthropomorphic and Value-oriented Roadmap
by: Wang, Jun, et al.
Published: (2025)
Flames: Benchmarking Value Alignment of LLMs in Chinese
by: Huang, Kexin, et al.
Published: (2023)
STEP: A Unified Spiking Transformer Evaluation Platform for Fair and Reproducible Benchmarking
by: Shen, Sicheng, et al.
Published: (2025)
Edit-Compass & EditReward-Compass: A Unified Benchmark for Image Editing and Reward Modeling
by: Bai, Xuehai, et al.
Published: (2026)
Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values
by: Dong, Haonan, et al.
Published: (2026)
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
by: Röttger, Paul, et al.
Published: (2024)