| Main Authors: | Wang, Yuhang, Zhu, Yanxu, Kong, Chao, Wei, Shuyu, Yi, Xiaoyuan, Xie, Xing, Sang, Jitao |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2311.16421 |
Similar Items
Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning
by: Sang, Jitao, et al.
Published: (2024)
Self-Guided Defense: Adaptive Safety Alignment for Reasoning Models via Synthesized Guidelines
by: Wang, Yuhang, et al.
Published: (2025)
KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions
by: Zhu, Yanxu, et al.
Published: (2024)
Raising the Bar: Investigating the Values of Large Language Models via Generative Evolving Testing
by: Jiang, Han, et al.
Published: (2024)
Don't Command, Cultivate: An Exploratory Study of System-2 Alignment
by: Wang, Yuhang, et al.
Published: (2024)
Denevil: Towards Deciphering and Navigating the Ethical Values of Large Language Models via Instruction Learning
by: Duan, Shitong, et al.
Published: (2023)
PICACO: Pluralistic In-Context Value Alignment of LLMs via Total Correlation Optimization
by: Jiang, Han, et al.
Published: (2025)
Cultural Alignment in Large Language Models: An Explanatory Analysis Based on Hofstede's Cultural Dimensions
by: Masoud, Reem I., et al.
Published: (2023)
IROTE: Human-like Traits Elicitation of Large Language Model via In-Context Self-Reflective Optimization
by: Bai, Yuzhuo, et al.
Published: (2025)
Distributional Open-Ended Evaluation of LLM Cultural Value Alignment Based on Value Codebook
by: Lee, Jaehyeok, et al.
Published: (2026)
Reasoning Shapes Alignment: Investigating Cultural Alignment in Large Reasoning Models with Cultural Norms
by: Wang, Yuhang, et al.
Published: (2025)
The Incomplete Bridge: How AI Research (Mis)Engages with Psychology
by: Jiang, Han, et al.
Published: (2025)
AdAEM: An Adaptively and Automated Extensible Measurement of LLMs' Value Difference
by: Yao, Jing, et al.
Published: (2025)
Measuring Large Language Models Capacity to Annotate Journalistic Sourcing
by: Vincent, Subramaniam, et al.
Published: (2024)
Investigating Cultural Alignment of Large Language Models
by: AlKhamissi, Badr, et al.
Published: (2024)
On Classification with Large Language Models in Cultural Analytics
by: Bamman, David, et al.
Published: (2024)
WorldValuesBench: A Large-Scale Benchmark Dataset for Multi-Cultural Value Awareness of Language Models
by: Zhao, Wenlong, et al.
Published: (2024)
AuditWen: An Open-Source Large Language Model for Audit
by: Huang, Jiajia, et al.
Published: (2024)
Extrinsic Evaluation of Cultural Competence in Large Language Models
by: Bhatt, Shaily, et al.
Published: (2024)
MoVa: Towards Generalizable Classification of Human Morals and Values
by: Chen, Ziyu, et al.
Published: (2025)
Measuring Implicit Bias in Explicitly Unbiased Large Language Models
by: Bai, Xuechunzi, et al.
Published: (2024)
Unintended Harms of Value-Aligned LLMs: Psychological and Empirical Insights
by: Choi, Sooyung, et al.
Published: (2025)
WorldView-Bench: A Benchmark for Evaluating Global Cultural Perspectives in Large Language Models
by: Mushtaq, Abdullah, et al.
Published: (2025)
Benchmarking Political Persuasion Risks Across Frontier Large Language Models
by: Chen, Zhongren, et al.
Published: (2026)
Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora
by: Derner, Erik, et al.
Published: (2024)
Climate Change from Large Language Models
by: Zhu, Hongyin, et al.
Published: (2023)
ODE: Open-Set Evaluation of Hallucinations in Multimodal Large Language Models
by: Tu, Yahan, et al.
Published: (2024)
MoHoBench: Assessing Honesty of Multimodal Large Language Models via Unanswerable Visual Questions
by: Zhu, Yanxu, et al.
Published: (2025)
Urban Computing in the Era of Large Language Models
by: Li, Zhonghang, et al.
Published: (2025)
Characterizing Bias: Benchmarking Large Language Models in Simplified versus Traditional Chinese
by: Lyu, Hanjia, et al.
Published: (2025)
Nunchi-Bench: Benchmarking Language Models on Cultural Reasoning with a Focus on Korean Superstition
by: Kim, Kyuhee, et al.
Published: (2025)
Measuring Human Contribution in AI-Assisted Content Generation
by: Xie, Yueqi, et al.
Published: (2024)
VideoNorms: Benchmarking Cultural Awareness of Video Language Models
by: Varimalla, Nikhil Reddy, et al.
Published: (2025)
Evaluating Large Language Models on Spatial Tasks: A Multi-Task Benchmarking Study
by: Xu, Liuchang, et al.
Published: (2024)
WinoQueer: A Community-in-the-Loop Benchmark for Anti-LGBTQ+ Bias in Large Language Models
by: Felkner, Virginia K., et al.
Published: (2023)
Human-Level and Beyond: Benchmarking Large Language Models Against Clinical Pharmacists in Prescription Review
by: Yang, Yan, et al.
Published: (2025)
AccessEval: Benchmarking Disability Bias in Large Language Models
by: Panda, Srikant, et al.
Published: (2025)
DarkBench: Benchmarking Dark Patterns in Large Language Models
by: Kran, Esben, et al.
Published: (2025)
Invisible Filters: Cultural Bias in Hiring Evaluations Using Large Language Models
by: Rao, Pooja S. B., et al.
Published: (2025)
Training-Free Cultural Alignment of Large Language Models via Persona Disagreement
by: Kiet, Huynh Trung, et al.
Published: (2026)