:: Library Catalog

Bibliographic Details
Main Author:	Lee, Wilson Y.
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2601.09084

Similar Items

Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments
by: Zhou, Han, et al.
Published: (2024)

Augmenting Human Evaluation with LLM Judges: How Many Human Reviews Do You Need?
by: Kim, Jane Paik
Published: (2026)

Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment
by: Kim, Dongyoung, et al.
Published: (2024)

Comparing Human and AI Rater Effects Using the Many-Facet Rasch Model
by: Jiao, Hong, et al.
Published: (2025)

Bridging Human and LLM Judgments: Understanding and Narrowing the Gap
by: Polo, Felipe Maia, et al.
Published: (2025)

Cat, Rat, Meow: On the Alignment of Language Model and Human Term-Similarity Judgments
by: Linhardt, Lorenz, et al.
Published: (2025)

Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input
by: Peng, Andi, et al.
Published: (2024)

Out of One, Many: Using Language Models to Simulate Human Samples
by: Argyle, Lisa P., et al.
Published: (2022)

Vibe Checker: Aligning Code Evaluation with Human Preference
by: Zhong, Ming, et al.
Published: (2025)

Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators
by: Liu, Yinhong, et al.
Published: (2024)

Capable but Unreliable: Canonical Path Deviation as a Causal Mechanism of Agent Failure in Long-Horizon Tasks
by: Lee, Wilson Y.
Published: (2026)

I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts from Data?
by: Liu, Yuhang, et al.
Published: (2025)

Beyond Preferences: Learning Alignment Principles Grounded in Human Reasons and Values
by: Bell, Henry, et al.
Published: (2026)

COPR: Continual Learning Human Preference through Optimal Policy Regularization
by: Zhang, Han, et al.
Published: (2023)

Aligning Black-box Language Models with Human Judgments
by: Burg, Gerrit J. J. van den, et al.
Published: (2025)

ULMA: Unified Language Model Alignment with Human Demonstration and Point-wise Preference
by: Cai, Tianchi, et al.
Published: (2023)

DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization
by: Zhou, Zhenglin, et al.
Published: (2025)

Optimizing Language Models for Human Preferences is a Causal Inference Problem
by: Lin, Victoria, et al.
Published: (2024)

Whose Preferences? Differences in Fairness Preferences and Their Impact on the Fairness of AI Utilizing Human Feedback
by: Lerner, Emilia Agis, et al.
Published: (2024)

MaxMin-RLHF: Alignment with Diverse Human Preferences
by: Chakraborty, Souradip, et al.
Published: (2024)

Uncovering Gaps in How Humans and LLMs Interpret Subjective Language
by: Jones, Erik, et al.
Published: (2025)

Sentence-level Reward Model can Generalize Better for Aligning LLM from Human Preference
by: Qiu, Wenjie, et al.
Published: (2025)

Reasoning Capacity in Multi-Agent Systems: Limitations, Challenges and Human-Centered Solutions
by: Pezeshkpour, Pouya, et al.
Published: (2024)

HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages
by: Wang, Zhilin, et al.
Published: (2025)

DreamReward: Text-to-3D Generation with Human Preference
by: Ye, Junliang, et al.
Published: (2024)

Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls
by: Upasani, Shubhangi, et al.
Published: (2026)

Approximating Human Preferences Using a Multi-Judge Learned System
by: Sprejer, Eitán, et al.
Published: (2025)

Re-evaluating Automatic LLM System Ranking for Alignment with Human Preference
by: Gao, Mingqi, et al.
Published: (2024)

Benchmarking LLMs' Judgments with No Gold Standard
by: Xu, Shengwei, et al.
Published: (2024)

Training and Evaluating with Human Label Variation: An Empirical Study
by: Kurniawan, Kemal, et al.
Published: (2025)

Screening Is Enough
by: Nakanishi, Ken M.
Published: (2026)

One Goal, Many Challenges: Robust Preference Optimization Amid Content-Aware and Multi-Source Noise
by: Afzali, Amirabbas, et al.
Published: (2025)

Fact or Guesswork? Evaluating Large Language Models' Medical Knowledge with Structured One-Hop Judgments
by: Li, Jiaxi, et al.
Published: (2025)

Alignment Tampering: How Reinforcement Learning from Human Feedback Is Exploited to Optimize Misaligned Biases
by: Hahm, Dongyoon, et al.
Published: (2026)

Evaluating the Unseen Capabilities: How Many Theorems Do LLMs Know?
by: Li, Xiang, et al.
Published: (2025)

Decoding the Ear: A Framework for Objectifying Expressiveness from Human Preference Through Efficient Alignment
by: Lin, Zhiyu, et al.
Published: (2025)

RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
by: Dang, John, et al.
Published: (2024)

Linking In-context Learning in Transformers to Human Episodic Memory
by: Ji-An, Li, et al.
Published: (2024)

Aligning Large Language Models by On-Policy Self-Judgment
by: Lee, Sangkyu, et al.
Published: (2024)

Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning
by: Poddar, Sriyash, et al.
Published: (2024)