Library Catalog

Bibliographic Details
Main Authors:	Karnysheva, Anna; Drescher, Christian; Klakow, Dietrich
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2504.15719

Similar Items

Improving Semantic Understanding in Speech Language Models via Brain-tuning
by: Moussa, Omer, et al.
Published: (2024)

Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
by: Wang, Haoxiang, et al.
Published: (2024)

PricingLogic: Evaluating LLMs Reasoning on Complex Tourism Pricing Tasks
by: Liu, Yunuo, et al.
Published: (2025)

Chemical Language Models for Natural Products: A State-Space Model Approach
by: Wang, Ho-Hsuan, et al.
Published: (2026)

Unplugging a Seemingly Sentient Machine Is the Rational Choice -- A Metaphysical Perspective
by: Bekkers, Erik J, et al.
Published: (2026)

Sparks of Rationality: Do Reasoning LLMs Align with Human Judgment and Choice?
by: Tak, Ala N., et al.
Published: (2026)

Routing, Cascades, and User Choice for LLMs
by: Mahmood, Rafid
Published: (2026)

Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time
by: Chehade, Mohamad, et al.
Published: (2025)

Classical AI vs. LLMs for Decision-Maker Alignment in Health Insurance Choices
by: Mainali, Mallika, et al.
Published: (2025)

Is On-Policy Data always the Best Choice for Direct Preference Optimization-based LM Alignment?
by: Sun, Zetian, et al.
Published: (2025)

From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment
by: Li, Jia-Nan, et al.
Published: (2025)

Learning a Reward Function for User-Preferred Appliance Scheduling
by: Čović, Nikolina, et al.
Published: (2023)

Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
by: Zhang, Shenao, et al.
Published: (2024)

Sample Efficient Preference Alignment in LLMs via Active Exploration
by: Mehta, Viraj, et al.
Published: (2023)

Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering
by: Zhang, Yichi, et al.
Published: (2023)

Federated Variational Preference Alignment with Gumbel-Softmax Prior for Personalized User Preferences
by: Koo, Jabin, et al.
Published: (2026)

APPA: Adaptive Preference Pluralistic Alignment for Fair Federated RLHF of LLMs
by: Srewa, Mahmoud, et al.
Published: (2026)

When Weak LLMs Speak with Confidence, Preference Alignment Gets Stronger
by: Afzali, Amirabbas, et al.
Published: (2026)

MALLES: A Multi-agent LLMs-based Economic Sandbox with Consumer Preference Alignment
by: Wu, Yusen, et al.
Published: (2026)

Transformers for molecular property prediction: Domain adaptation efficiently improves performance
by: Sultan, Afnan, et al.
Published: (2025)

HPS: Hard Preference Sampling for Human Preference Alignment
by: Zou, Xiandong, et al.
Published: (2025)

Beyond Preferences in AI Alignment
by: Zhi-Xuan, Tan, et al.
Published: (2024)

The Impact of Demonstrations on Multilingual In-Context Learning: A Multidimensional Analysis
by: Zhang, Miaoran, et al.
Published: (2024)

A Systematic Evaluation of Preference Aggregation in Federated RLHF for Pluralistic Alignment of LLMs
by: Srewa, Mahmoud, et al.
Published: (2025)

GEM: Generative Entropy-Guided Preference Modeling for Few-shot Alignment of LLMs
by: Zhao, Yiyang, et al.
Published: (2025)

Meta-Aligner: Bidirectional Preference-Policy Optimization for Multi-Objective LLMs Alignment
by: Xu, Wenzhe, et al.
Published: (2026)

Strong Preferences Affect the Robustness of Preference Models and Value Alignment
by: Xu, Ziwei, et al.
Published: (2024)

Resource Rational Contractualism Should Guide AI Alignment
by: Levine, Sydney, et al.
Published: (2025)

PKU-SafeRLHF: Towards Multi-Level Safety Alignment for LLMs with Human Preference
by: Ji, Jiaming, et al.
Published: (2024)

Temporal User Profiling with LLMs: Balancing Short-Term and Long-Term Preferences for Recommendations
by: Sabouri, Milad, et al.
Published: (2025)

What explains the success of cross-modal fine-tuning with ORCA?
by: García-de-Herreros, Paloma, et al.
Published: (2024)

Unified Preference Optimization: Language Model Alignment Beyond the Preference Frontier
by: Badrinath, Anirudhan, et al.
Published: (2024)

Understanding User Preferences in Explainable Artificial Intelligence: A Survey and a Mapping Function Proposal
by: Hashemi, Maryam, et al.
Published: (2023)

EcoAlign: An Economically Rational Framework for Efficient LVLM Alignment
by: Cheng, Ruoxi, et al.
Published: (2025)

Direct Alignment with Heterogeneous Preferences
by: Shirali, Ali, et al.
Published: (2025)

SparsePO: Controlling Preference Alignment of LLMs via Sparse Token Masks
by: Christopoulou, Fenia, et al.
Published: (2024)

Who Laughs with Whom? Disentangling Influential Factors in Humor Preferences across User Clusters and LLMs
by: Murakami, Soichiro, et al.
Published: (2026)

Implicit Safety Alignment from Crowd Preferences
by: Lin, Qian, et al.
Published: (2026)

On Diversified Preferences of Large Language Model Alignment
by: Zeng, Dun, et al.
Published: (2023)

The Sign Estimator: LLM Alignment in the Face of Choice Heterogeneity
by: Aouad, Ali, et al.
Published: (2025)