| Main Authors: | Liu, Tong; Yu, Xiao; Zhou, Wenxuan; Gu, Jindong; Tresp, Volker |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2501.06645 |
Similar Items
RankPO: Preference Optimization for Job-Talent Matching
by: Zhang, Yafei, et al.
Published: (2025)
Preference Ranking Optimization for Human Alignment
by: Song, Feifan, et al.
Published: (2023)
WPO: Enhancing RLHF with Weighted Preference Optimization
by: Zhou, Wenxuan, et al.
Published: (2024)
PerPO: Perceptual Preference Optimization via Discriminative Rewarding
by: Zhu, Zining, et al.
Published: (2025)
T-REG: Preference Optimization with Token-Level Reward Regularization
by: Zhou, Wenxuan, et al.
Published: (2024)
Teaching Your Models to Understand Code via Focal Preference Alignment
by: Wu, Jie, et al.
Published: (2025)
LifeAlign: Lifelong Alignment for Large Language Models with Memory-Augmented Focalized Preference Optimization
by: Li, Junsong, et al.
Published: (2025)
KnowPO: Knowledge-aware Preference Optimization for Controllable Knowledge Selection in Retrieval-Augmented Language Models
by: Zhang, Ruizhe, et al.
Published: (2024)
LRHP: Learning Representations for Human Preferences via Preference Pairs
by: Wang, Chenglong, et al.
Published: (2024)
Geometric-Averaged Preference Optimization for Soft Preference Labels
by: Furuta, Hiroki, et al.
Published: (2024)
Course-Correction: Safety Alignment Using Synthetic Preferences
by: Xu, Rongwu, et al.
Published: (2024)
Preference Learning Algorithms Do Not Learn Preference Rankings
by: Chen, Angelica, et al.
Published: (2024)
SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization
by: Sun, Huashan, et al.
Published: (2025)
Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness
by: Li, Jian, et al.
Published: (2024)
Visual Question Decomposition on Multimodal Large Language Models
by: Zhang, Haowei, et al.
Published: (2024)
Multiplayer Nash Preference Optimization
by: Wu, Fang, et al.
Published: (2025)
Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models
by: Zhang, Wenxuan, et al.
Published: (2024)
Stable Preference Optimization: A Bilevel Approach to Catastrophic Preference Shift
by: Jian, Chengtao, et al.
Published: (2025)
Enhancing Multilingual Counterfactual Generation through Alignment-as-Preference Optimization
by: Wang, Yilong, et al.
Published: (2026)
Decoupling Strategy and Execution in Task-Focused Dialogue via Goal-Oriented Preference Optimization
by: Xu, Jingyi, et al.
Published: (2026)
ComPO: Preference Alignment via Comparison Oracles
by: Chen, Peter, et al.
Published: (2025)
Orthogonal Finetuning for Direct Preference Optimization
by: Yang, Chenxu, et al.
Published: (2024)
Creative Preference Optimization
by: Ismayilzada, Mete, et al.
Published: (2025)
On the Role of Preference Variance in Preference Optimization
by: Guo, Jiacheng, et al.
Published: (2025)
ConfPO: Exploiting Policy Model Confidence for Critical Token Selection in Preference Optimization
by: Yoon, Hee Suk, et al.
Published: (2025)
Personalizing LLMs with Binary Feedback: A Preference-Corrected Optimization Framework
by: Ma, Xilai, et al.
Published: (2026)
FedPop: Federated Population-based Hyperparameter Tuning
by: Chen, Haokun, et al.
Published: (2023)
CAPO: Confidence Aware Preference Optimization Learning for Multilingual Preferences
by: Pokharel, Rhitabrat, et al.
Published: (2025)
Preference Packing: Efficient Preference Optimization for Large Language Models
by: Cho, Jaekyung
Published: (2026)
Does Machine Unlearning Truly Remove Knowledge?
by: Chen, Haokun, et al.
Published: (2025)
Capturing Nuanced Preferences: Preference-Aligned Distillation for Small Language Models
by: Gu, Yanggan, et al.
Published: (2025)
Token-level Direct Preference Optimization
by: Zeng, Yongcheng, et al.
Published: (2024)
Weights-Rotated Preference Optimization for Large Language Models
by: Yang, Chenxu, et al.
Published: (2025)
Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts
by: Yin, Yueqin, et al.
Published: (2024)
Accelerated Preference Optimization for Large Language Model Alignment
by: He, Jiafan, et al.
Published: (2024)
Provably Better Explanations with Optimized Aggregation of Feature Attributions
by: Decker, Thomas, et al.
Published: (2024)
Curry-DPO: Enhancing Alignment using Curriculum Learning & Ranked Preferences
by: Pattnaik, Pulkit, et al.
Published: (2024)
Iterative Reasoning Preference Optimization
by: Pang, Richard Yuanzhe, et al.
Published: (2024)
Chunks as Arms: Multi-Armed Bandit-Guided Sampling for Long-Context LLM Preference Optimization
by: Duan, Shaohua, et al.
Published: (2025)
mDPO: Conditional Preference Optimization for Multimodal Large Language Models
by: Wang, Fei, et al.
Published: (2024)