| Main Authors: | Hu, Xiangkun; He, Tong; Wipf, David |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2407.09072 |
Similar Items
Explicit Preference Optimization: No Need for an Implicit Reward Model
by: Hu, Xiangkun, et al.
Published: (2025)
Desiderata for the Context Use of Question Answering Systems
by: Shaier, Sagi, et al.
Published: (2024)
Not All Subjectivity Is the Same! Defining Desiderata for the Evaluation of Subjectivity in NLP
by: Khurana, Urja, et al.
Published: (2026)
Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification
by: Gunjal, Anisha, et al.
Published: (2024)
Direct Judgement Preference Optimization
by: Wang, Peifeng, et al.
Published: (2024)
BPO: Revisiting Preference Modeling in Direct Preference Optimization
by: Sun, Lin, et al.
Published: (2025)
Importance Sampling for Multi-Negative Multimodal Direct Preference Optimization
by: Li, Xintong, et al.
Published: (2025)
On Extending Direct Preference Optimization to Accommodate Ties
by: Chen, Jinghong, et al.
Published: (2024)
Token-weighted Direct Preference Optimization with Attention
by: Huang, Chengyu, et al.
Published: (2026)
Length Desensitization in Direct Preference Optimization
by: Liu, Wei, et al.
Published: (2024)
Token-level Direct Preference Optimization
by: Zeng, Yongcheng, et al.
Published: (2024)
Filtered Direct Preference Optimization
by: Morimura, Tetsuro, et al.
Published: (2024)
Direct Preference Optimization with an Offset
by: Amini, Afra, et al.
Published: (2024)
DPO-Shift: Shifting the Distribution of Direct Preference Optimization
by: Yang, Xiliang, et al.
Published: (2025)
On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization
by: Lin, Yong, et al.
Published: (2024)
Ambiguity Awareness Optimization: Towards Semantic Disambiguation for Direct Preference Optimization
by: Li, Jian, et al.
Published: (2025)
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
by: Zhao, Zhiyuan, et al.
Published: (2023)
Understanding Reference Policies in Direct Preference Optimization
by: Liu, Yixin, et al.
Published: (2024)
Accelerating Direct Preference Optimization with Prefix Sharing
by: Wang, Franklin, et al.
Published: (2024)
Entropy Controllable Direct Preference Optimization
by: Omura, Motoki, et al.
Published: (2024)
Orthogonal Finetuning for Direct Preference Optimization
by: Yang, Chenxu, et al.
Published: (2024)
Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence
by: Lu, Junru, et al.
Published: (2024)
AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization
by: Wu, Junkang, et al.
Published: (2024)
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
by: Jiang, Yuxin, et al.
Published: (2024)
PHOENIX: Open-Source Language Adaption for Direct Preference Optimization
by: Uhlig, Matthias, et al.
Published: (2024)
Backtranslation Augmented Direct Preference Optimization for Neural Machine Translation
by: Ghassabi, Mehrdad, et al.
Published: (2026)
DGPO: Beyond Pairwise Preferences with Directional Consistent Groupwise Optimization
by: Deng, Mengyi, et al.
Published: (2026)
Is On-Policy Data always the Best Choice for Direct Preference Optimization-based LM Alignment?
by: Sun, Zetian, et al.
Published: (2025)
Direct Preference Optimization for English-Mandarin Code-Switching Speech Recognition in Audio LLMs
by: Quang, Trung Nguyen, et al.
Published: (2026)
FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings
by: Liu, Tong, et al.
Published: (2025)
FedPDPO: Federated Personalized Direct Preference Optimization for Large Language Model Alignment
by: Zhu, Kewen, et al.
Published: (2026)
Direct Multi-Turn Preference Optimization for Language Agents
by: Shi, Wentao, et al.
Published: (2024)
The Crucial Role of Samplers in Online Direct Preference Optimization
by: Shi, Ruizhe, et al.
Published: (2024)
Disentangling Length from Quality in Direct Preference Optimization
by: Park, Ryan, et al.
Published: (2024)
2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision
by: Li, Shilong, et al.
Published: (2024)
No Preference Left Behind: Group Distributional Preference Optimization
by: Yao, Binwei, et al.
Published: (2024)
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
by: Wang, Tianduo, et al.
Published: (2024)
GroupDPO: Memory efficient Group-wise Direct Preference Optimization
by: Leng, Jixuan, et al.
Published: (2026)
Iterative Reasoning Preference Optimization
by: Pang, Richard Yuanzhe, et al.
Published: (2024)
Random Direct Preference Optimization for Radiography Report Generation
by: Samokhin, Valentin, et al.
Published: (2025)