| Main Authors: | Li, Jian, Yin, Shenglin, Zhang, Yujia, Zhao, Alan, Chen, Xi, Zhou, Xiaohui, Xu, Pengfei |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.23391 |
Similar Items
Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness
by: Li, Jian, et al.
Published: (2024)
Micro-Macro Retrieval: Reducing Long-Form Hallucination in Large Language Models
by: Feng, Yujie, et al.
Published: (2026)
Direct Judgement Preference Optimization
by: Wang, Peifeng, et al.
Published: (2024)
Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment
by: Yin, Yueqin, et al.
Published: (2024)
Understanding Reference Policies in Direct Preference Optimization
by: Liu, Yixin, et al.
Published: (2024)
Online DPO: Online Direct Preference Optimization with Fast-Slow Chasing
by: Qi, Biqing, et al.
Published: (2024)
Disambiguate First, Parse Later: Generating Interpretations for Ambiguity Resolution in Semantic Parsing
by: Saparina, Irina, et al.
Published: (2025)
Diffusion-RPO: Aligning Diffusion Models through Relative Preference Optimization
by: Gu, Yi, et al.
Published: (2024)
DPO-Shift: Shifting the Distribution of Direct Preference Optimization
by: Yang, Xiliang, et al.
Published: (2025)
A Survey on Lexical Ambiguity Detection and Word Sense Disambiguation
by: Abeysiriwardana, Miuru, et al.
Published: (2024)
Orthogonal Finetuning for Direct Preference Optimization
by: Yang, Chenxu, et al.
Published: (2024)
BPO: Revisiting Preference Modeling in Direct Preference Optimization
by: Sun, Lin, et al.
Published: (2025)
AIMMerging: Adaptive Iterative Model Merging Using Training Trajectories for Language Model Continual Learning
by: Feng, Yujie, et al.
Published: (2025)
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
by: Zhao, Zhiyuan, et al.
Published: (2023)
TGDPO: Harnessing Token-Level Reward Guidance for Enhancing Direct Preference Optimization
by: Zhu, Mingkang, et al.
Published: (2025)
Towards Harmless Multimodal Assistants with Blind Preference Optimization
by: Li, Yongqi, et al.
Published: (2025)
Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence
by: Lu, Junru, et al.
Published: (2024)
Uncertainty-Aware Exploratory Direct Preference Optimization for Multimodal Large Language Models
by: Zhang, Huatian, et al.
Published: (2026)
Length Desensitization in Direct Preference Optimization
by: Liu, Wei, et al.
Published: (2024)
On Extending Direct Preference Optimization to Accommodate Ties
by: Chen, Jinghong, et al.
Published: (2024)
Token-weighted Direct Preference Optimization with Attention
by: Huang, Chengyu, et al.
Published: (2026)
Token-level Direct Preference Optimization
by: Zeng, Yongcheng, et al.
Published: (2024)
New Desiderata for Direct Preference Optimization
by: Hu, Xiangkun, et al.
Published: (2024)
Quantum Visual Word Sense Disambiguation: Unraveling Ambiguities Through Quantum Inference Model
by: Qiao, Wenbo, et al.
Published: (2025)
The Crucial Role of Samplers in Online Direct Preference Optimization
by: Shi, Ruizhe, et al.
Published: (2024)
Debiasing Multimodal Large Language Models via Noise-Aware Preference Optimization
by: Zhang, Zefeng, et al.
Published: (2025)
Uncovering the Impact of Chain-of-Thought Reasoning for Direct Preference Optimization: Lessons from Text-to-SQL
by: Liu, Hanbing, et al.
Published: (2025)
VERI-DPO: Evidence-Aware Alignment for Clinical Summarization via Claim Verification and Direct Preference Optimization
by: Liu, Weixin, et al.
Published: (2026)
DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization
by: Das, Amitava, et al.
Published: (2025)
Filtered Direct Preference Optimization
by: Morimura, Tetsuro, et al.
Published: (2024)
Direct Preference Optimization with an Offset
by: Amini, Afra, et al.
Published: (2024)
SDPO: Segment-Level Direct Preference Optimization for Social Agents
by: Kong, Aobo, et al.
Published: (2025)
BPO: Towards Balanced Preference Optimization between Knowledge Breadth and Depth in Alignment
by: Wang, Sizhe, et al.
Published: (2024)
Earlier Tokens Contribute More: Learning Direct Preference Optimization From Temporal Decay Perspective
by: Shao, Ruichen, et al.
Published: (2025)
DGPO: Beyond Pairwise Preferences with Directional Consistent Groupwise Optimization
by: Deng, Mengyi, et al.
Published: (2026)
A Comprehensive Survey of Direct Preference Optimization: Datasets, Theories, Variants, and Applications
by: Xiao, Wenyi, et al.
Published: (2024)
Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
by: Wu, Junkang, et al.
Published: (2024)
RankPO: Preference Optimization for Job-Talent Matching
by: Zhang, Yafei, et al.
Published: (2025)
DEPO: Dual-Efficiency Preference Optimization for LLM Agents
by: Chen, Sirui, et al.
Published: (2025)
Bridging Lexical Ambiguity and Vision: A Mini Review on Visual Word Sense Disambiguation
by: Nilukshi, Shashini, et al.
Published: (2026)