Library Catalog

Bibliographic Details
Main Authors:	Gwon, Hansle; Ahn, Imjin; Kim, Young-Hak; Park, Sanghyun; Jun, Tae Joon
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2412.07812

Similar Items

NOTE: Notable generation Of patient Text summaries through Efficient approach based on direct preference optimization
by: Ahn, Imjin, et al.
Published: (2024)

InMD-X: Large Language Models for Internal Medicine Doctors
by: Gwon, Hansle, et al.
Published: (2024)

Enhancing Clinical Efficiency through LLM: Discharge Note Generation for Cardiac Patients
by: Jung, HyoJe, et al.
Published: (2024)

DiaTool-DPO: Multi-Turn Direct Preference Optimization for Tool-Augmented Large Language Models
by: Jung, Sunghee, et al.
Published: (2025)

Are You Sure? Rank Them Again: Repeated Ranking For Better Preference Datasets
by: Devine, Peter
Published: (2024)

LiPO: Listwise Preference Optimization through Learning-to-Rank
by: Liu, Tianqi, et al.
Published: (2024)

VPO: Leveraging the Number of Votes in Preference Optimization
by: Cho, Jae Hyeon, et al.
Published: (2024)

DiscoverLLM: From Executing Intents to Discovering Them
by: Kim, Tae Soo, et al.
Published: (2026)

Preference Optimization with Multi-Sample Comparisons
by: Wang, Chaoqi, et al.
Published: (2024)

Disentangling Length from Quality in Direct Preference Optimization
by: Park, Ryan, et al.
Published: (2024)

Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment
by: Yin, Yueqin, et al.
Published: (2024)

Improving Socratic Question Generation using Data Augmentation and Preference Optimization
by: Kumar, Nischal Ashok, et al.
Published: (2024)

When Do "More Contexts" Help with Sarcasm Recognition?
by: Nimase, Ojas, et al.
Published: (2024)

Length Desensitization in Direct Preference Optimization
by: Liu, Wei, et al.
Published: (2024)

BAPO: Base-Anchored Preference Optimization for Overcoming Forgetting in Large Language Models Personalization
by: Lee, Gihun, et al.
Published: (2024)

Preference Learning Algorithms Do Not Learn Preference Rankings
by: Chen, Angelica, et al.
Published: (2024)

RankPO: Preference Optimization for Job-Talent Matching
by: Zhang, Yafei, et al.
Published: (2025)

Direct Multi-Turn Preference Optimization for Language Agents
by: Shi, Wentao, et al.
Published: (2024)

Multi-Reference Preference Optimization for Large Language Models
by: Le, Hung, et al.
Published: (2024)

AMPO: Active Multi-Preference Optimization for Self-play Preference Selection
by: Gupta, Taneesh, et al.
Published: (2025)

Ruling Out to Rule In: Contrastive Hypothesis Retrieval for Medical Question Answering
by: Kim, Byeolhee, et al.
Published: (2026)

Towards Data-Centric RLHF: Simple Metrics for Preference Dataset Comparison
by: Shen, Judy Hanwen, et al.
Published: (2024)

LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions
by: Agostinelli, Victor, et al.
Published: (2024)

References Indeed Matter? Reference-Free Preference Optimization for Conversational Query Reformulation
by: Kim, Doyoung, et al.
Published: (2025)

Topic-VQ-VAE: Leveraging Latent Codebooks for Flexible Topic-Guided Document Generation
by: Yoo, YoungJoon, et al.
Published: (2023)

FlowBot: Inducing LLM Workflows with Bilevel Optimization and Textual Gradients
by: Yu, Hongyeon, et al.
Published: (2026)

DCRM: A Heuristic to Measure Response Pair Quality in Preference Optimization
by: Huang, Chengyu, et al.
Published: (2025)

A Comprehensive Survey of Direct Preference Optimization: Datasets, Theories, Variants, and Applications
by: Xiao, Wenyi, et al.
Published: (2024)

Preference Optimization by Estimating the Ratio of the Data Distribution
by: Kim, Yeongmin, et al.
Published: (2025)

On the Role of Preference Variance in Preference Optimization
by: Guo, Jiacheng, et al.
Published: (2025)

PORT: Preference Optimization on Reasoning Traces
by: Lahlou, Salem, et al.
Published: (2024)

LifeAlign: Lifelong Alignment for Large Language Models with Memory-Augmented Focalized Preference Optimization
by: Li, Junsong, et al.
Published: (2025)

Mitigating Adversarial Attacks in LLMs through Defensive Suffix Generation
by: Kim, Minkyoung, et al.
Published: (2024)

Adaptive Selection of LoRA Components in Privacy-Preserving Federated Learning
by: Kim, Myoungjun, et al.
Published: (2026)

Unsupervised Text Embedding Space Generation Using Generative Adversarial Networks for Text Synthesis
by: Lee, Jun-Min, et al.
Published: (2023)

OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset
by: Park, Jeongkyun, et al.
Published: (2023)

Understanding Reference Policies in Direct Preference Optimization
by: Liu, Yixin, et al.
Published: (2024)

Accelerating Direct Preference Optimization with Prefix Sharing
by: Wang, Franklin, et al.
Published: (2024)

FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acceleration
by: Jo, Dongwon, et al.
Published: (2025)

Importance Sampling for Multi-Negative Multimodal Direct Preference Optimization
by: Li, Xintong, et al.
Published: (2025)