| Main Authors: | Ahn, Daechul, Choi, Yura, Kim, San, Yu, Youngjae, Kang, Dongyeop, Choi, Jonghyun |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.11280 |
Similar Items
Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback
by: Ahn, Daechul, et al.
Published: (2024)
What Happens When: Learning Temporal Orders of Events in Videos
by: Ahn, Daechul, et al.
Published: (2025)
AVC-DPO: Aligned Video Captioning via Direct Preference Optimization
by: Tang, Jiyang, et al.
Published: (2025)
PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation
by: Huang, Qihan, et al.
Published: (2024)
VL-DPO: Vision-Language-Guided Finetuning for Preference-Aligned Autonomous Driving
by: Xu, Zhefan, et al.
Published: (2026)
Becoming Experienced Judges: Selective Test-Time Learning for Evaluators
by: Jwa, Seungyeon, et al.
Published: (2025)
DriveDPO: Policy Learning via Safety DPO For End-to-End Autonomous Driving
by: Shang, Shuyao, et al.
Published: (2025)
mDPO: Conditional Preference Optimization for Multimodal Large Language Models
by: Wang, Fei, et al.
Published: (2024)
Attention Misses Visual Risk: Risk-Adaptive Steering for Multimodal Safety Alignment
by: Park, Jonghyun, et al.
Published: (2025)
When Preferences Diverge: Aligning Diffusion Models with Minority-Aware Adaptive DPO
by: Zhang, Lingfan, et al.
Published: (2025)
VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models
by: Huang, Haojian, et al.
Published: (2025)
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization
by: Jia, Hongrui, et al.
Published: (2024)
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
by: Wu, Ziyi, et al.
Published: (2025)
$ϕ$-DPO: Fairness Direct Preference Optimization Approach to Continual Learning in Large Multimodal Models
by: Truong, Thanh-Dat, et al.
Published: (2026)
SEE-DPO: Self Entropy Enhanced Direct Preference Optimization
by: Shekhar, Shivanshu, et al.
Published: (2024)
VideoDPO: Omni-Preference Alignment for Video Diffusion Generation
by: Liu, Runtao, et al.
Published: (2024)
DeDPO: Debiased Direct Preference Optimization for Diffusion Models
by: Pham, Khiem, et al.
Published: (2026)
Society of Mind Meets Real-Time Strategy: A Hierarchical Multi-Agent Framework for Strategic Reasoning
by: Ahn, Daechul, et al.
Published: (2025)
RealDPO: Real or Not Real, that is the Preference
by: Cheng, Guo, et al.
Published: (2025)
Multi-Level Knowledge Distillation and Dynamic Self-Supervised Learning for Continual Learning
by: Kim, Taeheon, et al.
Published: (2025)
Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key
by: Yang, Zhihe, et al.
Published: (2025)
BideDPO: Conditional Image Generation with Simultaneous Text and Condition Alignment
by: Zhou, Dewei, et al.
Published: (2025)
BalancedDPO: Adaptive Multi-Metric Alignment
by: Tamboli, Dipesh, et al.
Published: (2025)
SyncDPO: Enhancing Temporal Synchronization in Video-Audio Joint Generation via Preference Learning
by: Cheng, Xin, et al.
Published: (2026)
Inversion-DPO: Precise and Efficient Post-Training for Diffusion Models
by: Li, Zejian, et al.
Published: (2025)
S2H-DPO: Hardness-Aware Preference Optimization for Vision-Language Models
by: Shukla, Nitish, et al.
Published: (2026)
Do Understanding and Generation Fight? A Diagnostic Study of DPO for Unified Multimodal Models
by: Rao, Abinav, et al.
Published: (2026)
V.I.P. : Iterative Online Preference Distillation for Efficient Video Diffusion Models
by: Kim, Jisoo, et al.
Published: (2025)
PhysCorr: Dual-Reward DPO for Physics-Constrained Text-to-Video Generation with Automated Preference Selection
by: Wang, Peiyao, et al.
Published: (2025)
HuViDPO:Enhancing Video Generation through Direct Preference Optimization for Human-Centric Alignment
by: Jiang, Lifan, et al.
Published: (2025)
Region-Normalized DPO for Medical Image Segmentation under Noisy Judges
by: Kalisch, Hamza, et al.
Published: (2026)
CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs
by: Ouali, Yassine, et al.
Published: (2024)
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
by: Liu, Ziyu, et al.
Published: (2024)
Reg-DPO: SFT-Regularized Direct Preference Optimization with GT-Pair for Improving Video Generation
by: Du, Jie, et al.
Published: (2025)
DPO Learning with LLMs-Judge Signal for Computer Use Agents
by: Luo, Man, et al.
Published: (2025)
HII-DPO: Eliminate Hallucination via Accurate Hallucination-Inducing Counterfactual Images
by: Yang, Yilin, et al.
Published: (2026)
DreamBoothDPO: Improving Personalized Generation using Direct Preference Optimization
by: Ayupov, Shamil, et al.
Published: (2025)
SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization
by: Ahn, Young Jin, et al.
Published: (2024)
DEVIAS: Learning Disentangled Video Representations of Action and Scene
by: Bae, Kyungho, et al.
Published: (2023)
HIPPO-Video: Simulating Watch Histories with Large Language Models for Personalized Video Highlighting
by: Lee, Jeongeun, et al.
Published: (2025)