| Main Authors: | Li, Shufan, Kallidromitis, Konstantinos, Gokul, Akash, Kato, Yusuke, Kozuka, Kazuki |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2404.04465 |
Similar Items
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
by: Li, Shufan, et al.
Published: (2025)
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
by: Li, Shufan, et al.
Published: (2024)
LaViDa: A Large Diffusion Language Model for Multimodal Understanding
by: Li, Shufan, et al.
Published: (2025)
Guidance Contrastive Token Credit Assignment for Discrete Policy Optimization
by: Li, Shufan, et al.
Published: (2026)
SegLLM: Multi-round Reasoning Segmentation
by: Wang, XuDong, et al.
Published: (2024)
MobileWorldBench: Towards Semantic World Modeling For Mobile Agents
by: Li, Shufan, et al.
Published: (2025)
Wild2Avatar: Rendering Humans Behind Occlusions
by: Xiang, Tiange, et al.
Published: (2023)
PopAlign: Population-Level Alignment for Fair Text-to-Image Generation
by: Li, Shufan, et al.
Published: (2024)
BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models
by: Purushwalkam, Senthil, et al.
Published: (2024)
UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation
by: Patel, Chaitanya, et al.
Published: (2025)
Calibrated Multi-Preference Optimization for Aligning Diffusion Models
by: Lee, Kyungmin, et al.
Published: (2025)
Hyperbolic Active Learning for Semantic Segmentation under Domain Shift
by: Franco, Luca, et al.
Published: (2023)
PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference
by: Liu, Kendong, et al.
Published: (2024)
Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models
by: Wang, Fu-Yun, et al.
Published: (2025)
PC-Diffusion: Aligning Diffusion Models with Human Preferences via Preference Classifier
by: Wang, Shaomeng, et al.
Published: (2025)
Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences
by: Lu, Yunhong, et al.
Published: (2025)
Diagnosing Vision Language Models' Perception by Leveraging Human Methods for Color Vision Deficiencies
by: Hayashi, Kazuki, et al.
Published: (2025)
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data
by: Li, Shufan, et al.
Published: (2024)
ADHMR: Aligning Diffusion-based Human Mesh Recovery via Direct Preference Optimization
by: Shen, Wenhao, et al.
Published: (2025)
Sparse-LaViDa: Sparse Multimodal Discrete Diffusion Language Models
by: Li, Shufan, et al.
Published: (2025)
Lavida-O: Elastic Large Masked Diffusion Models for Unified Multimodal Understanding and Generation
by: Li, Shufan, et al.
Published: (2025)
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
by: Hong, Jiwoo, et al.
Published: (2024)
MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences
by: Wang, Weitao, et al.
Published: (2024)
ImageReFL: Balancing Quality and Diversity in Human-Aligned Diffusion Models
by: Sorokin, Dmitrii, et al.
Published: (2025)
AlignTok: Aligning Visual Foundation Encoders to Tokenizers for Diffusion Models
by: Chen, Bowei, et al.
Published: (2025)
ColorizeDiffusion v2: Enhancing Reference-based Sketch Colorization Through Separating Utilities
by: Yan, Dingkun, et al.
Published: (2025)
InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following
by: Li, Shufan, et al.
Published: (2023)
Few-Shot Classification of Interactive Activities of Daily Living (InteractADL)
by: Durante, Zane, et al.
Published: (2024)
Diffusion-RPO: Aligning Diffusion Models through Relative Preference Optimization
by: Gu, Yi, et al.
Published: (2024)
Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens
by: Qin, Yiming, et al.
Published: (2025)
OnlineVPO: Align Video Diffusion Model with Online Video-Centric Preference Optimization
by: Zhang, Jiacheng, et al.
Published: (2024)
D-Fusion: Direct Preference Optimization for Aligning Diffusion Models with Visually Consistent Samples
by: Hu, Zijing, et al.
Published: (2025)
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models
by: Sabour, Amirmojtaba, et al.
Published: (2024)
Towards Artwork Explanation in Large-scale Vision Language Models
by: Hayashi, Kazuki, et al.
Published: (2024)
LaViDa-R1: Advancing Reasoning for Unified Multimodal Diffusion Language Models
by: Li, Shufan, et al.
Published: (2026)
SHARE: Scene-Human Aligned Reconstruction
by: Li, Joshua, et al.
Published: (2025)
DogWeave: High-Fidelity 3D Canine Reconstruction from a Single Image via Normal Fusion and Conditional Inpainting
by: Sun, Shufan, et al.
Published: (2026)
Accelerating Inference of Masked Image Generators via Reinforcement Learning
by: Subbaraman, Pranav, et al.
Published: (2025)
Enhancing Privacy-Utility Trade-offs to Mitigate Memorization in Diffusion Models
by: Chen, Chen, et al.
Published: (2025)
Aligned Contrastive Loss for Long-Tailed Recognition
by: Ma, Jiali, et al.
Published: (2025)