| Main Authors: | Mo, Sicheng, Nguyen, Thao, Zhang, Richard, Kolkin, Nick, Iyer, Siddharth Srinivasan, Shechtman, Eli, Singh, Krishna Kumar, Lee, Yong Jae, Zhou, Bolei, Li, Yuheng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.10954 |
Similar Items
Relational Visual Similarity
by: Nguyen, Thao, et al.
Published: (2025)
X-Fusion: Introducing New Modality to Frozen Large Language Models
by: Mo, Sicheng, et al.
Published: (2025)
SliderSpace: Decomposing the Visual Capabilities of Diffusion Models
by: Gandikota, Rohit, et al.
Published: (2025)
TurboEdit: Instant text-based image editing
by: Wu, Zongze, et al.
Published: (2024)
Removing Distributional Discrepancies in Captions Improves Image-Text Alignment
by: Li, Yuheng, et al.
Published: (2024)
YoChameleon: Personalized Vision and Language Generation
by: Nguyen, Thao, et al.
Published: (2025)
Edit One for All: Interactive Batch Image Editing
by: Nguyen, Thao, et al.
Published: (2024)
Learning an Image Editing Model without Image Editing Pairs
by: Kumari, Nupur, et al.
Published: (2025)
Image Neural Field Diffusion Models
by: Chen, Yinbo, et al.
Published: (2024)
Yo'LLaVA: Your Personalized Language and Vision Assistant
by: Nguyen, Thao, et al.
Published: (2024)
Self-Evaluation Unlocks Any-Step Text-to-Image Generation
by: Yu, Xin, et al.
Published: (2025)
Customizing Text-to-Image Diffusion with Object Viewpoint Control
by: Kumari, Nupur, et al.
Published: (2024)
Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance
by: Lin, Kuan Heng, et al.
Published: (2024)
Improved Baselines with Representation Autoencoders
by: Singh, Jaskirat, et al.
Published: (2026)
Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion
by: Huang, Xun, et al.
Published: (2025)
One-step Diffusion with Distribution Matching Distillation
by: Yin, Tianwei, et al.
Published: (2023)
Lazy Diffusion Transformer for Interactive Image Editing
by: Nitzan, Yotam, et al.
Published: (2024)
From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
by: Yin, Tianwei, et al.
Published: (2024)
Dreamland: Controllable World Creation with Simulator and Generative Models
by: Mo, Sicheng, et al.
Published: (2025)
Jump Cut Smoothing for Talking Heads
by: Wang, Xiaojuan, et al.
Published: (2024)
Lifting for Arbitrary Gadgets
by: Iyer, Siddharth
Published: (2025)
Gaps between quadratic forms
by: Iyer, Siddharth
Published: (2025)
Cubic Polynomials and Sums of Two Squares
by: Iyer, Siddharth
Published: (2025)
Distribution of sums of square roots modulo $1$
by: Iyer, Siddharth
Published: (2024)
Rational approximation with digit-restricted denominators
by: Iyer, Siddharth
Published: (2023)
Distribution of $θ-$powers and their sums
by: Iyer, Siddharth
Published: (2025)
On the Digits of Partition Functions
by: Iyer, Siddharth
Published: (2026)
Causality in Video Diffusers is Separable from Denoising
by: Bai, Xingjian, et al.
Published: (2026)
Personal Visual Memory from Explicit and Implicit Evidence
by: Nguyen, Viet, et al.
Published: (2026)
GD doesn't make the cut: Three ways that non-differentiability affects neural network training
by: Kumar, Siddharth Krishna
Published: (2024)
What matters for Representation Alignment: Global Information or Spatial Structure?
by: Singh, Jaskirat, et al.
Published: (2025)
From Plans to Pixels: Learning to Plan and Orchestrate for Open-Ended Image Editing
by: Rajan, Anirudh Sundara, et al.
Published: (2026)
SimGen: Simulator-conditioned Driving Scene Generation
by: Zhou, Yunsong, et al.
Published: (2024)
Distilling Diffusion Models into Conditional GANs
by: Kang, Minguk, et al.
Published: (2024)
NewMove: Customizing text-to-video models with novel motions
by: Materzynska, Joanna, et al.
Published: (2023)
Sampling plans for pre-packed fish and fish products
by: Krishna Iyer, H., et al.
Published: (1984)
GroupDiff: Diffusion-based Group Portrait Editing
by: Jiang, Yuming, et al.
Published: (2024)
Detection of lumpy skin disease virus reads in the human upper respiratory tract microbiome requires further investigation
by: Siddharth Singh Tomar, et al.
Published: (2024)
SPLD polynomial optimization and bounded degree SOS hierarchies
by: Jiao, Liguo, et al.
Published: (2025)
MoGAN: Improving Motion Quality in Video Diffusion via Few-Step Motion Adversarial Post-Training
by: Xue, Haotian, et al.
Published: (2025)