| Main Authors: | Habibian, Amirhossein, Ghodrati, Amir, Fathima, Noor, Sautiere, Guillaume, Garrepalli, Risheek, Porikli, Fatih, Petersen, Jens |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2312.08128 |
Similar Items
MoViE: Mobile Diffusion for Video Editing
by: Karjauv, Adil, et al.
Published: (2024)
DDIL: Diversity Enhancing Diffusion Distillation With Imitation Learning
by: Garrepalli, Risheek, et al.
Published: (2024)
Multi-Scale Local Speculative Decoding for Image Generation
by: Peruzzo, Elia, et al.
Published: (2026)
MADI: Masking-Augmented Diffusion with Inference-Time Scaling for Visual Editing
by: Kadambi, Shreya, et al.
Published: (2025)
MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation
by: Yasarla, Rajeev, et al.
Published: (2023)
SciFlow: Empowering Lightweight Optical Flow Models with Self-Cleaning Iterations
by: Lin, Jamie Menjay, et al.
Published: (2024)
OCAI: Improving Optical Flow Estimation by Occlusion and Consistency Aware Interpolation
by: Jeong, Jisoo, et al.
Published: (2024)
Object-Centric Diffusion for Efficient Video Editing
by: Kahatapitiya, Kumara, et al.
Published: (2024)
Neodragon: Mobile Video Generation using Diffusion Transformer
by: Karnewar, Animesh, et al.
Published: (2025)
Mobile Video Diffusion
by: Yahia, Haitam Ben, et al.
Published: (2024)
Distilling Multi-modal Large Language Models for Autonomous Driving
by: Hegde, Deepti, et al.
Published: (2025)
MultiHuman-Testbench: Benchmarking Image Generation for Multiple Humans
by: Borse, Shubhankar, et al.
Published: (2025)
Controllable 3D Placement of Objects with Scene-Aware Diffusion Models
by: Omran, Mohamed, et al.
Published: (2025)
Scene-Aware Location Modeling for Data Augmentation in Automotive Object Detection
by: Petersen, Jens, et al.
Published: (2025)
FutureDepth: Learning to Predict the Future Improves Video Depth Estimation
by: Yasarla, Rajeev, et al.
Published: (2024)
ReHyAt: Recurrent Hybrid Attention for Video Diffusion Transformers
by: Ghafoorian, Mohsen, et al.
Published: (2026)
Attention Surgery: An Efficient Recipe to Linearize Your Video Diffusion Transformer
by: Ghafoorian, Mohsen, et al.
Published: (2025)
RoCA: Robust Cross-Domain End-to-End Autonomous Driving
by: Yasarla, Rajeev, et al.
Published: (2025)
FouRA: Fourier Low Rank Adaptation
by: Borse, Shubhankar, et al.
Published: (2024)
Hybrid Gaussian Splatting for Novel Urban View Synthesis
by: Omran, Mohamed, et al.
Published: (2025)
Gated Relational Alignment via Confidence-based Distillation for Efficient VLMs
by: Chen, Yanlong, et al.
Published: (2026)
Generative Scenario Rollouts for End-to-End Autonomous Driving
by: Yasarla, Rajeev, et al.
Published: (2026)
Enhancing Novel View Synthesis via Geometry Grounded Set Diffusion
by: Zanjani, Farhad G., et al.
Published: (2026)
Gaussian Splatting is an Effective Data Generator for 3D Object Detection
by: Zanjani, Farhad G., et al.
Published: (2025)
Segmentation-Free Guidance for Text-to-Image Diffusion Models
by: Azarian, Kambiz, et al.
Published: (2024)
PyramidalWan: On Making Pretrained Video Model Pyramidal for Efficient Inference
by: Korzhenkov, Denis, et al.
Published: (2026)
Low-Latency Neural Stereo Streaming
by: Hou, Qiqi, et al.
Published: (2024)
HexaGen3D: StableDiffusion is just one step away from Fast and Diverse Text-to-3D Generation
by: Mercier, Antoine, et al.
Published: (2024)
MoAlign: Motion-Centric Representation Alignment for Video Diffusion Models
by: Bhowmik, Aritra, et al.
Published: (2025)
CoReDiT: Spatial Coherence-Guided Token Pruning and Reconstruction for Efficient Diffusion Transformers
by: Li, Zhuojin, et al.
Published: (2026)
CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation
by: Lee, Jungsoo, et al.
Published: (2025)
H3O: Hyper-Efficient 3D Occupancy Prediction with Heterogeneous Supervision
by: Shi, Yunxiao, et al.
Published: (2025)
Imagining the Unseen: Generative Location Modeling for Object Placement
by: Yun, Jooyeol, et al.
Published: (2024)
Resolving the Identity Crisis in Text-to-Image Generation
by: Borse, Shubhankar, et al.
Published: (2025)
DySS: Dynamic Queries and State-Space Learning for Efficient 3D Object Detection from Multi-Camera Videos
by: Yasarla, Rajeev, et al.
Published: (2025)
LoRA-X: Bridging Foundation Models with Training-Free Cross-Model Adaptation
by: Farhadzadeh, Farzad, et al.
Published: (2025)
EdgeRelight360: Text-Conditioned 360-Degree HDR Image Generation for Real-Time On-Device Video Portrait Relighting
by: Lin, Min-Hui, et al.
Published: (2024)
ReDiF: Reinforced Distillation for Few Step Diffusion
by: Tighkhorshid, Amirhossein, et al.
Published: (2025)
ToSA: Token Selective Attention for Efficient Vision Transformers
by: Singh, Manish Kumar, et al.
Published: (2024)
Attention Guided Alignment in Efficient Vision-Language Models
by: Mahajan, Shweta, et al.
Published: (2025)