:: Library Catalog

Bibliographic Details
Main Authors:	Zhao, Xiaoming; Colburn, Alex; Ma, Fangchang; Bautista, Miguel Angel; Susskind, Joshua M.; Schwing, Alexander G.
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2310.08587

Similar Items

World-consistent Video Diffusion with Explicit 3D Modeling
by: Zhang, Qihang, et al.
Published: (2024)

Studying Classifier(-Free) Guidance From a Classifier-Centric Perspective
by: Zhao, Xiaoming, et al.
Published: (2025)

GoMAvatar: Efficient Animatable Human Modeling from Monocular Video Using Gaussians-on-Mesh
by: Wen, Jing, et al.
Published: (2024)

CTRLorALTer: Conditional LoRAdapter for Efficient 0-Shot Control & Altering of T2I Models
by: Stracke, Nick, et al.
Published: (2024)

Learning Long-term Motion Embeddings for Efficient Kinematics Generation
by: Stracke, Nick, et al.
Published: (2026)

Scalable Pre-training of Large Autoregressive Image Models
by: El-Nouby, Alaaeldin, et al.
Published: (2024)

3D Shape Tokenization via Latent Flow Matching
by: Chang, Jen-Hao Rick, et al.
Published: (2024)

Adapting Self-Supervised Representations as a Latent Space for Efficient Generation
by: Gui, Ming, et al.
Published: (2025)

NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows
by: Tang, Zhenggang, et al.
Published: (2024)

STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flows
by: Gu, Jiatao, et al.
Published: (2025)

OW-VISCapTor: Abstractors for Open-World Video Instance Segmentation and Captioning
by: Choudhuri, Anwesa, et al.
Published: (2024)

STARFlow2: Bridging Language Models and Normalizing Flows for Unified Multimodal Generation
by: Shen, Ying, et al.
Published: (2026)

VideoSketcher: Video Models Prior Enable Versatile Sequential Sketch Generation
by: Ren, Hui, et al.
Published: (2026)

The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation
by: Cheng, Ho Kei, et al.
Published: (2025)

Variational Rectified Flow Matching
by: Guo, Pengsheng, et al.
Published: (2025)

NoPo-Avatar: Generalizable and Animatable Avatars from Sparse Inputs without Human Poses
by: Wen, Jing, et al.
Published: (2025)

LIFe-GoM: Generalizable Human Rendering with Learned Iterative Feedback Over Multi-Resolution Gaussians-on-Mesh
by: Wen, Jing, et al.
Published: (2025)

SimpliHuMoN: Simplifying Human Motion Prediction
by: Agrawal, Aadya, et al.
Published: (2026)

Decoupling Dynamic Monocular Videos for Dynamic View Synthesis
by: You, Meng, et al.
Published: (2023)

Towards Hierarchical Rectified Flow
by: Zhang, Yichi, et al.
Published: (2025)

Hierarchical Rectified Flow Matching with Mini-Batch Couplings
by: Zhang, Yichi, et al.
Published: (2025)

On Inductive Biases That Enable Generalization of Diffusion Transformers
by: An, Jie, et al.
Published: (2024)

Pixel-Aligned Multi-View Generation with Depth Guided Decoder
by: Tang, Zhenggang, et al.
Published: (2024)

Normalizing Flows are Capable Generative Models
by: Zhai, Shuangfei, et al.
Published: (2024)

Putting the Object Back into Video Object Segmentation
by: Cheng, Ho Kei, et al.
Published: (2023)

STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
by: Gu, Jiatao, et al.
Published: (2025)

BADGR: Bundle Adjustment Diffusion Conditioned by GRadients for Wide-Baseline Floor Plan Reconstruction
by: Li, Yuguang, et al.
Published: (2025)

Diffusion Priors for Dynamic View Synthesis from Monocular Videos
by: Wang, Chaoyang, et al.
Published: (2024)

Dynamic View Synthesis from Small Camera Motion Videos
by: Sun, Huiqiang, et al.
Published: (2025)

Many-to-many Image Generation with Auto-regressive Diffusion Models
by: Shen, Ying, et al.
Published: (2024)

RELOCATE: A Simple Training-Free Baseline for Visual Query Localization Using Region-Based Representations
by: Khosla, Savya, et al.
Published: (2024)

MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
by: Cheng, Ho Kei, et al.
Published: (2024)

Pseudo-View Enhancement via Confidence Fusion for Unposed Sparse-View Reconstruction
by: Zhao, Beizhen, et al.
Published: (2026)

T-REN: Learning Text-Aligned Region Tokens Improves Dense Vision-Language Alignment and Scalability
by: Khosla, Savya, et al.
Published: (2026)

Broadening View Synthesis of Dynamic Scenes from Constrained Monocular Videos
by: Jiang, Le, et al.
Published: (2025)

Dynamic Gaussian Marbles for Novel View Synthesis of Casual Monocular Videos
by: Stearns, Colton, et al.
Published: (2024)

MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds
by: Tang, Zhenggang, et al.
Published: (2024)

Novel View Synthesis as Video Completion
by: Wu, Qi, et al.
Published: (2026)

FVGen: Accelerating Novel-View Synthesis with Adversarial Video Diffusion Distillation
by: Teng, Wenbin, et al.
Published: (2025)

Pseudo Dataset Generation for Out-of-Domain Multi-Camera View Recommendation
by: Lee, Kuan-Ying, et al.
Published: (2024)