:: Library Catalog

Bibliographic Details
Main Authors:	Wu, Haoyu, Karumuri, Meher Gitika, Zou, Chuhang, Bang, Seungbae, Li, Yuelong, Samaras, Dimitris, Hadap, Sunil
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2411.10947

Similar Items

MRC-Net: 6-DoF Pose Estimation with MultiScale Residual Correlation
by: Li, Yuelong, et al.
Published: (2024)

Importance-Based Token Merging for Efficient Image and Video Generation
by: Wu, Haoyu, et al.
Published: (2024)

Learning 3D Reconstruction with Priors in Test Time
by: Zhou, Lei, et al.
Published: (2026)

One Attention, One Scale: Phase-Aligned Rotary Positional Embeddings for Mixed-Resolution Diffusion Transformer
by: Wu, Haoyu, et al.
Published: (2025)

MVGBench: Comprehensive Benchmark for Multi-view Generation Models
by: Xie, Xianghui, et al.
Published: (2025)

Self-supervised co-salient object detection via feature correspondence at multiple scales
by: Chakraborty, Souradeep, et al.
Published: (2024)

MI-NeRF: Learning a Single Face NeRF from Multiple Identities
by: Chatziagapi, Aggelina, et al.
Published: (2024)

Assessing Sample Quality via the Latent Space of Generative Models
by: Xu, Jingyi, et al.
Published: (2024)

Learning Relighting and Intrinsic Decomposition in Neural Radiance Fields
by: Yang, Yixiong, et al.
Published: (2024)

MLI-NeRF: Multi-Light Intrinsic-Aware Neural Radiance Fields
by: Yang, Yixiong, et al.
Published: (2024)

GriDiT: Factorized Grid-Based Diffusion for Efficient Long Image Sequence Generation
by: Tomar, Snehal Singh, et al.
Published: (2025)

JEAN: Joint Expression and Audio-guided NeRF-based Talking Face Generation
by: Chakkera, Sai Tanmay Reddy, et al.
Published: (2024)

TopoDiffusionNet: A Topology-aware Diffusion Model
by: Gupta, Saumya, et al.
Published: (2024)

Weighting Pseudo-Labels via High-Activation Feature Index Similarity and Object Detection for Semi-Supervised Segmentation
by: Howlader, Prantik, et al.
Published: (2024)

Direct May Not Be the Best: An Incremental Evolution View of Pose Generation
by: Li, Yuelong, et al.
Published: (2024)

Talking Head Generation via AU-Guided Landmark Prediction
by: Chang, Shao-Yu, et al.
Published: (2025)

Scalable and Realistic Virtual Try-on Application for Foundation Makeup with Kubelka-Munk Theory
by: Pang, Hui, et al.
Published: (2025)

MIGS: Multi-Identity Gaussian Splatting via Tensor Decomposition
by: Chatziagapi, Aggelina, et al.
Published: (2024)

Embedding Physical Reasoning into Diffusion-Based Shadow Generation
by: Hu, Shilin, et al.
Published: (2025)

Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos
by: Rivero, Alfredo, et al.
Published: (2024)

Fast constrained sampling in pre-trained diffusion models
by: Graikos, Alexandros, et al.
Published: (2024)

Language-driven Description Generation and Common Sense Reasoning for Video Action Recognition
by: Hu, Xiaodan, et al.
Published: (2025)

Fast SAM 3D Body: Accelerating SAM 3D Body for Real-Time Full-Body Human Mesh Recovery
by: Yang, Timing, et al.
Published: (2026)

PhyMix: Towards Physically Consistent Single-Image 3D Indoor Scene Generation with Implicit–Explicit Optimization
by: Wu, Dongli, et al.
Published: (2026)

Beyond Pixels: Semi-Supervised Semantic Segmentation with a Multi-scale Patch-based Multi-Label Classifier
by: Howlader, Prantik, et al.
Published: (2024)

MonoPatchNeRF: Improving Neural Radiance Fields with Patch-based Monocular Guidance
by: Wu, Yuqun, et al.
Published: (2024)

Plenoptic PNG: Real-Time Neural Radiance Fields in 150 KB
by: Lee, Jae Yong, et al.
Published: (2024)

What about gravity in video generation? Post-Training Newton's Laws with Verifiable Rewards
by: Le, Minh-Quan, et al.
Published: (2025)

PathSegDiff: Pathology Segmentation using Diffusion model representations
by: Danisetty, Sachin Kumar, et al.
Published: (2025)

$\infty$-Brush: Controllable Large Image Synthesis with Diffusion Models in Infinite Dimensions
by: Le, Minh-Quan, et al.
Published: (2024)

Hummingbird: High Fidelity Image Generation via Multimodal Context Alignment
by: Le, Minh-Quan, et al.
Published: (2025)

Phrase-Instance Alignment for Generalized Referring Segmentation
by: Nguyen, E-Ro, et al.
Published: (2024)

LBMamba: Locally Bi-directional Mamba
by: Zhang, Jingwei, et al.
Published: (2025)

Cast and Attached Shadow Detection via Iterative Light and Geometry Reasoning
by: Hu, Shilin, et al.
Published: (2025)

ESGaussianFace: Emotional and Stylized Audio-Driven Facial Animation via 3D Gaussian Splatting
by: Ma, Chuhang, et al.
Published: (2026)

Personalized Image Descriptions from Attention Sequences
by: Xue, Ruoyu, et al.
Published: (2025)

AS-Bridge: A Bidirectional Generative Framework Bridging Next-Generation Astronomical Surveys
by: Zhang, Dichang, et al.
Published: (2026)

Shadow Removal Refinement via Material-Consistent Shadow Edges
by: Hu, Shilin, et al.
Published: (2024)

Generating metamers of human scene understanding
by: Raina, Ritik, et al.
Published: (2026)

2DMamba: Efficient State Space Model for Image Representation with Applications on Giga-Pixel Whole Slide Image Classification
by: Zhang, Jingwei, et al.
Published: (2024)