:: Library Catalog

Bibliographic Details
Main Authors:	Zhao, Hongxiang; Dai, Xili; Wang, Jianan; Tong, Shengbang; Zhang, Jingyuan; Wang, Weida; Zhang, Lei; Ma, Yi
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2403.10953

Similar Items

Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models
by: Chu, Tianzhe, et al.
Published: (2023)

Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model
by: Huang, Yaxuan, et al.
Published: (2025)

GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing
by: Wu, Jing, et al.
Published: (2024)

EmoCtrl: Controllable Emotional Image Content Generation
by: Yang, Jingyuan, et al.
Published: (2025)

ConsistEdit: Highly Consistent and Precise Training-free Visual Editing
by: Yin, Zixin, et al.
Published: (2025)

Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views
by: Chen, Yabo, et al.
Published: (2023)

Self-Ensembling Gaussian Splatting for Few-Shot Novel View Synthesis
by: Zhao, Chen, et al.
Published: (2024)

Recollection from Pensieve: Novel View Synthesis via Learning from Uncalibrated Videos
by: Wang, Ruoyu, et al.
Published: (2025)

NEMTO: Neural Environment Matting for Novel View and Relighting Synthesis of Transparent Objects
by: Wang, Dongqing, et al.
Published: (2023)

BlobCtrl: Taming Controllable Blob for Element-level Image Editing
by: Li, Yaowei, et al.
Published: (2025)

Ctrl-VI: Controllable Video Synthesis via Variational Inference
by: Duan, Haoyi, et al.
Published: (2025)

Asymmetric Idiosyncrasies in Multimodal Models
by: Tao, Muzi, et al.
Published: (2026)

Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs
by: Yeh, Chun-Hsiao, et al.
Published: (2025)

Connecting Joint-Embedding Predictive Architecture with Contrastive Self-supervised Learning
by: Mo, Shentong, et al.
Published: (2024)

Enhancing Close-up Novel View Synthesis via Pseudo-labeling
by: Xia, Jiatong, et al.
Published: (2025)

CloseUpShot: Close-up Novel View Synthesis from Sparse-views via Point-conditioned Diffusion Model
by: Zhang, Yuqi, et al.
Published: (2025)

Pointmap-Conditioned Diffusion for Consistent Novel View Synthesis
by: Nguyen, Thang-Anh-Quan, et al.
Published: (2025)

From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models
by: Fang, Irving, et al.
Published: (2025)

Consistent Time-of-Flight Depth Denoising via Graph-Informed Geometric Attention
by: Wang, Weida, et al.
Published: (2025)

Diffusion Transformers with Representation Autoencoders
by: Zheng, Boyang, et al.
Published: (2025)

View-Consistent 3D Editing with Gaussian Splatting
by: Wang, Yuxuan, et al.
Published: (2024)

SpikeNVS: Enhancing Novel View Synthesis from Blurry Images via Spike Camera
by: Dai, Gaole, et al.
Published: (2024)

RelaCtrl: Relevance-Guided Efficient Control for Diffusion Transformers
by: Cao, Ke, et al.
Published: (2025)

Structure Consistent Gaussian Splatting with Matching Prior for Few-shot Novel View Synthesis
by: Peng, Rui, et al.
Published: (2024)

CMC: Few-shot Novel View Synthesis via Cross-view Multiplane Consistency
by: Zhu, Hanxin, et al.
Published: (2024)

BAFNet: Bilateral Attention Fusion Network for Lightweight Semantic Segmentation of Urban Remote Sensing Images
by: Wang, Wentao, et al.
Published: (2024)

Reconstructing Topology-Consistent Face Mesh by Volume Rendering from Multi-View Images
by: Wang, Yating, et al.
Published: (2024)

Consistent-1-to-3: Consistent Image to 3D View Synthesis via Geometry-aware Diffusion Models
by: Ye, Jianglong, et al.
Published: (2023)

EndoCogniAgent: Closed-Loop Agentic Reasoning with Self-Consistency Validation for Endoscopic Diagnosis
by: Tang, Yi, et al.
Published: (2025)

Training-Free Text-Guided Color Editing with Multi-Modal Diffusion Transformer
by: Yin, Zixin, et al.
Published: (2025)

NerfBaselines: Consistent and Reproducible Evaluation of Novel View Synthesis Methods
by: Kulhanek, Jonas, et al.
Published: (2024)

[CLS] is Not Enough: Multi-Label Recognition via Patch-Level Inference and Adaptive Aggregation
by: Wang, Akang, et al.
Published: (2026)

High-Fidelity Novel View Synthesis via Splatting-Guided Diffusion
by: Zhang, Xiang, et al.
Published: (2025)

Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling
by: Zhang, Guiyu, et al.
Published: (2024)

Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
by: Tong, Shengbang, et al.
Published: (2024)

CtrlVDiff: Controllable Video Generation via Unified Multimodal Video Diffusion
by: Xi, Dianbing, et al.
Published: (2025)

WonderFree: Enhancing Novel View Quality and Cross-View Consistency for 3D Scene Exploration
by: Ni, Chaojun, et al.
Published: (2025)

MeSS: City Mesh-Guided Outdoor Scene Generation with Cross-View Consistent Diffusion
by: Chen, Xuyang, et al.
Published: (2025)

Diff3DS: Generating View-Consistent 3D Sketch via Differentiable Curve Rendering
by: Zhang, Yibo, et al.
Published: (2024)

XScale-NVS: Cross-Scale Novel View Synthesis with Hash Featurized Manifold
by: Wang, Guangyu, et al.
Published: (2024)