Library Catalog

Bibliographic Details
Main Authors:	Xiang, Jianfeng; Chen, Xiaoxue; Xu, Sicheng; Wang, Ruicheng; Lv, Zelong; Deng, Yu; Zhu, Hongyuan; Dong, Yue; Zhao, Hao; Yuan, Nicholas Jing; Yang, Jiaolong
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2512.14692

Similar Items

Structured 3D Latents for Scalable and Versatile 3D Generation
by: Xiang, Jianfeng, et al.
Published: (2024)

MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details
by: Wang, Ruicheng, et al.
Published: (2025)

MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
by: Wang, Ruicheng, et al.
Published: (2024)

Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors
by: Wang, Ruicheng, et al.
Published: (2024)

Beyond Voxel 3D Editing: Learning from 3D Masks and Self-Constructed Data
by: Xu, Yizhao, et al.
Published: (2026)

Map2World: Segment Map Conditioned Text to 3D World Generation
by: Chung, Jaeyoung, et al.
Published: (2026)

Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis
by: Zhang, Bowen, et al.
Published: (2025)

VASA-3D: Lifelike Audio-Driven Gaussian Head Avatars from a Single Image
by: Xu, Sicheng, et al.
Published: (2025)

SS4D: Native 4D Generative Model via Structured Spacetime Latents
by: Li, Zhibing, et al.
Published: (2025)

HiSpatial: Taming Hierarchical 3D Spatial Understanding in Vision-Language Models
by: Liang, Huizhi, et al.
Published: (2026)

Real-Time Generation of Streamable Talking Portrait Video with Reference-Guided Deep Compression VAEs
by: Xu, Sicheng, et al.
Published: (2026)

LaFiTe: A Generative Latent Field for 3D Native Texturing
by: Chen, Chia-Hao, et al.
Published: (2025)

GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling
by: Zhang, Bowen, et al.
Published: (2024)

TEXTRIX: Latent Attribute Grid for Native Texture Generation and Beyond
by: Zeng, Yifei, et al.
Published: (2025)

CoTMR: Chain-of-Thought Multi-Scale Reasoning for Training-Free Zero-Shot Composed Image Retrieval
by: Sun, Zelong, et al.
Published: (2025)

TaskGround: Structured Executable Task Inference for Full-Scene Household Reasoning
by: Feng, ZhiYuan, et al.
Published: (2026)

Morse Index Classification and Landscape of Kuramoto System for Hebbian-based Binary Pattern Recognition
by: Zhao, Xiaoxue, et al.
Published: (2025)

Critical Points, Stability, and Basins of Attraction of Three Kuramoto Oscillators with Isosceles Triangle Network
by: Zhao, Xiaoxue, et al.
Published: (2024)

Why does the two-timescale Q-learning converge to different mean field solutions? A unified convergence analysis
by: An, Jing, et al.
Published: (2024)

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
by: Xu, Sicheng, et al.
Published: (2024)

Towards Native Generative Model for 3D Head Avatar
by: Zhuang, Yiyu, et al.
Published: (2024)

Say Cheese! Detail-Preserving Portrait Collection Generation via Natural Language Edits
by: Sun, Zelong, et al.
Published: (2026)

A Unified Post-Processing Framework for Group Fairness in Classification
by: Xian, Ruicheng, et al.
Published: (2024)

Contrastive Conditional Latent Diffusion for Audio-visual Segmentation
by: Mao, Yuxin, et al.
Published: (2023)

Detail-Preserving Latent Diffusion for Stable Shadow Removal
by: Xu, Jiamin, et al.
Published: (2024)

CT-MVSNet: Efficient Multi-View Stereo with Cross-scale Transformer
by: Wang, Sicheng, et al.
Published: (2023)

Initial-State Typicality in Quantum Relaxation
by: Bao, Ruicheng
Published: (2025)

RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models
by: Zhang, Bowen, et al.
Published: (2024)

Preemptive Holistic Collaborative System and Its Application in Road Transportation
by: Li, Yuan, et al.
Published: (2024)

Generating Moving 3D Soundscapes with Latent Diffusion Models
by: Templin, Christian, et al.
Published: (2025)

KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration
by: Zhang, Ruicheng, et al.
Published: (2026)

4D Gaussian Splatting: Modeling Dynamic Scenes with Native 4D Primitives
by: Yang, Zeyu, et al.
Published: (2024)

Equivariant Neural Networks for General Linear Symmetries on Lie Algebras
by: Kim, Chankyo, et al.
Published: (2025)

Combined Crank‐Slider and Rack‐Gear Structures for Energy Harvesting and Impact Reduction of the Car Door
by: Zhao, Zelong, et al.
Published: (2025)

DC-Scene: Data-Centric Learning for 3D Scene Understanding
by: Huang, Ting, et al.
Published: (2025)

A Variational Autoencoder for Neural Temporal Point Processes with Dynamic Latent Graphs
by: Yang, Sikun, et al.
Published: (2023)

Holistic view of the road transportation system based on real-time data sharing mechanism
by: Li, Tao, et al.
Published: (2024)

Adv-CPG: A Customized Portrait Generation Framework with Facial Adversarial Attacks
by: Wang, Junying, et al.
Published: (2025)

Bridge then Begin Anew: Generating Target-relevant Intermediate Model for Source-free Visual Emotion Adaptation
by: Zhu, Jiankun, et al.
Published: (2024)

TEFormer: Structured Bidirectional Temporal Enhancement Modeling in Spiking Transformers
by: Shen, Sicheng, et al.
Published: (2026)