| Main Authors: | Xiang, Jianfeng, Chen, Xiaoxue, Xu, Sicheng, Wang, Ruicheng, Lv, Zelong, Deng, Yu, Zhu, Hongyuan, Dong, Yue, Zhao, Hao, Yuan, Nicholas Jing, Yang, Jiaolong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.14692 |
Similar Items
Structured 3D Latents for Scalable and Versatile 3D Generation
by: Xiang, Jianfeng, et al.
Published: (2024)
MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details
by: Wang, Ruicheng, et al.
Published: (2025)
MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
by: Wang, Ruicheng, et al.
Published: (2024)
Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors
by: Wang, Ruicheng, et al.
Published: (2024)
Beyond Voxel 3D Editing: Learning from 3D Masks and Self-Constructed Data
by: Xu, Yizhao, et al.
Published: (2026)
Map2World: Segment Map Conditioned Text to 3D World Generation
by: Chung, Jaeyoung, et al.
Published: (2026)
Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis
by: Zhang, Bowen, et al.
Published: (2025)
VASA-3D: Lifelike Audio-Driven Gaussian Head Avatars from a Single Image
by: Xu, Sicheng, et al.
Published: (2025)
SS4D: Native 4D Generative Model via Structured Spacetime Latents
by: Li, Zhibing, et al.
Published: (2025)
HiSpatial: Taming Hierarchical 3D Spatial Understanding in Vision-Language Models
by: Liang, Huizhi, et al.
Published: (2026)
Real-Time Generation of Streamable Talking Portrait Video with Reference-Guided Deep Compression VAEs
by: Xu, Sicheng, et al.
Published: (2026)
LaFiTe: A Generative Latent Field for 3D Native Texturing
by: Chen, Chia-Hao, et al.
Published: (2025)
GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling
by: Zhang, Bowen, et al.
Published: (2024)
TEXTRIX: Latent Attribute Grid for Native Texture Generation and Beyond
by: Zeng, Yifei, et al.
Published: (2025)
CoTMR: Chain-of-Thought Multi-Scale Reasoning for Training-Free Zero-Shot Composed Image Retrieval
by: Sun, Zelong, et al.
Published: (2025)
TaskGround: Structured Executable Task Inference for Full-Scene Household Reasoning
by: Feng, ZhiYuan, et al.
Published: (2026)
Morse Index Classification and Landscape of Kuramoto System for Hebbian-based Binary Pattern Recognition
by: Zhao, Xiaoxue, et al.
Published: (2025)
Critical Points, Stability, and Basins of Attraction of Three Kuramoto Oscillators with Isosceles Triangle Network
by: Zhao, Xiaoxue, et al.
Published: (2024)
Why does the two-timescale Q-learning converge to different mean field solutions? A unified convergence analysis
by: An, Jing, et al.
Published: (2024)
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
by: Xu, Sicheng, et al.
Published: (2024)
Towards Native Generative Model for 3D Head Avatar
by: Zhuang, Yiyu, et al.
Published: (2024)
Say Cheese! Detail-Preserving Portrait Collection Generation via Natural Language Edits
by: Sun, Zelong, et al.
Published: (2026)
A Unified Post-Processing Framework for Group Fairness in Classification
by: Xian, Ruicheng, et al.
Published: (2024)
Contrastive Conditional Latent Diffusion for Audio-visual Segmentation
by: Mao, Yuxin, et al.
Published: (2023)
Detail-Preserving Latent Diffusion for Stable Shadow Removal
by: Xu, Jiamin, et al.
Published: (2024)
CT-MVSNet: Efficient Multi-View Stereo with Cross-scale Transformer
by: Wang, Sicheng, et al.
Published: (2023)
Initial-State Typicality in Quantum Relaxation
by: Bao, Ruicheng
Published: (2025)
RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models
by: Zhang, Bowen, et al.
Published: (2024)
Preemptive Holistic Collaborative System and Its Application in Road Transportation
by: Li, Yuan, et al.
Published: (2024)
Generating Moving 3D Soundscapes with Latent Diffusion Models
by: Templin, Christian, et al.
Published: (2025)
KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration
by: Zhang, Ruicheng, et al.
Published: (2026)
4D Gaussian Splatting: Modeling Dynamic Scenes with Native 4D Primitives
by: Yang, Zeyu, et al.
Published: (2024)
Equivariant Neural Networks for General Linear Symmetries on Lie Algebras
by: Kim, Chankyo, et al.
Published: (2025)
Combined Crank‐Slider and Rack‐Gear Structures for Energy Harvesting and Impact Reduction of the Car Door
by: Zelong Zhao, et al.
Published: (2025)
DC-Scene: Data-Centric Learning for 3D Scene Understanding
by: Huang, Ting, et al.
Published: (2025)
A Variational Autoencoder for Neural Temporal Point Processes with Dynamic Latent Graphs
by: Yang, Sikun, et al.
Published: (2023)
Holistic view of the road transportation system based on real-time data sharing mechanism
by: Li, Tao, et al.
Published: (2024)
Adv-CPG: A Customized Portrait Generation Framework with Facial Adversarial Attacks
by: Wang, Junying, et al.
Published: (2025)
Bridge then Begin Anew: Generating Target-relevant Intermediate Model for Source-free Visual Emotion Adaptation
by: Zhu, Jiankun, et al.
Published: (2024)
TEFormer: Structured Bidirectional Temporal Enhancement Modeling in Spiking Transformers
by: Shen, Sicheng, et al.
Published: (2026)