| Main Authors: | Ren, Jiawei; Xu, Mengmeng; Wu, Jui-Chieh; Liu, Ziwei; Xiang, Tao; Toisoul, Antoine |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2404.07178 |
Similar Items
Material Anything: Generating Materials for Any 3D Object via Diffusion
by: Huang, Xin, et al.
Published: (2024)
GenTron: Diffusion Transformers for Image and Video Generation
by: Chen, Shoufa, et al.
Published: (2023)
Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving
by: Kong, Lingdong, et al.
Published: (2024)
LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation
by: Yang, Shuai, et al.
Published: (2024)
MarDini: Masked Autoregressive Diffusion for Video Generation at Scale
by: Liu, Haozhe, et al.
Published: (2024)
TouchAnything: Diffusion-Guided 3D Reconstruction from Sparse Robot Touches
by: Gu, Langzhe, et al.
Published: (2026)
Anything in Any Scene: Photorealistic Video Object Insertion
by: Bai, Chen, et al.
Published: (2024)
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D
by: Cheng, Wei, et al.
Published: (2024)
VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step
by: Wang, Hanyang, et al.
Published: (2025)
SAM Struggles in Concealed Scenes -- Empirical Study on Segment Anything
by: Ji, Ge-Peng, et al.
Published: (2023)
StructLDM: Structured Latent Diffusion for 3D Human Generation
by: Hu, Tao, et al.
Published: (2024)
SAM2-LOVE: Segment Anything Model 2 in Language-aided Audio-Visual Scenes
by: Wang, Yuji, et al.
Published: (2025)
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation
by: Tang, Jiaxiang, et al.
Published: (2023)
Move-in-2D: 2D-Conditioned Human Motion Generation
by: Huang, Hsin-Ping, et al.
Published: (2024)
Efficient Track Anything
by: Xiong, Yunyang, et al.
Published: (2024)
Segment Anything for Video: A Comprehensive Review of Video Object Segmentation and Tracking from Past to Future
by: Xu, Guoping, et al.
Published: (2025)
FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing
by: Cong, Yuren, et al.
Published: (2023)
Faster Diffusion via Temporal Attention Decomposition
by: Liu, Haozhe, et al.
Published: (2024)
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion
by: Liu, Fangfu, et al.
Published: (2025)
GeoSceneGraph: Geometric Scene Graph Diffusion Model for Text-guided 3D Indoor Scene Synthesis
by: Ruiz, Antonio, et al.
Published: (2025)
Not All Points Are Equal: Uncertainty-Aware 4D LiDAR Scene Synthesis
by: Xu, Xiang, et al.
Published: (2026)
RECITYGEN -- Interactive and Generative Participatory Urban Design Tool with Latent Diffusion and Segment Anything
by: Mo, Di, et al.
Published: (2026)
Personalize Anything for Free with Diffusion Transformer
by: Feng, Haoran, et al.
Published: (2025)
Count Anything at Any Granularity
by: Liu, Chang, et al.
Published: (2026)
LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes
by: Xu, Xiang, et al.
Published: (2025)
Multi-scale Contrastive Adaptor Learning for Segmenting Anything in Underperformed Scenes
by: Zhou, Ke, et al.
Published: (2024)
PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image
by: Cao, Ziang, et al.
Published: (2025)
AffordanceSAM: Segment Anything Once More in Affordance Grounding
by: Jiang, Dengyang, et al.
Published: (2025)
Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression
by: Zhen, Dingcheng, et al.
Published: (2025)
4DNeX: Feed-Forward 4D Generative Modeling Made Easy
by: Chen, Zhaoxi, et al.
Published: (2025)
Taming Outlier Tokens in Diffusion Transformers
by: Wu, Xiaoyu, et al.
Published: (2026)
SynergyAmodal: Deocclude Anything with Text Control
by: Li, Xinyang, et al.
Published: (2025)
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models and Time-Dependent Layer Normalization
by: Liu, Qihao, et al.
Published: (2024)
Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
by: Lin, Weifeng, et al.
Published: (2025)
PDDM: Pseudo Depth Diffusion Model for RGB-PD Semantic Segmentation Based in Complex Indoor Scenes
by: Xu, Xinhua, et al.
Published: (2025)
DiffusionUavLoc: Visually Prompted Diffusion for Cross-View UAV Localization
by: Liu, Tao, et al.
Published: (2025)
3D Scene Generation: A Survey
by: Wen, Beichen, et al.
Published: (2025)
Surgical Depth Anything: Depth Estimation for Surgical Scenes using Foundation Models
by: Lou, Ange, et al.
Published: (2024)
Register Anything: Estimating "Corresponding Prompts" for Segment Anything Model
by: Huang, Shiqi, et al.
Published: (2025)
Char-SAM: Turning Segment Anything Model into Scene Text Segmentation Annotator with Character-level Visual Prompts
by: Xie, Enze, et al.
Published: (2024)