:: Library Catalog

Bibliographic Details
Main Authors:	Wan, Zhaoliang; Ling, Yonggen; Yi, Senlin; Qi, Lu; Lee, Wangwei; Lu, Minglei; Yang, Sicheng; Teng, Xiao; Lu, Peng; Yang, Xu; Yang, Ming-Hsuan; Cheng, Hui
Format:	Preprint
Published:	2024
Subjects:	Robotics
Online Access:	https://arxiv.org/abs/2501.00510

Similar Items

Category-level Object Detection, Pose Estimation and Reconstruction from Stereo Images
by: Zhang, Chuanrui, et al.
Published: (2024)

Training-free score-based diffusion for parameter-dependent stochastic dynamical systems
by: Yang, Minglei, et al.
Published: (2026)

RAPID Hand: A Robust, Affordable, Perception-Integrated, Dexterous Manipulation Platform for Generalist Robot Autonomy
by: Wan, Zhaoliang, et al.
Published: (2025)

One Flight Over the Gap: A Survey from Perspective to Panoramic Vision
by: Lin, Xin, et al.
Published: (2025)

When would Vision-Proprioception Policies Fail in Robotic Manipulation?
by: Lu, Jingxian, et al.
Published: (2026)

STRNet: Visual Navigation with Spatio-Temporal Representation through Dynamic Graph Aggregation
by: Ren, Hao, et al.
Published: (2026)

Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint
by: Zhou, Junwei, et al.
Published: (2024)

CoCo4D: Comprehensive and Complex 4D Scene Generation
by: Zhou, Junwei, et al.
Published: (2025)

Spatial-Temporal Multi-level Association for Video Object Segmentation
by: Miao, Deshui, et al.
Published: (2024)

LaVin-DiT: Large Vision Diffusion Transformer
by: Wang, Zhaoqing, et al.
Published: (2024)

Territoires du Vin
Published: (2021)

Vin et altérité
Published: (2022)

Beyond Boundaries: Leveraging Vision Foundation Models for Source-Free Object Detection
by: Yao, Huizai, et al.
Published: (2025)

Scaling Video Pretraining for Surgical Foundation Models
by: Lu, Sicheng, et al.
Published: (2026)

Restage4D: Reanimating Deformable 3D Reconstruction from a Single Video
by: He, Jixuan, et al.
Published: (2025)

Touch100k: A Large-Scale Touch-Language-Vision Dataset for Touch-Centric Multimodal Representation
by: Cheng, Ning, et al.
Published: (2024)

Learning Spatial-Semantic Features for Robust Video Object Segmentation
by: Li, Xin, et al.
Published: (2024)

UniPR: Unified Object-level Real-to-Sim Perception and Reconstruction from a Single Stereo Pair
by: Zhang, Chuanrui, et al.
Published: (2026)

Conditional Pseudo-Reversible Normalizing Flow for Surrogate Modeling in Quantifying Uncertainty Propagation
by: Yang, Minglei, et al.
Published: (2024)

PromptRR: Diffusion Models as Prompt Generators for Single Image Reflection Removal
by: Wang, Tao, et al.
Published: (2024)

Unified Dense Prediction of Video Diffusion
by: Yang, Lehan, et al.
Published: (2025)

Video Prediction Transformers without Recurrence or Convolution
by: Tang, Yujin, et al.
Published: (2024)

Learning Deblurring Texture Prior from Unpaired Data with Diffusion Model
by: Liu, Chengxu, et al.
Published: (2025)

CSL: Class-Agnostic Structure-Constrained Learning for Segmentation Including the Unseen
by: Zhang, Hao, et al.
Published: (2023)

Frequency Domain-Based Diffusion Model for Unpaired Image Dehazing
by: Liu, Chengxu, et al.
Published: (2025)

Fly360: Omnidirectional Obstacle Avoidance within Drone View
by: Zhang, Xiangkai, et al.
Published: (2026)

Fisher-Preserving Guidance: Training-Free Manifold Constraints for Safe Diffusion Control
by: Ren, Hao, et al.
Published: (2026)

Hierarchical Audio-Visual-Proprioceptive Fusion for Precise Robotic Manipulation
by: Li, Siyuan, et al.
Published: (2026)

RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection
by: Yue, Jingtong, et al.
Published: (2025)

Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model
by: Huang, Kuan-Chih, et al.
Published: (2024)

Training Class-Imbalanced Diffusion Model Via Overlap Optimization
by: Yan, Divin, et al.
Published: (2024)

DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale
by: Zuo, Sicheng, et al.
Published: (2026)

Training Dataset for CNXT-Ti-LT Landslide Susceptibility Mapping
by: Luo, Senlin, et al.
Published: (2026)

AnyRotate: Gravity-Invariant In-Hand Object Rotation with Sim-to-Real Touch
by: Yang, Max, et al.
Published: (2024)

Bioactive Feed Additives in Poultry Reproductive Physiology
by: Lu, Wenjie, et al.
Published: (2025)

Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance
by: Huang, Kuan-Chih, et al.
Published: (2023)

SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow
by: Wang, Chaoyang, et al.
Published: (2024)

UMC: Unified Resilient Controller for Legged Robots with Joint Malfunctions
by: Qiu, Yu, et al.
Published: (2025)

Pyramid Diffusion for Fine 3D Large Scene Generation
by: Liu, Yuheng, et al.
Published: (2023)

LLAVADI: What Matters For Multimodal Large Language Models Distillation
by: Xu, Shilin, et al.
Published: (2024)