| Main Authors: | Zhou, Donghao, Liu, Guisheng, Yang, Hao, Li, Jiatong, Lin, Jingyu, Huang, Xiaohu, Liu, Yichen, Gao, Xin, Chen, Cunjian, Wen, Shilei, Fu, Chi-Wing, Heng, Pheng-Ann |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2604.11804 |
Similar Items
HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images
by: Liu, Yichen, et al.
Published: (2026)
IdentityStory: Taming Your Identity-Preserving Generator for Human-Centric Story Generation
by: Zhou, Donghao, et al.
Published: (2025)
UniHOPE: A Unified Approach for Hand-Only and Hand-Object Pose Estimation
by: Wang, Yinqiao, et al.
Published: (2025)
SceneDecorator: Towards Scene-Oriented Story Generation with Scene Planning and Scene Consistency
by: Song, Quanjian, et al.
Published: (2025)
DisCo-Layout: Disentangling and Coordinating Semantic and Physical Refinement in a Multi-Agent Framework for 3D Indoor Layout Synthesis
by: Gao, Jialin, et al.
Published: (2025)
Coordinated 2D-3D Visualization of Volumetric Medical Data in XR with Multimodal Interactions
by: Liu, Qixuan, et al.
Published: (2025)
Unveiling Deep Shadows: A Survey and Benchmark on Image and Video Shadow Detection, Removal, and Generation in the Deep Learning Era
by: Hu, Xiaowei, et al.
Published: (2024)
SiMA-Hand: Boosting 3D Hand-Mesh Reconstruction by Single-to-Multi-View Adaptation
by: Wang, Yinqiao, et al.
Published: (2024)
CvhSlicer 2.0: Immersive and Interactive Visualization of Chinese Visible Human Data in XR Environments
by: Qiu, Yue, et al.
Published: (2025)
Video Instance Shadow Detection Under the Sun and Sky
by: Xing, Zhenghao, et al.
Published: (2022)
EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning
by: Xing, Zhenghao, et al.
Published: (2025)
Overcoming Support Dilution for Robust Few-shot Semantic Segmentation
by: Tang, Wailing, et al.
Published: (2025)
JoVA: Unified Multimodal Learning for Joint Video-Audio Generation
by: Huang, Xiaohu, et al.
Published: (2025)
Hand-Shadow Poser
by: Xu, Hao, et al.
Published: (2025)
OPA-Pack: Object-Property-Aware Robotic Bin Packing
by: Pan, Jia-Hui, et al.
Published: (2025)
Towards Real-World Adverse Weather Image Restoration: Enhancing Clearness and Semantics with Vision-Language Models
by: Xu, Jiaqi, et al.
Published: (2024)
Revisiting Shadow Detection: A New Benchmark Dataset for Complex World
by: Hu, Xiaowei, et al.
Published: (2019)
Unifying Physically-Informed Weather Priors in A Single Model for Image Restoration Across Multiple Adverse Weather Conditions
by: Xu, Jiaqi, et al.
Published: (2026)
Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting
by: Zhu, Runsong, et al.
Published: (2025)
Fast-in-Slow: A Dual-System Foundation Model Unifying Fast Manipulation within Slow Reasoning
by: Chen, Hao, et al.
Published: (2025)
PCF-Lift: Panoptic Lifting by Probabilistic Contrastive Fusion
by: Zhu, Runsong, et al.
Published: (2024)
HandBooster: Boosting 3D Hand-Mesh Reconstruction by Conditional Synthesis and Sampling of Hand-Object Interactions
by: Xu, Hao, et al.
Published: (2024)
HERO: Hierarchical Extrapolation and Refresh for Efficient World Models
by: Song, Quanjian, et al.
Published: (2025)
Deep Omni-supervised Learning for Rib Fracture Detection from Chest Radiology Images
by: Chai, Zhizhong, et al.
Published: (2023)
Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making
by: Wang, Yihan, et al.
Published: (2025)
Tele-Omni: a Unified Multimodal Framework for Video Generation and Editing
by: Liu, Jialun, et al.
Published: (2026)
COS3D: Collaborative Open-Vocabulary 3D Segmentation
by: Zhu, Runsong, et al.
Published: (2025)
Distribution-Aware Calibration for Object Detection with Noisy Bounding Boxes
by: Zhou, Donghao, et al.
Published: (2023)
A Collaborative Extended Reality Prototype for 3D Surgical Planning and Visualization
by: Qiu, Shi, et al.
Published: (2026)
RiboSphere: Learning Unified and Efficient Representations of RNA Structures
by: Zhang, Zhou, et al.
Published: (2026)
MedHallTune: An Instruction-Tuning Benchmark for Mitigating Medical Hallucination in Vision-Language Models
by: Yan, Qiao, et al.
Published: (2025)
EgoHandICL: Egocentric 3D Hand Reconstruction with In-Context Learning
by: Xie, Binzhu, et al.
Published: (2026)
Point Cloud Understanding via Attention-Driven Contrastive Learning
by: Wang, Yi, et al.
Published: (2024)
MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding
by: Wang, Jiaze, et al.
Published: (2024)
CHOIR: Contact-aware 4D Hand-Object Interaction Reconstruction
by: Xu, Hao, et al.
Published: (2026)
Category Query Learning for Human-Object Interaction Classification
by: Xie, Chi, et al.
Published: (2023)
OmniSTVG: Toward Spatio-Temporal Omni-Object Video Grounding
by: Yao, Jiali, et al.
Published: (2025)
Adapting Segment Anything Model for Unseen Object Instance Segmentation
by: Cao, Rui, et al.
Published: (2024)
Rethinking Intermediate Representation for VLM-based Robot Manipulation
by: Tang, Weiliang, et al.
Published: (2025)
Omni-Captioner: Data Pipeline, Models, and Benchmark for Omni Detailed Perception
by: Ma, Ziyang, et al.
Published: (2025)