:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Ruiping; Zhang, Jingqi; Zheng, Junwei; Chen, Yufan; Lee, Peter Seungjune; Wen, Di; Peng, Kunyu; Zhang, Jiaming; Yang, Kailun; Mombaur, Katja; Stiefelhagen, Rainer
Format:	Preprint
Published:	2026
Subjects:	Robotics
Online Access:	https://arxiv.org/abs/2603.20121

Similar Items

EgoExoMem: Cross-View Memory Reasoning over Synchronized Egocentric and Exocentric Videos
by: Liu, Ruiping, et al.
Published: (2026)

MateRobot: Material Recognition in Wearable Robotics for People with Visual Impairments
by: Zheng, Junwei, et al.
Published: (2023)

Skeleton-Based Human Action Recognition with Noisy Labels
by: Xu, Yi, et al.
Published: (2024)

Graph-based Document Structure Analysis
by: Chen, Yufan, et al.
Published: (2025)

Open Panoramic Segmentation
by: Zheng, Junwei, et al.
Published: (2024)

Situat3DChange: Situated 3D Change Understanding Dataset for Multimodal Large Language Model
by: Liu, Ruiping, et al.
Published: (2025)

Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-Supervision
by: Wei, Yiping, et al.
Published: (2023)

RoDLA: Benchmarking the Robustness of Document Layout Analysis Models
by: Chen, Yufan, et al.
Published: (2024)

HybriDLA: Hybrid Generation for Document Layout Analysis
by: Chen, Yufan, et al.
Published: (2025)

What if? Emulative Simulation with World Models for Situated Reasoning
by: Liu, Ruiping, et al.
Published: (2026)

Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation
by: Liu, Ruiping, et al.
Published: (2024)

MICA: Multi-Agent Industrial Coordination Assistant
by: Wen, Di, et al.
Published: (2025)

Towards Multi-Source Domain Generalization for Sleep Staging with Noisy Labels
by: Wang, Kening, et al.
Published: (2026)

Exploring Video-Based Driver Activity Recognition under Noisy Labels
by: Fan, Linjuan, et al.
Published: (2025)

Scene-agnostic Pose Regression for Visual Localization
by: Zheng, Junwei, et al.
Published: (2025)

Exploring Self-supervised Skeleton-based Action Recognition in Occluded Environments
by: Chen, Yifei, et al.
Published: (2023)

RHO: Robust Holistic OSM-Based Metric Cross-View Geo-Localization
by: Zheng, Junwei, et al.
Published: (2026)

Referring Atomic Video Action Recognition
by: Peng, Kunyu, et al.
Published: (2024)

RoHOI: Robustness Benchmark for Human-Object Interaction Detection
by: Wen, Di, et al.
Published: (2025)

Snap, Segment, Deploy: A Visual Data and Detection Pipeline for Wearable Industrial Assistants
by: Wen, Di, et al.
Published: (2025)

SGR3 Model: Scene Graph Retrieval-Reasoning Model in 3D
by: Wang, Zirui, et al.
Published: (2026)

TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation
by: Liu, Ruiping, et al.
Published: (2022)

HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios
by: Peng, Kunyu, et al.
Published: (2025)

OAFuser: Towards Omni-Aperture Fusion for Light Field Semantic Segmentation
by: Teng, Fei, et al.
Published: (2023)

$M^2$-Occ: Resilient 3D Semantic Occupancy Prediction for Autonomous Driving with Incomplete Camera Inputs
by: Lin, Kaixin, et al.
Published: (2026)

DriveXQA: Cross-modal Visual Question Answering for Adverse Driving Scene Understanding
by: Tao, Mingzhe, et al.
Published: (2026)

InterEdit: Navigating Text-Guided Multi-Human 3D Motion Editing
by: Yang, Yebin, et al.
Published: (2026)

Comb, Prune, Distill: Towards Unified Pruning for Vision Model Compression
by: Schmitt, Jonas, et al.
Published: (2024)

Rethinking Video Human-Object Interaction: Set Prediction over Time for Unified Detection and Anticipation
by: Luo, Yuanhao, et al.
Published: (2026)

Go Beyond Earth: Understanding Human Actions and Scenes in Microgravity Environments
by: Wen, Di, et al.
Published: (2025)

Exploring Few-Shot Adaptation for Activity Recognition on Diverse Domains
by: Peng, Kunyu, et al.
Published: (2023)

RefAtomNet++: Advancing Referring Atomic Video Action Recognition using Semantic Retrieval based Multi-Trajectory Mamba
by: Peng, Kunyu, et al.
Published: (2025)

Mitigating Label Noise using Prompt-Based Hyperbolic Meta-Learning in Open-Set Domain Generalization
by: Peng, Kunyu, et al.
Published: (2024)

EReLiFM: Evidential Reliability-Aware Residual Flow Meta-Learning for Open-Set Domain Generalization under Noisy Labels
by: Peng, Kunyu, et al.
Published: (2025)

@Bench: Benchmarking Vision-Language Models for Human-centered Assistive Technology
by: Jiang, Xin, et al.
Published: (2024)

Occlusion-Aware Seamless Segmentation
by: Cao, Yihong, et al.
Published: (2024)

OneBEV: Using One Panoramic Image for Bird's-Eye-View Semantic Mapping
by: Wei, Jiale, et al.
Published: (2024)

ObjectFinder: An Open-Vocabulary Assistive System for Interactive Object Search by Blind People
by: Liu, Ruiping, et al.
Published: (2024)

Advancing Open-Set Domain Generalization Using Evidential Bi-Level Hardest Domain Scheduler
by: Peng, Kunyu, et al.
Published: (2024)

CHAOS: Chart Analysis with Outlier Samples
by: Moured, Omar, et al.
Published: (2025)