| Main Authors: | Liu, Ruiping, Zhang, Jingqi, Zheng, Junwei, Chen, Yufan, Lee, Peter Seungjune, Wen, Di, Peng, Kunyu, Zhang, Jiaming, Yang, Kailun, Mombaur, Katja, Stiefelhagen, Rainer |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.20121 |
Similar Items
EgoExoMem: Cross-View Memory Reasoning over Synchronized Egocentric and Exocentric Videos
by: Liu, Ruiping, et al.
Published: (2026)
MateRobot: Material Recognition in Wearable Robotics for People with Visual Impairments
by: Zheng, Junwei, et al.
Published: (2023)
Skeleton-Based Human Action Recognition with Noisy Labels
by: Xu, Yi, et al.
Published: (2024)
Graph-based Document Structure Analysis
by: Chen, Yufan, et al.
Published: (2025)
Open Panoramic Segmentation
by: Zheng, Junwei, et al.
Published: (2024)
Situat3DChange: Situated 3D Change Understanding Dataset for Multimodal Large Language Model
by: Liu, Ruiping, et al.
Published: (2025)
Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-Supervision
by: Wei, Yiping, et al.
Published: (2023)
RoDLA: Benchmarking the Robustness of Document Layout Analysis Models
by: Chen, Yufan, et al.
Published: (2024)
HybriDLA: Hybrid Generation for Document Layout Analysis
by: Chen, Yufan, et al.
Published: (2025)
What if? Emulative Simulation with World Models for Situated Reasoning
by: Liu, Ruiping, et al.
Published: (2026)
Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation
by: Liu, Ruiping, et al.
Published: (2024)
MICA: Multi-Agent Industrial Coordination Assistant
by: Wen, Di, et al.
Published: (2025)
Towards Multi-Source Domain Generalization for Sleep Staging with Noisy Labels
by: Wang, Kening, et al.
Published: (2026)
Exploring Video-Based Driver Activity Recognition under Noisy Labels
by: Fan, Linjuan, et al.
Published: (2025)
Scene-agnostic Pose Regression for Visual Localization
by: Zheng, Junwei, et al.
Published: (2025)
Exploring Self-supervised Skeleton-based Action Recognition in Occluded Environments
by: Chen, Yifei, et al.
Published: (2023)
RHO: Robust Holistic OSM-Based Metric Cross-View Geo-Localization
by: Zheng, Junwei, et al.
Published: (2026)
Referring Atomic Video Action Recognition
by: Peng, Kunyu, et al.
Published: (2024)
RoHOI: Robustness Benchmark for Human-Object Interaction Detection
by: Wen, Di, et al.
Published: (2025)
Snap, Segment, Deploy: A Visual Data and Detection Pipeline for Wearable Industrial Assistants
by: Wen, Di, et al.
Published: (2025)
SGR3 Model: Scene Graph Retrieval-Reasoning Model in 3D
by: Wang, Zirui, et al.
Published: (2026)
TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation
by: Liu, Ruiping, et al.
Published: (2022)
HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios
by: Peng, Kunyu, et al.
Published: (2025)
OAFuser: Towards Omni-Aperture Fusion for Light Field Semantic Segmentation
by: Teng, Fei, et al.
Published: (2023)
$M^2$-Occ: Resilient 3D Semantic Occupancy Prediction for Autonomous Driving with Incomplete Camera Inputs
by: Lin, Kaixin, et al.
Published: (2026)
DriveXQA: Cross-modal Visual Question Answering for Adverse Driving Scene Understanding
by: Tao, Mingzhe, et al.
Published: (2026)
InterEdit: Navigating Text-Guided Multi-Human 3D Motion Editing
by: Yang, Yebin, et al.
Published: (2026)
Comb, Prune, Distill: Towards Unified Pruning for Vision Model Compression
by: Schmitt, Jonas, et al.
Published: (2024)
Rethinking Video Human-Object Interaction: Set Prediction over Time for Unified Detection and Anticipation
by: Luo, Yuanhao, et al.
Published: (2026)
Go Beyond Earth: Understanding Human Actions and Scenes in Microgravity Environments
by: Wen, Di, et al.
Published: (2025)
Exploring Few-Shot Adaptation for Activity Recognition on Diverse Domains
by: Peng, Kunyu, et al.
Published: (2023)
RefAtomNet++: Advancing Referring Atomic Video Action Recognition using Semantic Retrieval based Multi-Trajectory Mamba
by: Peng, Kunyu, et al.
Published: (2025)
Mitigating Label Noise using Prompt-Based Hyperbolic Meta-Learning in Open-Set Domain Generalization
by: Peng, Kunyu, et al.
Published: (2024)
EReLiFM: Evidential Reliability-Aware Residual Flow Meta-Learning for Open-Set Domain Generalization under Noisy Labels
by: Peng, Kunyu, et al.
Published: (2025)
@Bench: Benchmarking Vision-Language Models for Human-centered Assistive Technology
by: Jiang, Xin, et al.
Published: (2024)
Occlusion-Aware Seamless Segmentation
by: Cao, Yihong, et al.
Published: (2024)
OneBEV: Using One Panoramic Image for Bird's-Eye-View Semantic Mapping
by: Wei, Jiale, et al.
Published: (2024)
ObjectFinder: An Open-Vocabulary Assistive System for Interactive Object Search by Blind People
by: Liu, Ruiping, et al.
Published: (2024)
Advancing Open-Set Domain Generalization Using Evidential Bi-Level Hardest Domain Scheduler
by: Peng, Kunyu, et al.
Published: (2024)
CHAOS: Chart Analysis with Outlier Samples
by: Moured, Omar, et al.
Published: (2025)