:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Haitian; Wang, Yiren; Wang, Xinyu; Fung, Sheldon; Mansoor, Atif
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.17069

Similar Items

P2MFDS: A Privacy-Preserving Multimodal Fall Detection System for Elderly People in Bathroom Environments
by: Wang, Haitian, et al.
Published: (2025)

Quantization-Aware Neuromorphic Architecture for Skin Disease Classification on Resource-Constrained Devices
by: Wang, Haitian, et al.
Published: (2025)

Geo-Registration of Terrestrial LiDAR Point Clouds with Satellite Images without GNSS
by: Wang, Xinyu, et al.
Published: (2025)

Automated Road Extraction and Centreline Fitting in LiDAR Point Clouds
by: Wang, Xinyu, et al.
Published: (2025)

Multispectral Remote Sensing for Weed Detection in West Australian Agricultural Lands
by: Wang, Haitian, et al.
Published: (2025)

BAWSeg: A UAV Multispectral Benchmark for Barley Weed Segmentation
by: Wang, Haitian, et al.
Published: (2026)

OmniHumanoid: Streaming Cross-Embodiment Video Generation with Paired-Free Adaptation
by: Song, Yiren, et al.
Published: (2026)

Unlocking the Latent Canvas: Eliciting and Benchmarking Symbolic Visual Expression in LLMs
by: Zheng, Yiren, et al.
Published: (2026)

Subpixel Edge Localization Based on Converted Intensity Summation under Stable Edge Region
by: Yang, Yingyuan, et al.
Published: (2025)

EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer
by: Zhang, Yuxuan, et al.
Published: (2025)

StreamingEffect: Real-Time Human-Centric Video Effect Generation
by: Song, Yiren, et al.
Published: (2026)

Active Measurement of Two-Point Correlations
by: Hamilton, Max, et al.
Published: (2026)

Detecting Every Object from Events
by: Zhang, Haitian, et al.
Published: (2024)

TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals
by: Vedernikov, Alexander, et al.
Published: (2024)

Codebook-Based Adaptive Feature Compression With Semantic Enhancement for Edge-Cloud Systems
by: Wang, Xinyu, et al.
Published: (2025)

HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices
by: HyperAI Team, et al.
Published: (2025)

Multimodal Priors-Augmented Text-Driven 3D Human-Object Interaction Generation
by: Wang, Yin, et al.
Published: (2026)

LiDAR-based 3D Change Detection at City Scale
by: Albagami, Hezam, et al.
Published: (2025)

Multistream Network for LiDAR and Camera-based 3D Object Detection in Outdoor Scenes
by: Ibrahim, Muhammad, et al.
Published: (2025)

Efficient Detection Framework Adaptation for Edge Computing: A Plug-and-play Neural Network Toolbox Enabling Edge Deployment
by: Wu, Jiaqi, et al.
Published: (2024)

Rethinking the Architecture Design for Efficient Generic Event Boundary Detection
by: Zheng, Ziwei, et al.
Published: (2024)

DualStreamFoveaNet: A Dual Stream Fusion Architecture with Anatomical Awareness for Robust Fovea Localization
by: Song, Sifan, et al.
Published: (2023)

Robots Autonomously Detecting People: A Multimodal Deep Contrastive Learning Method Robust to Intraclass Variations
by: Fung, Angus, et al.
Published: (2022)

Data-Efficient Stream-Based Active Distillation for Scalable Edge Model Deployment
by: Manjah, Dani, et al.
Published: (2025)

DAGE: Dual-Stream Architecture for Efficient and Fine-Grained Geometry Estimation
by: Ngo, Tuan Duc, et al.
Published: (2026)

Task-Driven Fixation Network: An Efficient Architecture with Fixation Selection
by: Wang, Shuguang, et al.
Published: (2025)

SIGMA: Selective-Interleaved Generation with Multi-Attribute Tokens
by: Zhang, Xiaoyan, et al.
Published: (2026)

Multi-Representation Adapter with Neural Architecture Search for Efficient Range-Doppler Radar Object Detection
by: Lin, Zhiwei, et al.
Published: (2025)

Complementing Event Streams and RGB Frames for Hand Mesh Reconstruction
by: Jiang, Jianping, et al.
Published: (2024)

MCA-Bench: A Multimodal Benchmark for Evaluating CAPTCHA Robustness Against VLM-based Attacks
by: Wu, Zonglin, et al.
Published: (2025)

StreamingCoT: A Dataset for Temporal Dynamics and Multimodal Chain-of-Thought Reasoning in Streaming VideoQA
by: Hu, Yuhang, et al.
Published: (2025)

AnxietyFaceTrack: A Smartphone-Based Non-Intrusive Approach for Detecting Social Anxiety Using Facial Features
by: Sahu, Nilesh Kumar, et al.
Published: (2025)

Can Large Vision-Language Models Understand Multimodal Sarcasm?
by: Wang, Xinyu, et al.
Published: (2025)

NeXt2Former-CD: Efficient Remote Sensing Change Detection with Modern Vision Architectures
by: Wang, Yufan, et al.
Published: (2026)

EMCompress: Video-LLMs with Endomorphic Multimodal Compression
by: Fan, Zheyu, et al.
Published: (2025)

StreamingTOM: Streaming Token Compression for Efficient Video Understanding
by: Chen, Xueyi, et al.
Published: (2025)

An Improved 3D Skeletons UP-Fall Dataset: Enhancing Data Quality for Efficient Impact Fall Detection
by: Koffi, Tresor Y., et al.
Published: (2025)

AViLA: Asynchronous Vision-Language Agent for Streaming Multimodal Data Interaction
by: Zhang, Gengyuan, et al.
Published: (2025)

ModaVerse: Efficiently Transforming Modalities with LLMs
by: Wang, Xinyu, et al.
Published: (2024)

LiNeXt: Revisiting LiDAR Completion with Efficient Non-Diffusion Architectures
by: He, Wenzhe, et al.
Published: (2025)