| Main Authors: | Xiang, Tong, Zhao, Hongxia, Zhu, Fenghua, Chen, Yuanyuan, Lv, Yisheng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.13823 |
Similar Items
OpenCOOD-Air: Prompting Heterogeneous Ground-Air Collaborative Perception with Spatial Conversion and Offset Prediction
by: Wu, Xianke, et al.
Published: (2026)
CogRail: Benchmarking VLMs in Cognitive Intrusion Perception for Intelligent Railway Transportation Systems
by: Tian, Yonglin, et al.
Published: (2026)
RoadSceneVQA: Benchmarking Visual Question Answering in Roadside Perception Systems for Intelligent Transportation System
by: Guan, Runwei, et al.
Published: (2025)
Manipulation as in Simulation: Enabling Accurate Geometry Perception in Robots
by: Liu, Minghuan, et al.
Published: (2025)
CATNet: Collaborative Alignment and Transformation Network for Cooperative Perception
by: Chen, Gong, et al.
Published: (2026)
Med-Scout: Curing MLLMs' Geometric Blindness in Medical Perception via Geometry-Aware RL Post-Training
by: Liu, Anglin, et al.
Published: (2026)
MambaOcc: Visual State Space Model for BEV-based Occupancy Prediction with Local Adaptive Reordering
by: Tian, Yonglin, et al.
Published: (2024)
SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models
by: Yue, Tongtian, et al.
Published: (2024)
DeepSORT-Driven Visual Tracking Approach for Gesture Recognition in Interactive Systems
by: Zhang, Tong, et al.
Published: (2025)
MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details
by: Wang, Ruicheng, et al.
Published: (2025)
Edge Computing Enabled Real-Time Video Analysis via Adaptive Spatial-Temporal Semantic Filtering
by: Chen, Xiang, et al.
Published: (2024)
Facial Dynamics in Video: Instruction Tuning for Improved Facial Expression Perception and Contextual Awareness
by: Zhao, Jiaxing, et al.
Published: (2025)
Split-Fuse-Transport: Annotation-Free Saliency via Dual Clustering and Optimal Transport Alignment
by: Ramzan, Muhammad Umer, et al.
Published: (2025)
Computer Vision-Driven Gesture Recognition: Toward Natural and Intuitive Human-Computer
by: Shao, Fenghua, et al.
Published: (2024)
Gating Syn-to-Real Knowledge for Pedestrian Crossing Prediction in Safe Driving
by: Bai, Jie, et al.
Published: (2024)
MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving
by: Zhang, Enming, et al.
Published: (2024)
Self-Supervised Event Representations: Towards Accurate, Real-Time Perception on SoC FPGAs
by: Jeziorek, Kamil, et al.
Published: (2025)
Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language Models
by: Weng, Fenghua, et al.
Published: (2025)
Enabling Intelligent Traffic Systems: A Deep Learning Method for Accurate Arabic License Plate Recognition
by: Sayedelahl, M. A.
Published: (2024)
Self-Supervised Alignment Learning for Medical Image Segmentation
by: Li, Haofeng, et al.
Published: (2024)
Hierarchical Self-Prompting SAM: A Prompt-Free Medical Image Segmentation Framework
by: Zhang, Mengmeng, et al.
Published: (2025)
AdaDrive: Self-Adaptive Slow-Fast System for Language-Grounded Autonomous Driving
by: Zhang, Ruifei, et al.
Published: (2025)
KPLM-STA: Physically-Accurate Shadow Synthesis for Human Relighting via Keypoint-Based Light Modeling
by: Yin, Xinhui, et al.
Published: (2025)
GRAM-MAMBA: Holistic Feature Alignment for Wireless Perception with Adaptive Low-Rank Compensation
by: Yang, Weiqi, et al.
Published: (2025)
All-in-One Transferring Image Compression from Human Perception to Multi-Machine Perception
by: Zhao, Jiancheng, et al.
Published: (2025)
Reconstructing Building Height from Spaceborne TomoSAR Point Clouds Using a Dual-Topology Network
by: Chen, Zhaiyu, et al.
Published: (2026)
Facial Action Unit Detection by Adaptively Constraining Self-Attention and Causally Deconfounding Sample
by: Shao, Zhiwen, et al.
Published: (2024)
RT-DATR: Real-time Unsupervised Domain Adaptive Detection Transformer with Adversarial Feature Alignment
by: Lv, Feng, et al.
Published: (2025)
MeshLAM: Feed-Forward One-Shot Animatable Textured Mesh Avatar Reconstruction
by: He, Yisheng, et al.
Published: (2026)
UniAlignment: Semantic Alignment for Unified Image Generation, Understanding, Manipulation and Perception
by: Song, Xinyang, et al.
Published: (2025)
GVGS: Gaussian Visibility-Aware Multi-View Geometry for Accurate Surface Reconstruction
by: Su, Mai, et al.
Published: (2026)
Self-Localized Collaborative Perception
by: Ni, Zhenyang, et al.
Published: (2024)
EdgePoint2: Compact Descriptors for Superior Efficiency and Accuracy
by: Yao, Haodi, et al.
Published: (2025)
Pre-training CLIP against Data Poisoning with Optimal Transport-based Matching and Alignment
by: Zhang, Tong, et al.
Published: (2025)
Degradation-Aware Residual-Conditioned Optimal Transport for Unified Image Restoration
by: Tang, Xiaole, et al.
Published: (2024)
SARL: Spatially-Aware Self-Supervised Representation Learning for Visuo-Tactile Perception
by: Khurana, Gurmeher, et al.
Published: (2025)
Enabling Fast and Accurate Crowdsourced Annotation for Elevation-Aware Flood Extent Mapping
by: Dyken, Landon, et al.
Published: (2024)
FeaKM: Robust Collaborative Perception under Noisy Pose Conditions
by: Hao, Jiuwu, et al.
Published: (2025)
DA-Mamba: Learning Domain-Aware State Space Model for Global-Local Alignment in Domain Adaptive Object Detection
by: Li, Haochen, et al.
Published: (2026)
MOGeo: Beyond One-to-One Cross-View Object Geo-localization
by: Lv, Bo, et al.
Published: (2026)