| Main Author: | Wu, Xiaoran |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2507.14553 |
Similar Items
HaDR: Applying Domain Randomization for Generating Synthetic Multimodal Dataset for Hand Instance Segmentation in Cluttered Industrial Environments
by: Grushko, Stefan, et al.
Published: (2023)
Videogenic: Identifying Highlight Moments in Videos with Professional Photographs as a Prior
by: Lin, David Chuan-En, et al.
Published: (2022)
How to Distinguish AI-Generated Images from Authentic Photographs
by: Kamali, Negar, et al.
Published: (2024)
PoseDriver: A Unified Approach to Multi-Category Skeleton Detection for Autonomous Driving
by: Borhani, Yasamin, et al.
Published: (2026)
ChildCI Framework: Analysis of Motor and Cognitive Development in Children-Computer Interaction for Age Detection
by: Ruiz-Garcia, Juan Carlos, et al.
Published: (2022)
Accessible, At-Home Detection of Parkinson's Disease via Multi-task Video Analysis
by: Islam, Md Saiful, et al.
Published: (2024)
Towards Consumer-Grade Cybersickness Prediction: Multi-Model Alignment for Real-Time Vision-Only Inference
by: Zhu, Yitong, et al.
Published: (2025)
Effective Guidance for Model Attention with Simple Yes-no Annotations
by: Lee, Seongmin, et al.
Published: (2024)
Beyond Questionnaires: Video Analysis for Social Anxiety Detection
by: Sahu, Nilesh Kumar, et al.
Published: (2024)
VoxelKeypointFusion: Generalizable Multi-View Multi-Person Pose Estimation
by: Bermuth, Daniel, et al.
Published: (2024)
A Survey on Drowsiness Detection -- Modern Applications and Methods
by: Fu, Biying, et al.
Published: (2024)
YOLOA: Real-Time Affordance Detection via LLM Adapter
by: Ji, Yuqi, et al.
Published: (2025)
Do Object Detection Localization Errors Affect Human Performance and Trust?
by: de Witte, Sven, et al.
Published: (2024)
Algorithmic Ways of Seeing: Using Object Detection to Facilitate Art Exploration
by: Meyer, Louie Søs, et al.
Published: (2024)
AR-Facilitated Safety Inspection and Fall Hazard Detection on Construction Sites
by: Liu, Jiazhou, et al.
Published: (2024)
Machine Learning-Based Jamun Leaf Disease Detection: A Comprehensive Review
by: Bhowmik, Auvick Chandra, et al.
Published: (2023)
Detecting Clues for Skill Levels and Machine Operation Difficulty from Egocentric Vision
by: Long-fei, Chen, et al.
Published: (2019)
mEBAL2 Database and Benchmark: Image-based Multispectral Eyeblink Detection
by: Daza, Roberto, et al.
Published: (2023)
VFA: Vision Frequency Analysis of Foundation Models and Human
by: Darvishi-Bayazi, Mohammad-Javad, et al.
Published: (2024)
QueryCraft: Transformer-Guided Query Initialization for Enhanced Human-Object Interaction Detection
by: Wang, Yuxiao, et al.
Published: (2025)
Biometrics and Behavior Analysis for Detecting Distractions in e-Learning
by: Becerra, Álvaro, et al.
Published: (2024)
HarassGuard: Detecting Harassment Behaviors in Social Virtual Reality with Vision-Language Models
by: Lee, Junhee, et al.
Published: (2026)
mEBAL: A Multimodal Database for Eye Blink Detection and Attention Level Estimation
by: Daza, Roberto, et al.
Published: (2020)
VisionGPT: LLM-Assisted Real-Time Anomaly Detection for Safe Visual Navigation
by: Wang, Hao, et al.
Published: (2024)
Exploring Thermography Technology: A Comprehensive Facial Dataset for Face Detection, Recognition, and Emotion
by: Abuhussein, Mohamed Fawzi Abdelshafie, et al.
Published: (2024)
Beyond Object Categories: Multi-Attribute Reference Understanding for Visual Grounding
by: Guo, Hao, et al.
Published: (2025)
Designing Multi-Robot Ground Video Sensemaking with Public Safety Professionals
by: Zhou, Puqi, et al.
Published: (2026)
Enhancing Apparent Personality Trait Analysis with Cross-Modal Embeddings
by: Fodor, Ádám, et al.
Published: (2024)
AI-Enhanced Virtual Reality in Medicine: A Comprehensive Survey
by: Wu, Yixuan, et al.
Published: (2024)
CADDI: An in-Class Activity Detection Dataset using IMU data from low-cost sensors
by: Marquez-Carpintero, Luis, et al.
Published: (2025)
A Comparison of Bounding Box and Landmark Detection Methods for Video-Based Heart Rate Estimation
by: Liang, Laurence
Published: (2023)
MAGE: A Multi-task Architecture for Gaze Estimation with an Efficient Calibration Module
by: Huang, Haoming, et al.
Published: (2025)
Development of a Mobile Application for at-Home Analysis of Retinal Fundus Images
by: Reid, Mattea, et al.
Published: (2025)
Detecting Activities of Daily Living in Egocentric Video to Contextualize Hand Use at Home in Outpatient Neurorehabilitation Settings
by: Kadambi, Adesh, et al.
Published: (2024)
Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer based Dim Object Detection
by: Li, Zixing, et al.
Published: (2024)
PromptArtisan: Multi-instruction Image Editing in Single Pass with Complete Attention Control
by: Swami, Kunal, et al.
Published: (2025)
Multiscaled Multi-Head Attention-based Video Transformer Network for Hand Gesture Recognition
by: Garg, Mallika, et al.
Published: (2025)
AltChart: Enhancing VLM-based Chart Summarization Through Multi-Pretext Tasks
by: Moured, Omar, et al.
Published: (2024)
AGILE3D: Attention Guided Interactive Multi-object 3D Segmentation
by: Yue, Yuanwen, et al.
Published: (2023)
A Deep Learning Framework for Visual Attention Prediction and Analysis of News Interfaces
by: Kenely, Matthew, et al.
Published: (2025)