| Main Authors: | Hong, Xin; Da, Longchao; Wei, Hua |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2507.19818 |
Similar Items
CLIP-Joint-Detect: End-to-End Joint Training of Object Detectors with Contrastive Vision-Language Supervision
by: Raoufi, Behnam, et al.
Published: (2025)
Visible Iris Area as a Quality Metric for Reliable Iris Recognition Under Pupil Dilation and Eyelid Occlusion
by: Pessaud, Jack, et al.
Published: (2025)
PyCAT4: A Hierarchical Vision Transformer-based Framework for 3D Human Pose Estimation
by: Yang, Zongyou, et al.
Published: (2025)
Motion-Guided Semantic Alignment with Negative Prompts for Zero-Shot Video Action Recognition
by: Wang, Yiming, et al.
Published: (2026)
Mistake Attribution: Fine-Grained Mistake Understanding in Egocentric Videos
by: Li, Yayuan, et al.
Published: (2025)
SelvaBox: A high-resolution dataset for tropical tree crown detection
by: Baudchon, Hugo, et al.
Published: (2025)
DeltaVLM: Interactive Remote Sensing Image Change Analysis via Instruction-guided Difference Perception
by: Deng, Pei, et al.
Published: (2025)
Image-Based Leopard Seal Recognition: Approaches and Challenges in Current Automated Systems
by: Salazar, Jorge Yero, et al.
Published: (2024)
Dense Motion Captioning
by: Xu, Shiyao, et al.
Published: (2025)
TD3Net: A temporal densely connected multi-dilated convolutional network for lipreading
by: Lee, Byung Hoon, et al.
Published: (2025)
CoMatcher: Multi-View Collaborative Feature Matching
by: Zhang, Jintao, et al.
Published: (2025)
MoDE: Mixture of Diffusion Experts for Any Occluded Face Recognition
by: Fan, Qiannan, et al.
Published: (2025)
NeuroGaze-Distill: Brain-informed Distillation and Depression-Inspired Geometric Priors for Robust Facial Emotion Recognition
by: Li, Zilin, et al.
Published: (2025)
Prompt Sensitivity in Vision-Language Grounding: How Small Changes in Wording Affect Object Detection
by: Deka, Dawar Jyoti, et al.
Published: (2026)
TerraSeg: Self-Supervised Ground Segmentation for Any LiDAR
by: Lentsch, Ted, et al.
Published: (2026)
UNION: Unsupervised 3D Object Detection using Object Appearance-based Pseudo-Classes
by: Lentsch, Ted, et al.
Published: (2024)
HY-Himmel Technical Report: Hierarchical Interleaved Multi-stream Motion Encoding for Long Video Understanding
by: Jin, Haopeng, et al.
Published: (2026)
SelvaMask: Segmenting Trees in Tropical Forests and Beyond
by: Duguay, Simon-Olivier, et al.
Published: (2026)
Deep Learning-Based Multi-Object Tracking: A Comprehensive Survey from Foundations to State-of-the-Art
by: Adžemović, Momir
Published: (2025)
DeepShade: Enable Shade Simulation by Text-conditioned Image Generation
by: Da, Longchao, et al.
Published: (2025)
Towards Accurate and Efficient Waste Image Classification: A Hybrid Deep Learning and Machine Learning Approach
by: Nguyen, Ngoc-Bao-Quang, et al.
Published: (2025)
FeedbackSTS-Det: Sparse Frames-Based Spatio-Temporal Semantic Feedback Network for Moving Infrared Small Target Detection
by: Huang, Yian, et al.
Published: (2026)
CG-HOI: Contact-Guided 3D Human-Object Interaction Generation
by: Diller, Christian, et al.
Published: (2023)
Habitat Classification from Ground-Level Imagery Using Deep Neural Networks
by: Shi, Hongrui, et al.
Published: (2025)
Capacity Constraint Analysis Using Object Detection for Smart Manufacturing
by: Ahmad, Hafiz Mughees, et al.
Published: (2024)
SH17: A Dataset for Human Safety and Personal Protective Equipment Detection in Manufacturing Industry
by: Ahmad, Hafiz Mughees, et al.
Published: (2024)
Fairness Without Labels: Pseudo-Balancing for Bias Mitigation in Face Gender Classification
by: Dong, Haohua, et al.
Published: (2025)
Do All Vision Transformers Need Registers? A Cross-Architectural Reassessment
by: Baxevanakis, Spiros, et al.
Published: (2026)
Single-Step Reconstruction-Free Anomaly Detection and Segmentation via Diffusion Models
by: Moradi, Mehrdad, et al.
Published: (2025)
Efficient Temporally-Aware DeepFake Detection using H.264 Motion Vectors
by: Grönquist, Peter, et al.
Published: (2023)
Dynamic Arthroscopic Navigation System for Anterior Cruciate Ligament Reconstruction Based on Multi-level Memory Architecture
by: Wang, Shuo, et al.
Published: (2025)
Distributed Intelligent System Architecture for UAV-Assisted Monitoring of Wind Energy Infrastructure
by: Svystun, Serhii, et al.
Published: (2024)
EgoTraj: Real-World Egocentric Human Trajectory Dataset for Multimodal Prediction
by: Yehia, Ahmad, et al.
Published: (2026)
A Single Image Is All You Need: Zero-Shot Anomaly Localization Without Training Data
by: Moradi, Mehrdad, et al.
Published: (2025)
Shaded Route Planning Using Active Segmentation and Identification of Satellite Images
by: Da, Longchao, et al.
Published: (2024)
Optimal Transport-Guided Source-Free Adaptation for Face Anti-Spoofing
by: Li, Zhuowei, et al.
Published: (2025)
FutureHuman3D: Forecasting Complex Long-Term 3D Human Behavior from Video Observations
by: Diller, Christian, et al.
Published: (2022)
Detecting AI-Generated Videos with Spiking Neural Networks
by: Jang, Minsuk, et al.
Published: (2026)
Contract-Governed Training for Earth Observation: Observed Service Agreement Graphs and Coverage-Accuracy Trade-offs
by: Du, Wenzhang
Published: (2025)
Distance Estimation in Outdoor Driving Environments Using Phase-only Correlation Method with Event Cameras
by: Kobayashi, Masataka, et al.
Published: (2025)