:: Library Catalog

Bibliographic Details
Main Author:	John, Shahla
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2507.22421

Similar Items

Temporal Alignment-Free Video Matching for Few-shot Action Recognition
by: Lee, SuBeen, et al.
Published: (2025)

Tracking the Truth: Object-Centric Spatio-Temporal Monitoring for Video Large Language Models
by: Cao, Tri, et al.
Published: (2026)

PolypSegTrack: Unified Foundation Model for Colonoscopy Video Analysis
by: Choudhuri, Anwesa, et al.
Published: (2025)

One-Shot Action Recognition via Multi-Scale Spatial-Temporal Skeleton Matching
by: Yang, Siyuan, et al.
Published: (2023)

EITNet: An IoT-Enhanced Framework for Real-Time Basketball Action Recognition
by: Liu, Jingyu, et al.
Published: (2024)

UniSOT: A Unified Framework for Multi-Modality Single Object Tracking
by: Ma, Yinchao, et al.
Published: (2025)

RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models
by: Lin, Yijing, et al.
Published: (2025)

LSTC-MDA: A Unified Framework for Long-Short Term Temporal Convolution and Mixed Data Augmentation in Skeleton-Based Action Recognition
by: Ding, Feng, et al.
Published: (2025)

Temporal vs. Spatial: Comparing DINOv3 and V-JEPA2 Feature Representations for Video Action Analysis
by: Kodathala, Sai Varun, et al.
Published: (2025)

Temporal and Spatial Feature Fusion Framework for Dynamic Micro Expression Recognition
by: Liu, Feng, et al.
Published: (2025)

UASTrack: A Unified Adaptive Selection Framework with Modality-Customization in Single Object Tracking
by: Wang, He, et al.
Published: (2025)

CaptionFormer: Unified Segmentation, Tracking, and Captioning for Spatio-Temporal Objects
by: Fiastre, Gabriel, et al.
Published: (2025)

Real-Time Manipulation Action Recognition with a Factorized Graph Sequence Encoder
by: Erdogan, Enes, et al.
Published: (2025)

Exploring Explainability in Video Action Recognition
by: Saha, Avinab, et al.
Published: (2024)

RealWonder: Real-Time Physical Action-Conditioned Video Generation
by: Liu, Wei, et al.
Published: (2026)

Fire on Motion: Optimizing Video Pass-bands for Efficient Spiking Action Recognition
by: Ye, Shuhan, et al.
Published: (2026)

Real-Time Human Action Recognition on Embedded Platforms
by: Wang, Ruiqi, et al.
Published: (2024)

Efficient Event-Based Object Detection: A Hybrid Neural Network with Spatial and Temporal Attention
by: Ahmed, Soikat Hasan, et al.
Published: (2024)

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
by: Yuan, Yuqian, et al.
Published: (2024)

Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning
by: Xu, Huilin, et al.
Published: (2025)

A Survey on Backbones for Deep Video Action Recognition
by: Tang, Zixuan, et al.
Published: (2024)

YOLO26: An Analysis of NMS-Free End to End Framework for Real-Time Object Detection
by: Chakrabarty, Sudip
Published: (2026)

STAA: Spatio-Temporal Attention Attribution for Real-Time Interpreting Transformer-based Video Models
by: Wang, Zerui, et al.
Published: (2024)

Towards Efficient Real-Time Video Motion Transfer via Generative Time Series Modeling
by: Haque, Tasmiah, et al.
Published: (2025)

Improving Skeleton-based Action Recognition with Interactive Object Information
by: Wen, Hao, et al.
Published: (2025)

Skeleton-Based Action Recognition with Spatial-Structural Graph Convolution
by: Wang, Jingyao, et al.
Published: (2024)

SV3.3B: A Sports Video Understanding Model for Action Recognition
by: Kodathala, Sai Varun, et al.
Published: (2025)

EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition
by: Abdelkawy, Ahmed, et al.
Published: (2024)

Exploring Ordinal Bias in Action Recognition for Instructional Videos
by: Kim, Joochan, et al.
Published: (2025)

SkateboardAI: The Coolest Video Action Recognition for Skateboarding
by: Chen, Hanxiao
Published: (2023)

Flatten: Video Action Recognition is an Image Classification task
by: Chen, Junlin, et al.
Published: (2024)

Efficient Egocentric Action Recognition with Multimodal Data
by: Calzavara, Marco, et al.
Published: (2025)

7DGS: Unified Spatial-Temporal-Angular Gaussian Splatting
by: Gao, Zhongpai, et al.
Published: (2025)

Unified Spatio-Temporal Token Scoring for Efficient Video VLMs
by: Zhang, Jianrui, et al.
Published: (2026)

Video Diffusion Models Excel at Tracking Similar-Looking Objects Without Supervision
by: Zhang, Chenshuang, et al.
Published: (2025)

Action Recognition in Real-World Ambient Assisted Living Environment
by: Zakka, Vincent Gbouna, et al.
Published: (2025)

Highly Efficient and Unsupervised Framework for Moving Object Detection in Satellite Videos
by: Xiao, C., et al.
Published: (2024)

Rethinking Video Human-Object Interaction: Set Prediction over Time for Unified Detection and Anticipation
by: Luo, Yuanhao, et al.
Published: (2026)

ATSTrack: Enhancing Visual-Language Tracking by Aligning Temporal and Spatial Scales
by: Zhen, Yihao, et al.
Published: (2025)

Collaborative Learning for 3D Hand-Object Reconstruction and Compositional Action Recognition from Egocentric RGB Videos Using Superquadrics
by: Tse, Tze Ho Elden, et al.
Published: (2025)