:: Library Catalog

Bibliographic Details
Main Authors:	Yang, Xinyu; Jiang, Zheheng; Zhou, Feixiang; Zhu, Yihang; Lv, Na; Xing, Nan; Canagarajah, Nishan; Zhou, Huiyu
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2510.10682

Similar Items

SMC-NCA: Semantic-guided Multi-level Contrast for Semi-supervised Temporal Action Segmentation
by: Zhou, Feixiang, et al.
Published: (2023)

Cross-Skeleton Interaction Graph Aggregation Network for Representation Learning of Mouse Social Behaviour
by: Zhou, Feixiang, et al.
Published: (2022)

Towards Adaptive Pseudo-label Learning for Semi-Supervised Temporal Action Localization
by: Zhou, Feixiang, et al.
Published: (2024)

Probabilistic Temporal Masked Attention for Cross-view Online Action Detection
by: Xie, Liping, et al.
Published: (2025)

Progressive Cross-Stream Cooperation in Spatial and Temporal Domain for Action Localization
by: Su, Rui, et al.
Published: (2019)

OnlineTAS: An Online Baseline for Temporal Action Segmentation
by: Zhong, Qing, et al.
Published: (2024)

GigaWorld-Policy: An Efficient Action-Centered World-Action Model
by: Ye, Angen, et al.
Published: (2026)

Modelling Spatio-Temporal Interactions For Compositional Action Recognition
by: Rajendiran, Ramanathan, et al.
Published: (2023)

TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action
by: Cheng, Jen-Hao, et al.
Published: (2025)

Online Temporal Action Localization with Memory-Augmented Transformer
by: Song, Youngkil, et al.
Published: (2024)

F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions
by: Lv, Qi, et al.
Published: (2025)

Recovering Complete Actions for Cross-dataset Skeleton Action Recognition
by: Liu, Hanchao, et al.
Published: (2024)

OZ-TAL: Online Zero-Shot Temporal Action Localization
by: Han, Chaolei, et al.
Published: (2026)

The Role of Video Generation in Enhancing Data-Limited Action Understanding
by: Li, Wei, et al.
Published: (2025)

Interaction-via-Actions: Cattle Interaction Detection with Joint Learning of Action-Interaction Latent Space
by: Nakagawa, Ren, et al.
Published: (2025)

DyFADet: Dynamic Feature Aggregation for Temporal Action Detection
by: Yang, Le, et al.
Published: (2024)

MALT: Multi-scale Action Learning Transformer for Online Action Detection
by: Yang, Zhipeng, et al.
Published: (2024)

RotVLA: Rotational Latent Action for Vision-Language-Action Model
by: Li, Qiwei, et al.
Published: (2026)

Temporal Action Localization with Cross Layer Task Decoupling and Refinement
by: Li, Qiang, et al.
Published: (2024)

ActAvatar: Temporally-Aware Precise Action Control for Talking Avatars
by: Peng, Ziqiao, et al.
Published: (2025)

Repetitive Action Counting with Hybrid Temporal Relation Modeling
by: Li, Kun, et al.
Published: (2024)

HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization
by: Reza, Sakib, et al.
Published: (2024)

ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition
by: Zhou, Jiaming, et al.
Published: (2024)

SkeFi: Cross-Modal Knowledge Transfer for Wireless Skeleton-Based Action Recognition
by: Huang, Shunyu, et al.
Published: (2026)

Probing Fine-Grained Action Understanding and Cross-View Generalization of Foundation Models
by: Ponbagavathi, Thinesh Thiyakesan, et al.
Published: (2024)

Improving Weakly Supervised Temporal Action Localization by Exploiting Multi-resolution Information in Temporal Domain
by: Su, Rui, et al.
Published: (2025)

Aligning Neuronal Coding of Dynamic Visual Scenes with Foundation Vision Models
by: Wu, Rining, et al.
Published: (2024)

TAMT: Temporal-Aware Model Tuning for Cross-Domain Few-Shot Action Recognition
by: Wang, Yilong, et al.
Published: (2024)

The DAWN of World-Action Interactive Models
by: Lu, Hongbo, et al.
Published: (2026)

Insights from Visual Cognition: Understanding Human Action Dynamics with Overall Glance and Refined Gaze Transformer
by: Xing, Bohao, et al.
Published: (2026)

Understanding the Cross-Domain Capabilities of Video-Based Few-Shot Action Recognition Models
by: Markham, Georgia, et al.
Published: (2024)

A Unified Attention U-Net Framework for Cross-Modality Tumor Segmentation in MRI and CT
by: Rai, Nishan, et al.
Published: (2026)

CLIP-AE: CLIP-assisted Cross-view Audio-Visual Enhancement for Unsupervised Temporal Action Localization
by: Xia, Rui, et al.
Published: (2025)

FDDet: Frequency-Decoupling for Boundary Refinement in Temporal Action Detection
by: Zhu, Xinnan, et al.
Published: (2025)

PrevPredMap: Exploring Temporal Modeling with Previous Predictions for Online Vectorized HD Map Construction
by: Peng, Nan, et al.
Published: (2024)

Generative Hierarchical Temporal Transformer for Hand Pose and Action Modeling
by: Wen, Yilin, et al.
Published: (2023)

Cross-view Action Recognition Understanding From Exocentric to Egocentric Perspective
by: Truong, Thanh-Dat, et al.
Published: (2023)

The Solution for Temporal Action Localisation Task of Perception Test Challenge 2024
by: Han, Yinan, et al.
Published: (2024)

Dual DETRs for Multi-Label Temporal Action Detection
by: Zhu, Yuhan, et al.
Published: (2024)

ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model
by: Zhou, Zhongyi, et al.
Published: (2025)