| Main Authors: | Yang, Xinyu, Jiang, Zheheng, Zhou, Feixiang, Zhu, Yihang, Lv, Na, Xing, Nan, Canagarajah, Nishan, Zhou, Huiyu |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.10682 |
Similar Items
SMC-NCA: Semantic-guided Multi-level Contrast for Semi-supervised Temporal Action Segmentation
by: Zhou, Feixiang, et al.
Published: (2023)
Cross-Skeleton Interaction Graph Aggregation Network for Representation Learning of Mouse Social Behaviour
by: Zhou, Feixiang, et al.
Published: (2022)
Towards Adaptive Pseudo-label Learning for Semi-Supervised Temporal Action Localization
by: Zhou, Feixiang, et al.
Published: (2024)
Probabilistic Temporal Masked Attention for Cross-view Online Action Detection
by: Xie, Liping, et al.
Published: (2025)
Progressive Cross-Stream Cooperation in Spatial and Temporal Domain for Action Localization
by: Su, Rui, et al.
Published: (2019)
OnlineTAS: An Online Baseline for Temporal Action Segmentation
by: Zhong, Qing, et al.
Published: (2024)
GigaWorld-Policy: An Efficient Action-Centered World-Action Model
by: Ye, Angen, et al.
Published: (2026)
Modelling Spatio-Temporal Interactions For Compositional Action Recognition
by: Rajendiran, Ramanathan, et al.
Published: (2023)
TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action
by: Cheng, Jen-Hao, et al.
Published: (2025)
Online Temporal Action Localization with Memory-Augmented Transformer
by: Song, Youngkil, et al.
Published: (2024)
F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions
by: Lv, Qi, et al.
Published: (2025)
Recovering Complete Actions for Cross-dataset Skeleton Action Recognition
by: Liu, Hanchao, et al.
Published: (2024)
OZ-TAL: Online Zero-Shot Temporal Action Localization
by: Han, Chaolei, et al.
Published: (2026)
The Role of Video Generation in Enhancing Data-Limited Action Understanding
by: Li, Wei, et al.
Published: (2025)
Interaction-via-Actions: Cattle Interaction Detection with Joint Learning of Action-Interaction Latent Space
by: Nakagawa, Ren, et al.
Published: (2025)
DyFADet: Dynamic Feature Aggregation for Temporal Action Detection
by: Yang, Le, et al.
Published: (2024)
MALT: Multi-scale Action Learning Transformer for Online Action Detection
by: Yang, Zhipeng, et al.
Published: (2024)
RotVLA: Rotational Latent Action for Vision-Language-Action Model
by: Li, Qiwei, et al.
Published: (2026)
Temporal Action Localization with Cross Layer Task Decoupling and Refinement
by: Li, Qiang, et al.
Published: (2024)
ActAvatar: Temporally-Aware Precise Action Control for Talking Avatars
by: Peng, Ziqiao, et al.
Published: (2025)
Repetitive Action Counting with Hybrid Temporal Relation Modeling
by: Li, Kun, et al.
Published: (2024)
HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization
by: Reza, Sakib, et al.
Published: (2024)
ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition
by: Zhou, Jiaming, et al.
Published: (2024)
SkeFi: Cross-Modal Knowledge Transfer for Wireless Skeleton-Based Action Recognition
by: Huang, Shunyu, et al.
Published: (2026)
Probing Fine-Grained Action Understanding and Cross-View Generalization of Foundation Models
by: Ponbagavathi, Thinesh Thiyakesan, et al.
Published: (2024)
Improving Weakly Supervised Temporal Action Localization by Exploiting Multi-resolution Information in Temporal Domain
by: Su, Rui, et al.
Published: (2025)
Aligning Neuronal Coding of Dynamic Visual Scenes with Foundation Vision Models
by: Wu, Rining, et al.
Published: (2024)
TAMT: Temporal-Aware Model Tuning for Cross-Domain Few-Shot Action Recognition
by: Wang, Yilong, et al.
Published: (2024)
The DAWN of World-Action Interactive Models
by: Lu, Hongbo, et al.
Published: (2026)
Insights from Visual Cognition: Understanding Human Action Dynamics with Overall Glance and Refined Gaze Transformer
by: Xing, Bohao, et al.
Published: (2026)
Understanding the Cross-Domain Capabilities of Video-Based Few-Shot Action Recognition Models
by: Markham, Georgia, et al.
Published: (2024)
A Unified Attention U-Net Framework for Cross-Modality Tumor Segmentation in MRI and CT
by: Rai, Nishan, et al.
Published: (2026)
CLIP-AE: CLIP-assisted Cross-view Audio-Visual Enhancement for Unsupervised Temporal Action Localization
by: Xia, Rui, et al.
Published: (2025)
FDDet: Frequency-Decoupling for Boundary Refinement in Temporal Action Detection
by: Zhu, Xinnan, et al.
Published: (2025)
PrevPredMap: Exploring Temporal Modeling with Previous Predictions for Online Vectorized HD Map Construction
by: Peng, Nan, et al.
Published: (2024)
Generative Hierarchical Temporal Transformer for Hand Pose and Action Modeling
by: Wen, Yilin, et al.
Published: (2023)
Cross-view Action Recognition Understanding From Exocentric to Egocentric Perspective
by: Truong, Thanh-Dat, et al.
Published: (2023)
The Solution for Temporal Action Localisation Task of Perception Test Challenge 2024
by: Han, Yinan, et al.
Published: (2024)
Dual DETRs for Multi-Label Temporal Action Detection
by: Zhu, Yuhan, et al.
Published: (2024)
ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model
by: Zhou, Zhongyi, et al.
Published: (2025)