| Main Authors: | Moodley, Perusha, Kaushik, Pramod, Thambi, Dhillu, Trovinger, Mark, Paruchuri, Praveen, Hong, Xia, Rosman, Benjamin |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2407.01310 |
Similar Items
MV-GMN: State Space Model for Multi-View Action Recognition
by: Lin, Yuhui, et al.
Published: (2025)
MALT: Multi-scale Action Learning Transformer for Online Action Detection
by: Yang, Zhipeng, et al.
Published: (2024)
Action Selection Learning for Multi-label Multi-view Action Recognition
by: Nguyen, Trung Thanh, et al.
Published: (2024)
ViTALS: Vision Transformer for Action Localization in Surgical Nephrectomy
by: Chandra, Soumyadeep, et al.
Published: (2024)
S3T-Former: A Purely Spike-Driven State-Space Topology Transformer for Skeleton Action Recognition
by: Zheng, Naichuan, et al.
Published: (2026)
Boundary Discretization and Reliable Classification Network for Temporal Action Detection
by: Fang, Zhenying, et al.
Published: (2023)
MultiFuser: Multimodal Fusion Transformer for Enhanced Driver Action Recognition
by: Wang, Ruoyu, et al.
Published: (2024)
A Real-Time Human Action Recognition Model for Assisted Living
by: Wang, Yixuan, et al.
Published: (2025)
MultiTSF: Transformer-based Sensor Fusion for Human-Centric Multi-view and Multi-modal Action Recognition
by: Nguyen, Trung Thanh, et al.
Published: (2025)
MS-CLR: Multi-Skeleton Contrastive Learning for Human Action Recognition
by: Kiray, Mert, et al.
Published: (2025)
A Universal Action Space for General Behavior Analysis
by: Chang, Hung-Shuo, et al.
Published: (2026)
Multi-Granularity Hand Action Detection
by: Zhe, Ting, et al.
Published: (2023)
MVAFormer: RGB-based Multi-View Spatio-Temporal Action Recognition with Transformer
by: Yamane, Taiga, et al.
Published: (2025)
Hierarchical Multi-Stage Transformer Architecture for Context-Aware Temporal Action Localization
by: Ullah, Hayat, et al.
Published: (2025)
Learning Action Hierarchies via Hybrid Geometric Diffusion
by: Kaushik, Arjun Ramesh, et al.
Published: (2026)
DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for Temporal Action Detection Transformer
by: Kim, Ho-Joong, et al.
Published: (2025)
Multi-level and Multi-modal Action Anticipation
by: Kim, Seulgi, et al.
Published: (2025)
Multi-Stage Boundary-Aware Transformer Network for Action Segmentation in Untrimmed Surgical Videos
by: Shuvo, Rezowan, et al.
Published: (2025)
MGCA-Net: Multi-Grained Category-Aware Network for Open-Vocabulary Temporal Action Localization
by: Fang, Zhenying, et al.
Published: (2025)
MultiModal Action Conditioned Video Generation
by: Li, Yichen, et al.
Published: (2025)
Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition
by: Biswas, Shristi Das, et al.
Published: (2025)
Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies
by: Liang, Zhixuan, et al.
Published: (2025)
MultiSensor-Home: A Wide-area Multi-modal Multi-view Dataset for Action Recognition and Transformer-based Sensor Fusion
by: Nguyen, Trung Thanh, et al.
Published: (2025)
PointACT: Vision-Language-Action Models with Multi-Scale Point-Action Interaction
by: Chen, Shizhe, et al.
Published: (2026)
Friends Across Time: Multi-Scale Action Segmentation Transformer for Surgical Phase Recognition
by: Zhang, Bokai, et al.
Published: (2024)
Multi-Stage Contrastive Regression for Action Quality Assessment
by: An, Qi, et al.
Published: (2024)
Dual DETRs for Multi-Label Temporal Action Detection
by: Zhu, Yuhan, et al.
Published: (2024)
MMAD: Multi-label Micro-Action Detection in Videos
by: Li, Kun, et al.
Published: (2024)
Multi-task Learning For Joint Action and Gesture Recognition
by: Spathis, Konstantinos, et al.
Published: (2025)
Advancing Compressed Video Action Recognition through Progressive Knowledge Distillation
by: Soufleri, Efstathia, et al.
Published: (2024)
GenHowTo: Learning to Generate Actions and State Transformations from Instructional Videos
by: Souček, Tomáš, et al.
Published: (2023)
One-Stage Open-Vocabulary Temporal Action Detection Leveraging Temporal Multi-scale and Action Label Features
by: Nguyen, Trung Thanh, et al.
Published: (2024)
Grounding Actions in Camera Space: Observation-Centric Vision-Language-Action Policy
by: Zhang, Tianyi, et al.
Published: (2025)
SigFormer: Sparse Signal-Guided Transformer for Multi-Modal Human Action Segmentation
by: Liu, Qi, et al.
Published: (2023)
HiMemFormer: Hierarchical Memory-Aware Transformer for Multi-Agent Action Anticipation
by: Wang, Zirui, et al.
Published: (2024)
Interaction-via-Actions: Cattle Interaction Detection with Joint Learning of Action-Interaction Latent Space
by: Nakagawa, Ren, et al.
Published: (2025)
An Effective-Efficient Approach for Dense Multi-Label Action Detection
by: Sardari, Faegheh, et al.
Published: (2024)
Multi-Level LVLM Guidance for Untrimmed Video Action Recognition
by: Peng, Liyang, et al.
Published: (2025)
MAMMA: Markerless & Automatic Multi-Person Motion Action Capture
by: Cuevas-Velasquez, Hanz, et al.
Published: (2025)
ActionParty: Multi-Subject Action Binding in Generative Video Games
by: Pondaven, Alexander, et al.
Published: (2026)