| Main Authors: | Wan, Wenlong, Zheng, Weiying, Xiang, Tianyi, Li, Guiqing, He, Shengfeng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2506.13320 |
Similar Items
EITNet: An IoT-Enhanced Framework for Real-Time Basketball Action Recognition
by: Liu, Jingyu, et al.
Published: (2024)
Box2Flow: Instance-based Action Flow Graphs from Videos
by: Li, Jiatong, et al.
Published: (2024)
Unfolding 3D Gaussian Splatting via Iterative Gaussian Synopsis
by: Lu, Yuqin, et al.
Published: (2026)
EaqVLA: Encoding-aligned Quantization for Vision-Language-Action Models
by: Jiang, Feng, et al.
Published: (2025)
Spanning Training Progress: Temporal Dual-Depth Scoring (TDDS) for Enhanced Dataset Pruning
by: Zhang, Xin, et al.
Published: (2023)
Taylor Videos for Action Recognition
by: Wang, Lei, et al.
Published: (2024)
HFGCN:Hypergraph Fusion Graph Convolutional Networks for Skeleton-Based Action Recognition
by: Dong, Pengcheng, et al.
Published: (2025)
RALACs: Action Recognition in Autonomous Vehicles using Interaction Encoding and Optical Flow
by: Zhou, Eddy, et al.
Published: (2022)
Spatio-Temporal LLM: Reasoning about Environments and Actions
by: Zheng, Haozhen, et al.
Published: (2025)
Multi-State-Action Tokenisation in Decision Transformers for Multi-Discrete Action Spaces
by: Moodley, Perusha, et al.
Published: (2024)
AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors
by: Wang, Yucen, et al.
Published: (2024)
Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments
by: Guruprasad, Pranav, et al.
Published: (2025)
NinA: Normalizing Flows in Action. Training VLA Models with Normalizing Flows
by: Tarasov, Denis, et al.
Published: (2025)
Real-Time Human Action Recognition on Embedded Platforms
by: Wang, Ruiqi, et al.
Published: (2024)
Selective, Interpretable, and Motion Consistent Privacy Attribute Obfuscation for Action Recognition
by: Ilic, Filip, et al.
Published: (2024)
FlowHijack: A Dynamics-Aware Backdoor Attack on Flow-Matching Vision-Language-Action Models
by: An, Xinyuan, et al.
Published: (2026)
Video RWKV:Video Action Recognition Based RWKV
by: Yin, Zhuowen, et al.
Published: (2024)
World Action Models are Zero-shot Policies
by: Ye, Seonghyeon, et al.
Published: (2026)
Action-slot: Visual Action-centric Representations for Multi-label Atomic Activity Recognition in Traffic Scenes
by: Kung, Chi-Hsi, et al.
Published: (2023)
SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders
by: Li, Sheng-Wei, et al.
Published: (2024)
Detecting Informative Channels: ActionFormer
by: Zhao, Kunpeng, et al.
Published: (2025)
Human Action Anticipation: A Survey
by: Lai, Bolin, et al.
Published: (2024)
Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies
by: Liang, Zhixuan, et al.
Published: (2025)
Improving Vision-Language-Action Model with Online Reinforcement Learning
by: Guo, Yanjiang, et al.
Published: (2025)
About Time: Advances, Challenges, and Outlooks of Action Understanding
by: Stergiou, Alexandros, et al.
Published: (2024)
Test-Time Training for Visual Foresight Vision-Language-Action Models
by: Park, Sangwu, et al.
Published: (2026)
Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition
by: Li, Kun, et al.
Published: (2024)
Latent Wavelet Diffusion For Ultra-High-Resolution Image Synthesis
by: Sigillo, Luigi, et al.
Published: (2025)
FedVLMBench: Benchmarking Federated Fine-Tuning of Vision-Language Models
by: Zheng, Weiying, et al.
Published: (2025)
Action-Agnostic Point-Level Supervision for Temporal Action Detection
by: Yoshida, Shuhei M., et al.
Published: (2024)
Multi-level and Multi-modal Action Anticipation
by: Kim, Seulgi, et al.
Published: (2025)
When Spatial meets Temporal in Action Recognition
by: Chen, Huilin, et al.
Published: (2024)
Classification of Tennis Actions Using Deep Learning
by: Hovad, Emil, et al.
Published: (2024)
Motus: A Unified Latent Action World Model
by: Bi, Hongzhe, et al.
Published: (2025)
Modular Retrieval-Augmented Generalization for Human Action Recognition
by: Liao, Peng, et al.
Published: (2026)
Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation
by: Liu, Haofeng, et al.
Published: (2024)
ActionParty: Multi-Subject Action Binding in Generative Video Games
by: Pondaven, Alexander, et al.
Published: (2026)
Universal Pose Pretraining for Generalizable Vision-Language-Action Policies
by: Lin, Haitao, et al.
Published: (2026)
Dense Policy: Bidirectional Autoregressive Learning of Actions
by: Su, Yue, et al.
Published: (2025)
Group Relative Augmentation for Data Efficient Action Detection
by: Patel, Deep Anil, et al.
Published: (2025)