:: Library Catalog

Bibliographic Details
Main Authors:	Wan, Wenlong; Zheng, Weiying; Xiang, Tianyi; Li, Guiqing; He, Shengfeng
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2506.13320

Similar Items

EITNet: An IoT-Enhanced Framework for Real-Time Basketball Action Recognition
by: Liu, Jingyu, et al.
Published: (2024)

Box2Flow: Instance-based Action Flow Graphs from Videos
by: Li, Jiatong, et al.
Published: (2024)

Unfolding 3D Gaussian Splatting via Iterative Gaussian Synopsis
by: Lu, Yuqin, et al.
Published: (2026)

EaqVLA: Encoding-aligned Quantization for Vision-Language-Action Models
by: Jiang, Feng, et al.
Published: (2025)

Spanning Training Progress: Temporal Dual-Depth Scoring (TDDS) for Enhanced Dataset Pruning
by: Zhang, Xin, et al.
Published: (2023)

Taylor Videos for Action Recognition
by: Wang, Lei, et al.
Published: (2024)

HFGCN: Hypergraph Fusion Graph Convolutional Networks for Skeleton-Based Action Recognition
by: Dong, Pengcheng, et al.
Published: (2025)

RALACs: Action Recognition in Autonomous Vehicles using Interaction Encoding and Optical Flow
by: Zhou, Eddy, et al.
Published: (2022)

Spatio-Temporal LLM: Reasoning about Environments and Actions
by: Zheng, Haozhen, et al.
Published: (2025)

Multi-State-Action Tokenisation in Decision Transformers for Multi-Discrete Action Spaces
by: Moodley, Perusha, et al.
Published: (2024)

AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors
by: Wang, Yucen, et al.
Published: (2024)

Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments
by: Guruprasad, Pranav, et al.
Published: (2025)

NinA: Normalizing Flows in Action. Training VLA Models with Normalizing Flows
by: Tarasov, Denis, et al.
Published: (2025)

Real-Time Human Action Recognition on Embedded Platforms
by: Wang, Ruiqi, et al.
Published: (2024)

Selective, Interpretable, and Motion Consistent Privacy Attribute Obfuscation for Action Recognition
by: Ilic, Filip, et al.
Published: (2024)

FlowHijack: A Dynamics-Aware Backdoor Attack on Flow-Matching Vision-Language-Action Models
by: An, Xinyuan, et al.
Published: (2026)

Video RWKV: Video Action Recognition Based RWKV
by: Yin, Zhuowen, et al.
Published: (2024)

World Action Models are Zero-shot Policies
by: Ye, Seonghyeon, et al.
Published: (2026)

Action-slot: Visual Action-centric Representations for Multi-label Atomic Activity Recognition in Traffic Scenes
by: Kung, Chi-Hsi, et al.
Published: (2023)

SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders
by: Li, Sheng-Wei, et al.
Published: (2024)

Detecting Informative Channels: ActionFormer
by: Zhao, Kunpeng, et al.
Published: (2025)

Human Action Anticipation: A Survey
by: Lai, Bolin, et al.
Published: (2024)

Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies
by: Liang, Zhixuan, et al.
Published: (2025)

Improving Vision-Language-Action Model with Online Reinforcement Learning
by: Guo, Yanjiang, et al.
Published: (2025)

About Time: Advances, Challenges, and Outlooks of Action Understanding
by: Stergiou, Alexandros, et al.
Published: (2024)

Test-Time Training for Visual Foresight Vision-Language-Action Models
by: Park, Sangwu, et al.
Published: (2026)

Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition
by: Li, Kun, et al.
Published: (2024)

Latent Wavelet Diffusion For Ultra-High-Resolution Image Synthesis
by: Sigillo, Luigi, et al.
Published: (2025)

FedVLMBench: Benchmarking Federated Fine-Tuning of Vision-Language Models
by: Zheng, Weiying, et al.
Published: (2025)

Action-Agnostic Point-Level Supervision for Temporal Action Detection
by: Yoshida, Shuhei M., et al.
Published: (2024)

Multi-level and Multi-modal Action Anticipation
by: Kim, Seulgi, et al.
Published: (2025)

When Spatial meets Temporal in Action Recognition
by: Chen, Huilin, et al.
Published: (2024)

Classification of Tennis Actions Using Deep Learning
by: Hovad, Emil, et al.
Published: (2024)

Motus: A Unified Latent Action World Model
by: Bi, Hongzhe, et al.
Published: (2025)

Modular Retrieval-Augmented Generalization for Human Action Recognition
by: Liao, Peng, et al.
Published: (2026)

Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation
by: Liu, Haofeng, et al.
Published: (2024)

ActionParty: Multi-Subject Action Binding in Generative Video Games
by: Pondaven, Alexander, et al.
Published: (2026)

Universal Pose Pretraining for Generalizable Vision-Language-Action Policies
by: Lin, Haitao, et al.
Published: (2026)

Dense Policy: Bidirectional Autoregressive Learning of Actions
by: Su, Yue, et al.
Published: (2025)

Group Relative Augmentation for Data Efficient Action Detection
by: Patel, Deep Anil, et al.
Published: (2025)