Library Catalog

Bibliographic Details
Main Authors:	Sato, Yuji; Ishii, Yasunori; Yamashita, Takayoshi
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2508.00374

Similar Items

Panoramic Distortion-Aware Tokenization for Person Detection and Localization in Overhead Fisheye Images
by: Wakai, Nobuhiko, et al.
Published: (2025)

Deep Single Image Camera Calibration by Heatmap Regression to Recover Fisheye Images Under Manhattan World Assumption
by: Wakai, Nobuhiko, et al.
Published: (2023)

AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?
by: Zhao, Qi, et al.
Published: (2023)

Vision and Intention Boost Large Language Model in Long-Term Action Anticipation
by: Cao, Congqi, et al.
Published: (2025)

Multimodal Large Models Are Effective Action Anticipators
by: Wang, Binglu, et al.
Published: (2025)

From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation
by: Liu, Xin, et al.
Published: (2024)

Action-Guided Attention for Video Action Anticipation
by: Tai, Tsung-Ming, et al.
Published: (2026)

A Survey on Deep Learning Techniques for Action Anticipation
by: Zhong, Zeyun, et al.
Published: (2023)

Learning Multiple Object States from Actions via Large Language Models
by: Tateno, Masatoshi, et al.
Published: (2024)

Intention Action Anticipation Model with Guide-Feedback Loop Mechanism
by: Ma, Zongnan, et al.
Published: (2024)

Intention-Guided Cognitive Reasoning for Egocentric Long-Term Action Anticipation
by: Chu, Qiaohui, et al.
Published: (2025)

Human Action Anticipation: A Survey
by: Lai, Bolin, et al.
Published: (2024)

Interaction Region Visual Transformer for Egocentric Action Anticipation
by: Roy, Debaditya, et al.
Published: (2022)

Understanding Multimodal Complementarity for Single-Frame Action Anticipation
by: Benavent-Lledo, Manuel, et al.
Published: (2026)

Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models
by: Mittal, Himangi, et al.
Published: (2024)

Semantically Guided Action Anticipation
by: Diko, Anxhelo, et al.
Published: (2024)

Semantically Guided Representation Learning For Action Anticipation
by: Diko, Anxhelo, et al.
Published: (2024)

Interpretable Long-term Action Quality Assessment
by: Dong, Xu, et al.
Published: (2024)

See It Before You Grab It: Deep Learning-based Action Anticipation in Basketball
by: Roy, Arnau Barrera, et al.
Published: (2025)

Multi-level and Multi-modal Action Anticipation
by: Kim, Seulgi, et al.
Published: (2025)

Technical Report for Ego4D Long-Term Action Anticipation Challenge 2025
by: Chu, Qiaohui, et al.
Published: (2025)

Bidirectional Progressive Transformer for Interaction Intention Anticipation
by: Zhang, Zichen, et al.
Published: (2024)

ActFusion: a Unified Diffusion Model for Action Segmentation and Anticipation
by: Gong, Dayoung, et al.
Published: (2024)

Long-term Pre-training for Temporal Action Detection with Transformers
by: Kim, Jihwan, et al.
Published: (2024)

Dense Policy: Bidirectional Autoregressive Learning of Actions
by: Su, Yue, et al.
Published: (2025)

MixANT: Observation-dependent Memory Propagation for Stochastic Dense Action Anticipation
by: Wasim, Syed Talal, et al.
Published: (2025)

Domain Generalization using Action Sequences for Egocentric Action Recognition
by: Nasirimajd, Amirshayan, et al.
Published: (2025)

ActionVOS: Actions as Prompts for Video Object Segmentation
by: Ouyang, Liangyang, et al.
Published: (2024)

Attention-Driven Multimodal Alignment for Long-term Action Quality Assessment
by: Wang, Xin, et al.
Published: (2025)

Action Anticipation at a Glimpse: To What Extent Can Multimodal Cues Replace Video?
by: Benavent-Lledo, Manuel, et al.
Published: (2025)

Manta: Enhancing Mamba for Few-Shot Action Recognition of Long Sub-Sequence
by: Huang, Wenbo, et al.
Published: (2024)

Learning to Reason and Navigate: Parameter Efficient Action Planning with Large Language Models
by: Mohammadi, Bahram, et al.
Published: (2025)

JFAA: Technical Report for the EPIC-KITCHENS-100 Action Anticipation Challenge at EgoVis 2026
by: Chu, Qiaohui, et al.
Published: (2026)

QueryMamba: A Mamba-Based Encoder-Decoder Architecture with a Statistical Verb-Noun Interaction Module for Video Action Forecasting @ Ego4D Long-Term Action Anticipation Challenge 2024
by: Zhong, Zeyun, et al.
Published: (2024)

SWAG: Long-term Surgical Workflow Prediction with Generative-based Anticipation
by: Boels, Maxence, et al.
Published: (2024)

High-Speed Vision Improves Zero-Shot Semantic Understanding of Human Actions
by: Cao, Yongpeng, et al.
Published: (2026)

Gaze-Guided Graph Neural Network for Action Anticipation Conditioned on Intention
by: Ozdel, Suleyman, et al.
Published: (2024)

FreezeVLA: Action-Freezing Attacks against Vision-Language-Action Models
by: Wang, Xin, et al.
Published: (2025)

DriveMA: Driving Vision-Language-Action Models with verifiable Meta-Actions
by: Zheng, Weicheng, et al.
Published: (2026)

HiMemFormer: Hierarchical Memory-Aware Transformer for Multi-Agent Action Anticipation
by: Wang, Zirui, et al.
Published: (2024)