:: Library Catalog

Bibliographic Details
Main Authors:	Peirone, Simone Alberto, Pistilli, Francesca, Alliegro, Antonio, Tommasi, Tatiana, Averta, Giuseppe
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2502.02487

Similar Items

A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives
by: Peirone, Simone Alberto, et al.
Published: (2024)

Learning reusable concepts across different egocentric video understanding tasks
by: Peirone, Simone Alberto, et al.
Published: (2025)

FORESCENE: FOREcasting human activity via latent SCENE graphs diffusion
by: Alliegro, Antonio, et al.
Published: (2025)

HiERO-StepG @ Ego4D Step Grounding Challenge: hierarchical activity understanding enables zero-shot step grounding
by: Zenotto, Andrea, et al.
Published: (2026)

HiERO: understanding the hierarchy of human behavior enhances reasoning on egocentric videos
by: Peirone, Simone Alberto, et al.
Published: (2025)

Egocentric zone-aware action recognition across environments
by: Peirone, Simone Alberto, et al.
Published: (2024)

Domain Generalization using Action Sequences for Egocentric Action Recognition
by: Nasirimajd, Amirshayan, et al.
Published: (2025)

PEM: Prototype-based Efficient MaskFormer for Image Segmentation
by: Cavagnero, Niccolò, et al.
Published: (2024)

EgoSound: Benchmarking Sound Understanding in Egocentric Videos
by: Zhu, Bingwen, et al.
Published: (2026)

EgoGraph: Temporal Knowledge Graph for Egocentric Video Understanding
by: Sun, Shitong, et al.
Published: (2026)

Transient Fault Tolerant Semantic Segmentation for Autonomous Driving
by: Iurada, Leonardo, et al.
Published: (2024)

Minerva-Ego: Spatiotemporal Hints for Egocentric Video Understanding
by: Nagrani, Arsha, et al.
Published: (2026)

EgoVLM: Policy Optimization for Egocentric Video Understanding
by: Vinod, Ashwin, et al.
Published: (2025)

EgoInteract: Synthetic Egocentric Videos Generation for Interaction Understanding and Anticipation
by: Leonardi, Rosario, et al.
Published: (2026)

Ego-VPA: Egocentric Video Understanding with Parameter-efficient Adaptation
by: Wu, Tz-Ying, et al.
Published: (2024)

EgoAVU: Egocentric Audio-Visual Understanding
by: Seth, Ashish, et al.
Published: (2026)

Exo2Ego: Exocentric Knowledge Guided MLLM for Egocentric Video Understanding
by: Zhang, Haoyu, et al.
Published: (2025)

EgoPoints: Advancing Point Tracking for Egocentric Videos
by: Darkhalil, Ahmad, et al.
Published: (2024)

Omnia de EgoTempo: Benchmarking Temporal Understanding of Multi-Modal LLMs in Egocentric Videos
by: Plizzari, Chiara, et al.
Published: (2025)

EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting
by: Zhang, Daiwei, et al.
Published: (2024)

EgoTV: Egocentric Task Verification from Natural Language Task Descriptions
by: Hazra, Rishi, et al.
Published: (2023)

An Outlook into the Future of Egocentric Vision
by: Plizzari, Chiara, et al.
Published: (2023)

EgoVITA: Learning to Plan and Verify for Egocentric Video Reasoning
by: Kulkarni, Yogesh, et al.
Published: (2025)

EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation
by: Pei, Baoqi, et al.
Published: (2024)

EgoLCD: Egocentric Video Generation with Long Context Diffusion
by: Zhang, Liuzhou, et al.
Published: (2025)

EgoTL: Egocentric Think-Aloud Chains for Long-Horizon Tasks
by: Liu, Lulin, et al.
Published: (2026)

EgoX: Egocentric Video Generation from a Single Exocentric Video
by: Kang, Taewoong, et al.
Published: (2025)

A Second-Order Perspective on Pruning at Initialization and Knowledge Transfer
by: Iurada, Leonardo, et al.
Published: (2025)

VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI
by: Cheng, Sijie, et al.
Published: (2024)

EgoMotion: Hierarchical Reasoning and Diffusion for Egocentric Vision-Language Motion Generation
by: Hou, Ruibing, et al.
Published: (2026)

EgoMimic: Scaling Imitation Learning via Egocentric Video
by: Kareer, Simar, et al.
Published: (2024)

The revenge of BiSeNet: Efficient Multi-Task Image Segmentation
by: Rosi, Gabriele, et al.
Published: (2024)

EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval
by: Hummel, Thomas, et al.
Published: (2024)

Ego-Grounding for Personalized Question-Answering in Egocentric Videos
by: Xiao, Junbin, et al.
Published: (2026)

EgoIntent: An Egocentric Step-level Benchmark for Understanding What, Why, and Next
by: Pan, Ye, et al.
Published: (2026)

SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation
by: Cuttano, Claudia, et al.
Published: (2024)

Estimating Ego-Body Pose from Doubly Sparse Egocentric Video Data
by: Chi, Seunggeun, et al.
Published: (2024)

EgoLoc: A Generalizable Solution for Temporal Interaction Localization in Egocentric Videos
by: Ma, Junyi, et al.
Published: (2025)

A Modern Take on Visual Relationship Reasoning for Grasp Planning
by: Rabino, Paolo, et al.
Published: (2024)

SViTT-Ego: A Sparse Video-Text Transformer for Egocentric Video
by: Valdez, Hector A., et al.
Published: (2024)