| Main Authors: | Peirone, Simone Alberto, Pistilli, Francesca, Alliegro, Antonio, Averta, Giuseppe |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2403.03037 |
Similar Items
Hier-EgoPack: Hierarchical Egocentric Video Understanding with Diverse Task Perspectives
by: Peirone, Simone Alberto, et al.
Published: (2025)
Learning reusable concepts across different egocentric video understanding tasks
by: Peirone, Simone Alberto, et al.
Published: (2025)
HiERO: understanding the hierarchy of human behavior enhances reasoning on egocentric videos
by: Peirone, Simone Alberto, et al.
Published: (2025)
FORESCENE: FOREcasting human activity via latent SCENE graphs diffusion
by: Alliegro, Antonio, et al.
Published: (2025)
HiERO-StepG @ Ego4D Step Grounding Challenge: hierarchical activity understanding enables zero-shot step grounding
by: Zenotto, Andrea, et al.
Published: (2026)
Egocentric zone-aware action recognition across environments
by: Peirone, Simone Alberto, et al.
Published: (2024)
Domain Generalization using Action Sequences for Egocentric Action Recognition
by: Nasirimajd, Amirshayan, et al.
Published: (2025)
Minerva-Ego: Spatiotemporal Hints for Egocentric Video Understanding
by: Nagrani, Arsha, et al.
Published: (2026)
Ego-VPA: Egocentric Video Understanding with Parameter-efficient Adaptation
by: Wu, Tz-Ying, et al.
Published: (2024)
PEM: Prototype-based Efficient MaskFormer for Image Segmentation
by: Cavagnero, Niccolò, et al.
Published: (2024)
FEEL (Force-Enhanced Egocentric Learning): A Dataset for Physical Action Understanding
by: Dessalene, Eadom, et al.
Published: (2026)
LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos
by: Wang, Ying, et al.
Published: (2023)
Ego4OOD: Rethinking Egocentric Video Domain Generalization via Covariate Shift Scoring
by: Vaseqi, Zahra, et al.
Published: (2026)
Memory Storyboard: Leveraging Temporal Segmentation for Streaming Self-Supervised Learning from Egocentric Videos
by: Yang, Yanlai, et al.
Published: (2025)
EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video
by: Hoque, Ryan, et al.
Published: (2025)
Gradient Similarity Surgery in Multi-Task Deep Learning
by: Borsani, Thomas, et al.
Published: (2025)
$\infty$-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation
by: Santos, Saul, et al.
Published: (2025)
3D-Aware Instance Segmentation and Tracking in Egocentric Videos
by: Bhalgat, Yash, et al.
Published: (2024)
Being-H0.7: A Latent World-Action Model from Egocentric Videos
by: Luo, Hao, et al.
Published: (2026)
Understanding-Enhanced Model Collaboration for Long-Tailed Egocentric Mistake Detection
by: Han, Boyu, et al.
Published: (2026)
ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark
by: Dang, Ronghao, et al.
Published: (2025)
MM-Ego: Towards Building Egocentric Multimodal LLMs for Video QA
by: Ye, Hanrong, et al.
Published: (2024)
What to Do Next? Memorizing skills from Egocentric Instructional Video
by: Bi, Jing, et al.
Published: (2025)
SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution
by: Liang, Zhixuan, et al.
Published: (2023)
Whole-Body Conditioned Egocentric Video Prediction
by: Bai, Yutong, et al.
Published: (2025)
Understanding Domain Generalization: A Noise Robustness Perspective
by: Qiao, Rui, et al.
Published: (2024)
EgoMAGIC- An Egocentric Video Field Medicine Dataset for Training Perception Algorithms
by: VanVoorst, Brian, et al.
Published: (2026)
Online Video Understanding: OVBench and VideoChat-Online
by: Huang, Zhenpeng, et al.
Published: (2024)
The revenge of BiSeNet: Efficient Multi-Task Image Segmentation
by: Rosi, Gabriele, et al.
Published: (2024)
Video Parallel Scaling: Aggregating Diverse Frame Subsets for VideoLLMs
by: Chung, Hyungjin, et al.
Published: (2025)
Understanding Task Transfer in Vision-Language Models
by: Sachdeva, Bhuvan, et al.
Published: (2025)
PEDESTRIAN: An Egocentric Vision Dataset for Obstacle Detection on Pavements
by: Thoma, Marios, et al.
Published: (2025)
EgoSurgery-HTS: A Dataset for Egocentric Hand-Tool Segmentation in Open Surgery Videos
by: Darjana, Nathan, et al.
Published: (2025)
EasyVideoR1: Easier RL for Video Understanding
by: Qin, Chuanyu, et al.
Published: (2026)
Towards Sparse Video Understanding and Reasoning
by: Xu, Chenwei, et al.
Published: (2026)
Agentic Very Long Video Understanding
by: Rege, Aniket, et al.
Published: (2026)
EgoCogNav: Cognition-aware Human Egocentric Navigation
by: Qiu, Zhiwen, et al.
Published: (2025)
Benchmarking Egocentric Multimodal Goal Inference for Assistive Wearable Agents
by: Veerabadran, Vijay, et al.
Published: (2025)
EgoSurgery-Phase: A Dataset of Surgical Phase Recognition from Egocentric Open Surgery Videos
by: Fujii, Ryo, et al.
Published: (2024)
Advancing Egocentric Video Question Answering with Multimodal Large Language Models
by: Patel, Alkesh, et al.
Published: (2025)