:: Library Catalog

Bibliographic Details
Main Authors:	Alaa, Toqa; Mongy, Ahmad; Bakr, Assem; Diab, Mariam; Gomaa, Walid
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2410.04449

Similar Items

Automated Detection of Defects on Metal Surfaces using Vision Transformers
by: Alaa, Toqa, et al.
Published: (2024)

Interpretable Aneurysm Classification via 3D Concept Bottleneck Models: Integrating Morphological and Hemodynamic Clinical Features
by: Khaled, Toqa, et al.
Published: (2026)

Prompts to Summaries: Zero-Shot Language-Guided Video Summarization with Large Language and Video Models
by: Barbara, Mario, et al.
Published: (2025)

Defense That Attacks: How Robust Models Become Better Attackers
by: Awad, Mohamed, et al.
Published: (2025)

Precision Aquaculture: An Integrated Computer Vision and IoT Approach for Optimized Tilapia Feeding
by: Hossam, Rania, et al.
Published: (2024)

DroneVis: Versatile Computer Vision Library for Drones
by: Heakl, Ahmed, et al.
Published: (2024)

Invizo: Arabic Handwritten Document Optical Character Recognition Solution
by: Waly, Alhossien, et al.
Published: (2025)

Self-Supervised Contrastive Learning for Videos using Differentiable Local Alignment
by: Oei, Keyne, et al.
Published: (2024)

Reimagining Reality: A Comprehensive Survey of Video Inpainting Techniques
by: Gowda, Shreyank N, et al.
Published: (2024)

Exploring the Role of Convolutional Neural Networks (CNN) in Dental Radiography Segmentation: A Comprehensive Systematic Literature Review
by: Brahmi, Walid, et al.
Published: (2024)

Face-GPS: A Comprehensive Technique for Quantifying Facial Muscle Dynamics in Videos
by: Kim, Juni, et al.
Published: (2024)

Video Summarization with Large Language Models
by: Lee, Min Jung, et al.
Published: (2025)

Mapping Historic Urban Footprints in France: Balancing Quality, Scalability and AI Techniques
by: Rabehi, Walid, et al.
Published: (2025)

REVEAL: Relation-based Video Representation Learning for Video-Question-Answering
by: Chaybouti, Sofian, et al.
Published: (2025)

Modality-Aware Feature Matching: A Comprehensive Review of Single- and Cross-Modality Techniques
by: Liu, Weide, et al.
Published: (2025)

VideoXum: Cross-modal Visual and Textural Summarization of Videos
by: Lin, Jingyang, et al.
Published: (2023)

PhyEduVideo: A Benchmark for Evaluating Text-to-Video Models for Physics Education
by: M, Megha Mariam K., et al.
Published: (2026)

CSTA: CNN-based Spatiotemporal Attention for Video Summarization
by: Son, Jaewon, et al.
Published: (2024)

Video Summarization using Denoising Diffusion Probabilistic Model
by: Shang, Zirui, et al.
Published: (2024)

Language-Guided Graph Representation Learning for Video Summarization
by: Li, Wenrui, et al.
Published: (2025)

Spiking Variational Graph Representation Inference for Video Summarization
by: Li, Wenrui, et al.
Published: (2025)

Uncertainty-Aware and Decoder-Aligned Learning for Video Summarization
by: Tariq, Omer, et al.
Published: (2026)

Personalized Video Summarization by Multimodal Video Understanding
by: Chen, Brian, et al.
Published: (2024)

Anchored Video Generation: Decoupling Scene Construction and Temporal Synthesis in Text-to-Video Diffusion Models
by: Hassan, Mariam, et al.
Published: (2025)

Segment Anything for Video: A Comprehensive Review of Video Object Segmentation and Tracking from Past to Future
by: Xu, Guoping, et al.
Published: (2025)

A Comprehensive Review of Techniques, Algorithms, Advancements, Challenges, and Clinical Applications of Multi-modal Medical Image Fusion for Improved Diagnosis
by: Zubair, Muhammad, et al.
Published: (2025)

Large Model based Sequential Keyframe Extraction for Video Summarization
by: Tan, Kailong, et al.
Published: (2024)

Scaling Up Video Summarization Pretraining with Large Language Models
by: Argaw, Dawit Mureja, et al.
Published: (2024)

Scene Summarization: Clustering Scene Videos into Spatially Diverse Frames
by: Chen, Chao, et al.
Published: (2023)

VideoSAGE: Video Summarization with Graph Representation Learning
by: Chaves, Jose M. Rojas, et al.
Published: (2024)

Enhancing Video Summarization with Context Awareness
by: Huynh-Lam, Hai-Dang, et al.
Published: (2024)

Efficient Masked Face Recognition Method during the COVID-19 Pandemic
by: Hariri, Walid
Published: (2021)

Proprio: Latent Self-Scoring and Inference-Time Refinement for Physically Plausible Video Generation
by: Hassan, Mariam, et al.
Published: (2026)

Your Interest, Your Summaries: Query-Focused Long Video Summarization
by: Patel, Nirav, et al.
Published: (2024)

SD-MVSum: Script-Driven Multimodal Video Summarization Method and Datasets
by: Mylonas, Manolis, et al.
Published: (2025)

MF2Summ: Multimodal Fusion for Video Summarization with Temporal Alignment
by: Wang, Shuo, et al.
Published: (2025)

From Generation to Generalization: Emergent Few-Shot Learning in Video Diffusion Models
by: Acuaviva, Pablo, et al.
Published: (2025)

A Human-Annotated Video Dataset for Training and Evaluation of 360-Degree Video Summarization Methods
by: Kontostathis, Ioannis, et al.
Published: (2024)

Multimodal Abstractive Summarization of Instructional Videos with Vision-Language Models
by: Nazir, Maham, et al.
Published: (2026)

Comparing Learning Paradigms for Egocentric Video Summarization
by: Wen, Daniel
Published: (2025)