Library Catalog

Bibliographic Details
Main Authors:	Makiyeh, Fouad; Nguyen, Huy-Dung; Chareyre, Patrick; Hasani, Ramin; Blanchon, Marc; Rus, Daniela
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2503.17153

Similar Items

Optical Flow Matters: an Empirical Comparative Study on Fusing Monocular Extracted Modalities for Better Steering
by: Makiyeh, Fouad, et al.
Published: (2024)

Human Insights Driven Latent Space for Different Driving Perspectives: A Unified Encoder for Efficient Multi-Task Inference
by: Nguyen, Huy-Dung, et al.
Published: (2024)

Exploring Latent Pathways: Enhancing the Interpretability of Autonomous Driving with a Variational Autoencoder
by: Bairouk, Anass, et al.
Published: (2024)

Toward Efficient Visual Gyroscopes: Spherical Moments, Harmonics Filtering, and Masking Techniques for Spherical Camera Applications
by: Du, Yao, et al.
Published: (2024)

Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks
by: Quach, Alex, et al.
Published: (2024)

ReCap: Event-Aware Image Captioning with Article Retrieval and Semantic Gaussian Normalization
by: Nguyen, Thinh-Phuc, et al.
Published: (2025)

Holistic Surgical Phase Recognition with Hierarchical Input Dependent State Space Models
by: Wu, Haoyang, et al.
Published: (2025)

Hybrid Deep Learning-Based for Enhanced Occlusion Segmentation in PICU Patient Monitoring
by: Munoz, Mario Francisco, et al.
Published: (2024)

Enhanced Generative Data Augmentation for Semantic Segmentation via Stronger Guidance
by: Che, Quang-Huy, et al.
Published: (2024)

Automated Label Unification for Multi-Dataset Semantic Segmentation with GNNs
by: Ma, Rong, et al.
Published: (2024)

Semi-supervised 3D Semantic Scene Completion with 2D Vision Foundation Model Guidance
by: Pham, Duc-Hai, et al.
Published: (2024)

Region-Grounded Report Generation for 3D Medical Imaging: A Fine-Grained Dataset and Graph-Enhanced Framework
by: Nguyen, Cong Huy, et al.
Published: (2026)

Deep-Wide Learning Assistance for Insect Pest Classification
by: Nguyen, Toan, et al.
Published: (2024)

IGL-DT: Iterative Global-Local Feature Learning with Dual-Teacher Semantic Segmentation Framework under Limited Annotation Scheme
by: Tran, Dinh Dai Quan, et al.
Published: (2025)

MMAP: A Multi-Magnification and Prototype-Aware Architecture for Predicting Spatial Gene Expression
by: Nguyen, Hai Dang, et al.
Published: (2025)

Med-StepBench: A Hierarchical Reasoning Framework for Evaluating Hallucinations in Medical Vision-Language Models
by: Nguyen, Minh Khoi, et al.
Published: (2026)

Perception-Aware Multimodal Spatial Reasoning from Monocular Images
by: Cheng, Yanchun, et al.
Published: (2026)

Multimodal Contextualized Support for Enhancing Video Retrieval System
by: Nguyen-Le, Quoc-Bao, et al.
Published: (2024)

Generative AI for Vision: A Comprehensive Study of Frameworks and Applications
by: Bousetouane, Fouad
Published: (2025)

Mono3DV: Monocular 3D Object Detection with 3D-Aware Bipartite Matching and Variational Query DeNoising
by: Vu, Kiet Dang, et al.
Published: (2026)

The Quest for Universal Master Key Filters in DS-CNNs
by: Babaiee, Zahra, et al.
Published: (2025)

Learning autonomous driving from aerial imagery
by: Murali, Varun, et al.
Published: (2024)

Generating Out-Of-Distribution Scenarios Using Language Models
by: Aasi, Erfan, et al.
Published: (2024)

MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering
by: Pham, Trong-Thang, et al.
Published: (2026)

Enhancing person re-identification via Uncertainty Feature Fusion Method and Auto-weighted Measure Combination
by: Che, Quang-Huy, et al.
Published: (2024)

A Survey on Vietnamese Document Analysis and Recognition: Challenges and Future Directions
by: Le, Anh, et al.
Published: (2025)

Surface Normal Estimation with Transformers
by: Hu, Barry Shichen, et al.
Published: (2024)

PARDON: Privacy-Aware and Robust Federated Domain Generalization
by: Nguyen, Dung Thuy, et al.
Published: (2024)

FA-Seg: A Fast and Accurate Diffusion-Based Method for Open-Vocabulary Segmentation
by: Che, Huy, et al.
Published: (2025)

AG-ReID.v2: Bridging Aerial and Ground Views for Person Re-identification
by: Nguyen, Huy, et al.
Published: (2024)

Beyond Standard Benchmarks: A Systematic Audit of Vision-Language Model's Robustness to Natural Semantic Variation Across Diverse Tasks
by: Chengyu, Jia, et al.
Published: (2026)

MOOSE: Pay Attention to Temporal Dynamics for Video Understanding via Optical Flows
by: Nguyen, Hong, et al.
Published: (2025)

CAKE: Real-time Action Detection via Motion Distillation and Background-aware Contrastive Learning
by: Hoang, Hieu, et al.
Published: (2026)

GNN-MoE: Context-Aware Patch Routing using GNNs for Parameter-Efficient Domain Generalization
by: Soliman, Mahmoud, et al.
Published: (2025)

VisionGuard: Synergistic Framework for Helmet Violation Detection
by: Nguyen, Lam-Huy, et al.
Published: (2025)

VRAE: Vertical Residual Autoencoder for License Plate Denoising and Deblurring
by: Nguyen, Cuong, et al.
Published: (2025)

Hypergraph-Transformer (HGT) for Interactive Event Prediction in Laparoscopic and Robotic Surgery
by: Yin, Lianhao, et al.
Published: (2024)

Visual Graph Arena: Evaluating Visual Conceptualization of Vision and Multimodal Large Language Models
by: Babaiee, Zahra, et al.
Published: (2025)

The Master Key Filters Hypothesis: Deep Filters Are General
by: Babaiee, Zahra, et al.
Published: (2024)

CovHuSeg: An Enhanced Approach for Kidney Pathology Segmentation
by: Trinh, Huy, et al.
Published: (2024)