:: Library Catalog

Bibliographic Details
Main Authors:	Ruta, Dymitr; Mio, Corrado; Damiani, Ernesto
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence; Machine Learning; MSC: 68U05, 68T45, 92C80, 28A80; ACM: I.3.5; I.2.10; I.4.8; I.5.1; J.2
Online Access:	https://arxiv.org/abs/2411.08024

Similar Items

Predictive Modeling of Maritime Radar Data Using Transformer Architecture
by: Qesaraku, Bjorna, et al.
Published: (2025)

Rethinking Visual Intelligence: Insights from Video Pretraining
by: Acuaviva, Pablo, et al.
Published: (2025)

Hierarchical Point-Patch Fusion with Adaptive Patch Codebook for 3D Shape Anomaly Detection
by: Kang, Xueyang, et al.
Published: (2026)

A deep learning approach to track eye movements based on events
by: Seth, Chirag, et al.
Published: (2025)

Scene Detection Policies and Keyframe Extraction Strategies for Large-Scale Video Analysis
by: Korolkov, Vasilii
Published: (2025)

TerraSeg: Self-Supervised Ground Segmentation for Any LiDAR
by: Lentsch, Ted, et al.
Published: (2026)

UNION: Unsupervised 3D Object Detection using Object Appearance-based Pseudo-Classes
by: Lentsch, Ted, et al.
Published: (2024)

Learning Association via Track-Detection Matching for Multi-Object Tracking
by: Adžemović, Momir
Published: (2025)

Salient Concept-Aware Generative Data Augmentation
by: Zhao, Tianchen, et al.
Published: (2025)

Tricks and Plug-ins for Gradient Boosting in Image Classification
by: Fang, Biyi, et al.
Published: (2025)

See What You Need: Query-Aware Visual Intelligence through Reasoning-Perception Loops
by: Dong, Zixuan, et al.
Published: (2025)

WildfireVLM: AI-powered Analysis for Early Wildfire Detection and Risk Assessment Using Satellite Imagery
by: Ayanzadeh, Aydin, et al.
Published: (2026)

IMKD: Intensity-Aware Multi-Level Knowledge Distillation for Camera-Radar Fusion
by: Mishra, Shashank, et al.
Published: (2025)

Objaverse++: Curated 3D Object Dataset with Quality Annotations
by: Lin, Chendi, et al.
Published: (2025)

Wafer Map Defect Classification Using Autoencoder-Based Data Augmentation and Convolutional Neural Network
by: Bao, Yin-Yin, et al.
Published: (2024)

VideoMind: An Omni-Modal Video Dataset with Intent Grounding for Deep-Cognitive Video Understanding
by: Yang, Baoyao, et al.
Published: (2025)

CellARC: Measuring Intelligence with Cellular Automata
by: Lžičař, Miroslav
Published: (2025)

Deep Learning-Based Multi-Object Tracking: A Comprehensive Survey from Foundations to State-of-the-Art
by: Adžemović, Momir
Published: (2025)

MVTamperBench: Evaluating Robustness of Vision-Language Models
by: Agarwal, Amit, et al.
Published: (2024)

Polarization-Based Eye Tracking with Personalized Siamese Architectures
by: Kalkanli, Beyza, et al.
Published: (2026)

Revisiting Energy-Based Model for Out-of-Distribution Detection
by: Wu, Yifan, et al.
Published: (2024)

Classifier Calibration at Scale: An Empirical Study of Model-Agnostic Post-Hoc Methods
by: Manokhin, Valery, et al.
Published: (2026)

DeepShade: Enable Shade Simulation by Text-conditioned Image Generation
by: Da, Longchao, et al.
Published: (2025)

TaylorShift: Shifting the Complexity of Self-Attention from Squared to Linear (and Back) using Taylor-Softmax
by: Nauen, Tobias Christian, et al.
Published: (2024)

Skullptor: High Fidelity 3D Head Reconstruction in Seconds with Multi-View Normal Prediction
by: Artru, Noé, et al.
Published: (2026)

The Geometry of Cortical Computation: Manifold Disentanglement and Predictive Dynamics in VCNet
by: Hill, Brennen A., et al.
Published: (2025)

Smelly, dense, and spreaded: The Object Detection for Olfactory References (ODOR) dataset
by: Zinnen, Mathias, et al.
Published: (2025)

Predicting When to Trust Vision-Language Models for Spatial Reasoning
by: Imran, Muhammad, et al.
Published: (2026)

Semantic2Graph: Graph-based Multi-modal Feature Fusion for Action Segmentation in Videos
by: Zhang, Junbin, et al.
Published: (2022)

VLM-NCD:Novel Class Discovery with Vision-Based Large Language Models
by: Su, Yuetong, et al.
Published: (2025)

Automatic Detection of Intro and Credits in Video using CLIP and Multihead Attention
by: Korolkov, Vasilii, et al.
Published: (2025)

Extraction Of Cumulative Blobs From Dynamic Gestures
by: Naulakha, Rishabh, et al.
Published: (2025)

Dense Video Understanding with Gated Residual Tokenization
by: Zhang, Haichao, et al.
Published: (2025)

Motion Attribution for Video Generation
by: Wu, Xindi, et al.
Published: (2026)

Heart Failure Prediction using Modal Decomposition and Masked Autoencoders for Scarce Echocardiography Databases
by: Bell-Navas, Andrés, et al.
Published: (2025)

FedWCM: Unleashing the Potential of Momentum-based Federated Learning in Long-Tailed Scenarios
by: Li, Tianle, et al.
Published: (2025)

YOLOv10 with Kolmogorov-Arnold networks and vision-language foundation models for interpretable object detection and trustworthy multimodal AI in computer vision perception
by: Impraimakis, Marios, et al.
Published: (2026)

Learning Sign Language Representation using CNN LSTM, 3DCNN, CNN RNN LSTM and CCN TD
by: Louison, Nikita, et al.
Published: (2024)

TACIT Benchmark: A Programmatic Visual Reasoning Benchmark for Generative and Discriminative Models
by: Medeiros, Daniel Nobrega
Published: (2026)

Exploiting Precision Mapping and Component-Specific Feature Enhancement for Breast Cancer Segmentation and Identification
by: V, Pandiyaraju, et al.
Published: (2024)