| Main Authors: | Ruta, Dymitr; Mio, Corrado; Damiani, Ernesto |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2411.08024 |
Similar Items
Predictive Modeling of Maritime Radar Data Using Transformer Architecture
by: Qesaraku, Bjorna, et al.
Published: (2025)
Rethinking Visual Intelligence: Insights from Video Pretraining
by: Acuaviva, Pablo, et al.
Published: (2025)
Hierarchical Point-Patch Fusion with Adaptive Patch Codebook for 3D Shape Anomaly Detection
by: Kang, Xueyang, et al.
Published: (2026)
A deep learning approach to track eye movements based on events
by: Seth, Chirag, et al.
Published: (2025)
Scene Detection Policies and Keyframe Extraction Strategies for Large-Scale Video Analysis
by: Korolkov, Vasilii
Published: (2025)
TerraSeg: Self-Supervised Ground Segmentation for Any LiDAR
by: Lentsch, Ted, et al.
Published: (2026)
UNION: Unsupervised 3D Object Detection using Object Appearance-based Pseudo-Classes
by: Lentsch, Ted, et al.
Published: (2024)
Learning Association via Track-Detection Matching for Multi-Object Tracking
by: Adžemović, Momir
Published: (2025)
Salient Concept-Aware Generative Data Augmentation
by: Zhao, Tianchen, et al.
Published: (2025)
Tricks and Plug-ins for Gradient Boosting in Image Classification
by: Fang, Biyi, et al.
Published: (2025)
See What You Need: Query-Aware Visual Intelligence through Reasoning-Perception Loops
by: Dong, Zixuan, et al.
Published: (2025)
WildfireVLM: AI-powered Analysis for Early Wildfire Detection and Risk Assessment Using Satellite Imagery
by: Ayanzadeh, Aydin, et al.
Published: (2026)
IMKD: Intensity-Aware Multi-Level Knowledge Distillation for Camera-Radar Fusion
by: Mishra, Shashank, et al.
Published: (2025)
Objaverse++: Curated 3D Object Dataset with Quality Annotations
by: Lin, Chendi, et al.
Published: (2025)
Wafer Map Defect Classification Using Autoencoder-Based Data Augmentation and Convolutional Neural Network
by: Bao, Yin-Yin, et al.
Published: (2024)
VideoMind: An Omni-Modal Video Dataset with Intent Grounding for Deep-Cognitive Video Understanding
by: Yang, Baoyao, et al.
Published: (2025)
CellARC: Measuring Intelligence with Cellular Automata
by: Lžičař, Miroslav
Published: (2025)
Deep Learning-Based Multi-Object Tracking: A Comprehensive Survey from Foundations to State-of-the-Art
by: Adžemović, Momir
Published: (2025)
MVTamperBench: Evaluating Robustness of Vision-Language Models
by: Agarwal, Amit, et al.
Published: (2024)
Polarization-Based Eye Tracking with Personalized Siamese Architectures
by: Kalkanli, Beyza, et al.
Published: (2026)
Revisiting Energy-Based Model for Out-of-Distribution Detection
by: Wu, Yifan, et al.
Published: (2024)
Classifier Calibration at Scale: An Empirical Study of Model-Agnostic Post-Hoc Methods
by: Manokhin, Valery, et al.
Published: (2026)
DeepShade: Enable Shade Simulation by Text-conditioned Image Generation
by: Da, Longchao, et al.
Published: (2025)
TaylorShift: Shifting the Complexity of Self-Attention from Squared to Linear (and Back) using Taylor-Softmax
by: Nauen, Tobias Christian, et al.
Published: (2024)
Skullptor: High Fidelity 3D Head Reconstruction in Seconds with Multi-View Normal Prediction
by: Artru, Noé, et al.
Published: (2026)
The Geometry of Cortical Computation: Manifold Disentanglement and Predictive Dynamics in VCNet
by: Hill, Brennen A., et al.
Published: (2025)
Smelly, dense, and spreaded: The Object Detection for Olfactory References (ODOR) dataset
by: Zinnen, Mathias, et al.
Published: (2025)
Predicting When to Trust Vision-Language Models for Spatial Reasoning
by: Imran, Muhammad, et al.
Published: (2026)
Semantic2Graph: Graph-based Multi-modal Feature Fusion for Action Segmentation in Videos
by: Zhang, Junbin, et al.
Published: (2022)
VLM-NCD:Novel Class Discovery with Vision-Based Large Language Models
by: Su, Yuetong, et al.
Published: (2025)
Automatic Detection of Intro and Credits in Video using CLIP and Multihead Attention
by: Korolkov, Vasilii, et al.
Published: (2025)
Extraction Of Cumulative Blobs From Dynamic Gestures
by: Naulakha, Rishabh, et al.
Published: (2025)
Dense Video Understanding with Gated Residual Tokenization
by: Zhang, Haichao, et al.
Published: (2025)
Motion Attribution for Video Generation
by: Wu, Xindi, et al.
Published: (2026)
Heart Failure Prediction using Modal Decomposition and Masked Autoencoders for Scarce Echocardiography Databases
by: Bell-Navas, Andrés, et al.
Published: (2025)
FedWCM: Unleashing the Potential of Momentum-based Federated Learning in Long-Tailed Scenarios
by: Li, Tianle, et al.
Published: (2025)
YOLOv10 with Kolmogorov-Arnold networks and vision-language foundation models for interpretable object detection and trustworthy multimodal AI in computer vision perception
by: Impraimakis, Marios, et al.
Published: (2026)
Learning Sign Language Representation using CNN LSTM, 3DCNN, CNN RNN LSTM and CCN TD
by: Louison, Nikita, et al.
Published: (2024)
TACIT Benchmark: A Programmatic Visual Reasoning Benchmark for Generative and Discriminative Models
by: Medeiros, Daniel Nobrega
Published: (2026)
Exploiting Precision Mapping and Component-Specific Feature Enhancement for Breast Cancer Segmentation and Identification
by: V, Pandiyaraju, et al.
Published: (2024)