:: Library Catalog

Bibliographic Details
Main Authors:	Revankar, Kush; Deshpande, Shreyas; Sayeed, Araham; Tandale, Ansh; Bobde, Sarika
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2512.06485

Similar Items

Scale-Aware Recognition in Satellite Images under Resource Constraints
by: Revankar, Shreelekha, et al.
Published: (2024)

MONITRS: Multimodal Observations of Natural Incidents Through Remote Sensing
by: Revankar, Shreelekha, et al.
Published: (2025)

Neural Radiance Fields: Past, Present, and Future
by: Mittal, Ansh
Published: (2023)

A Comparison of Lightweight Deep Learning Models for Particulate-Matter Nowcasting in the Indian Subcontinent & Surrounding Regions
by: Kushwaha, Ansh, et al.
Published: (2025)

Social-MAE: A Transformer-Based Multimodal Autoencoder for Face and Voice
by: Bohy, Hugo, et al.
Published: (2025)

SHARE: Single-view Human Adversarial REconstruction
by: Revankar, Shreelekha, et al.
Published: (2023)

ViKANformer: Embedding Kolmogorov Arnold Networks in Vision Transformers for Pattern-Based Learning
by: S, Shreyas, et al.
Published: (2025)

Interpretable Underwater Diver Gesture Recognition
by: Mangalvedhekar, Sudeep, et al.
Published: (2023)

IFSENet : Harnessing Sparse Iterations for Interactive Few-shot Segmentation Excellence
by: Chandgothia, Shreyas, et al.
Published: (2024)

Rethinking the Threat and Accessibility of Adversarial Attacks against Face Recognition Systems
by: Cao, Yuxin, et al.
Published: (2024)

Next-Frame Feature Prediction for Multimodal Deepfake Detection and Temporal Localization
by: Anshul, Ashutosh, et al.
Published: (2025)

SuperEx: Enhancing Indoor Mapping and Exploration using Non-Line-of-Sight Perception
by: Garg, Kush, et al.
Published: (2025)

Enhancing Underwater Object Detection through Spatio-Temporal Analysis and Spatial Attention Networks
by: Karri, Sai Likhith, et al.
Published: (2025)

Cattle-CLIP: A Multimodal Framework for Cattle Behaviour Recognition from Video
by: Liu, Huimin, et al.
Published: (2025)

MCIHN: A Hybrid Network Model Based on Multi-path Cross-modal Interaction for Multimodal Emotion Recognition
by: Zhang, Haoyang, et al.
Published: (2025)

Connecting the Dots: Leveraging Spatio-Temporal Graph Neural Networks for Accurate Bangla Sign Language Recognition
by: Shahgir, Haz Sameen, et al.
Published: (2024)

Real-Time Detection and Analysis of Vehicles and Pedestrians using Deep Learning
by: Sadik, Md Nahid, et al.
Published: (2024)

Towards Visual Syntactical Understanding
by: Chowdhury, Sayeed Shafayet, et al.
Published: (2024)

UniMPR: A Unified Framework for Multimodal Place Recognition with Heterogeneous Sensor Configurations
by: Qi, Zhangshuo, et al.
Published: (2025)

Adaptive Sensitivity Analysis for Robust Augmentation against Natural Corruptions in Image Segmentation
by: Zheng, Laura, et al.
Published: (2024)

Leveraging Foundation Models for Multimodal Graph-Based Action Recognition
by: Ziaeetabar, Fatemeh, et al.
Published: (2025)

Graph-Based Multimodal and Multi-view Alignment for Keystep Recognition
by: Romero, Julia Lee, et al.
Published: (2025)

SkeletonAgent: An Agentic Interaction Framework for Skeleton-based Action Recognition
by: Liu, Hongda, et al.
Published: (2025)

Explicit Interaction for Fusion-Based Place Recognition
by: Xu, Jingyi, et al.
Published: (2024)

M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition
by: Wang, Mengmeng, et al.
Published: (2024)

A Multimodal Fusion Network For Student Emotion Recognition Based on Transformer and Tensor Product
by: Xiang, Ao, et al.
Published: (2024)

A Trustworthy Method for Multimodal Emotion Recognition
by: Xue, Junxiao, et al.
Published: (2025)

U-Mind: A Unified Framework for Real-Time Multimodal Interaction with Audiovisual Generation
by: Deng, Xiang, et al.
Published: (2026)

in-Car Biometrics (iCarB) Datasets for Driver Recognition: Face, Fingerprint, and Voice
by: Hahn, Vedrana Krivokuca, et al.
Published: (2024)

GuideDog: A Real-World Egocentric Multimodal Dataset for Blind and Low-Vision Accessibility-Aware Guidance
by: Kim, Junhyeok, et al.
Published: (2025)

TiCAL:Typicality-Based Consistency-Aware Learning for Multimodal Emotion Recognition
by: Yin, Wen, et al.
Published: (2025)

Video Emotion Open-vocabulary Recognition Based on Multimodal Large Language Model
by: Ge, Mengying, et al.
Published: (2024)

LLandMark: A Multi-Agent Framework for Landmark-Aware Multimodal Interactive Video Retrieval
by: Phung, Minh-Chi, et al.
Published: (2026)

Col-OLHTR: A Novel Framework for Multimodal Online Handwritten Text Recognition
by: Liu, Chenyu, et al.
Published: (2025)

Voice-Assisted Real-Time Traffic Sign Recognition System Using Convolutional Neural Network
by: Manawadu, Mayura, et al.
Published: (2024)

Feature-Based Dual Visual Feature Extraction Model for Compound Multimodal Emotion Recognition
by: Liu, Ran, et al.
Published: (2025)

MetricGold: Leveraging Text-To-Image Latent Diffusion Models for Metric Depth Estimation
by: Shah, Ansh, et al.
Published: (2024)

Fuse after Align: Improving Face-Voice Association Learning via Multimodal Encoder
by: Peng, Chong, et al.
Published: (2024)

Leveraging CLIP Encoder for Multimodal Emotion Recognition
by: Song, Yehun, et al.
Published: (2025)

Decoupled Hierarchical Distillation for Multimodal Emotion Recognition
by: Li, Yong, et al.
Published: (2026)