| Main Authors: | Mehta, Naval Kishore; Arvind; Kumar, Himanshu; Banerjee, Abeer; Saurav, Sumeet; Singh, Sanjay |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.05936 |
Similar Items
Optimizing Multitask Industrial Processes with Predictive Action Guidance
by: Mehta, Naval Kishore, et al.
Published: (2025)
Gaze-Vector Estimation in the Dark with Temporally Encoded Event-driven Neural Networks
by: Banerjee, Abeer, et al.
Published: (2024)
Towards Lensless Image Deblurring with Prior-Embedded Implicit Neural Representations in the Low-Data Regime
by: Banerjee, Abeer, et al.
Published: (2024)
Towards Physics-informed Cyclic Adversarial Multi-PSF Lensless Imaging
by: Banerjee, Abeer, et al.
Published: (2024)
GLOFNet -- A Multimodal Dataset for GLOF Monitoring and Prediction
by: Fatima, Zuha, et al.
Published: (2025)
HQ-JEPA: Hybrid Quantum Joint-Embedding Predictive Architecture for Cross-Modal Remote Sensing Representation Learning
by: Hossain, Md Aminur, et al.
Published: (2026)
SKoPe3D: A Synthetic Dataset for Vehicle Keypoint Perception in 3D from Traffic Monitoring Cameras
by: Pahadia, Himanshu, et al.
Published: (2023)
Engagement Prediction of Short Videos with Large Multimodal Models
by: Sun, Wei, et al.
Published: (2025)
SmartWilds: Multimodal Wildlife Monitoring Dataset
by: Kline, Jenna, et al.
Published: (2025)
Enhancing Saliency Prediction in Monitoring Tasks: The Role of Visual Highlights
by: Wu, Zekun, et al.
Published: (2024)
Multimodal Fusion of Glucose Monitoring and Food Imagery for Caloric Content Prediction
by: Kumar, Adarsh
Published: (2025)
iOSPointMapper: RealTime Pedestrian and Accessibility Mapping with Mobile AI
by: Naidu, Himanshu, et al.
Published: (2025)
GAViD: A Large-Scale Multimodal Dataset for Context-Aware Group Affect Recognition from Videos
by: Kumar, Deepak, et al.
Published: (2026)
Tracking by Predicting 3-D Gaussians Over Time
by: Baranwal, Tanish, et al.
Published: (2025)
CrossMed: A Multimodal Cross-Task Benchmark for Compositional Generalization in Medical Imaging
by: Singh, Pooja, et al.
Published: (2025)
GradAttn: Replacing Fixed Residual Connections with Task-Modulated Attention Pathways
by: Ghoshal, Soudeep, et al.
Published: (2026)
Enhancing Adverse Drug Event Detection with Multimodal Dataset: Corpus Creation and Model Development
by: Sahoo, Pranab, et al.
Published: (2024)
VisioPhysioENet: Visual Physiological Engagement Detection Network
by: Singh, Alakhsimar, et al.
Published: (2024)
D-HUMOR: Dark Humor Understanding via Multimodal Open-ended Reasoning -- A Benchmark Dataset and Method
by: Kasu, Sai Kartheek Reddy, et al.
Published: (2025)
HOH: Markerless Multimodal Human-Object-Human Handover Dataset with Large Object Count
by: Wiederhold, Noah, et al.
Published: (2023)
MBE-ARI: A Multimodal Dataset Mapping Bi-directional Engagement in Animal-Robot Interaction
by: Noronha, Ian, et al.
Published: (2025)
Towards Open-Vocabulary Industrial Defect Understanding with a Large-Scale Multimodal Dataset
by: Ni, TsaiChing, et al.
Published: (2025)
Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks
by: Kar, Indrajit, et al.
Published: (2025)
OpenMarcie: Dataset for Multimodal Action Recognition in Industrial Environments
by: Bello, Hymalai, et al.
Published: (2026)
DAOS: A Multimodal In-cabin Behavior Monitoring with Driver Action-Object Synergy Dataset
by: Li, Yiming, et al.
Published: (2026)
Survey of Large Multimodal Model Datasets, Application Categories and Taxonomy
by: Pattnayak, Priyaranjan, et al.
Published: (2024)
fine-CLIP: Enhancing Zero-Shot Fine-Grained Surgical Action Recognition with Vision-Language Models
by: Sharma, Saurav, et al.
Published: (2025)
Tuned Reverse Distillation: Enhancing Multimodal Industrial Anomaly Detection with Crossmodal Tuners
by: Liu, Xinyue, et al.
Published: (2024)
A Computational Model of Message Sensation Value in Short Video Multimodal Features that Predicts Sensory and Behavioral Engagement
by: Xue, Haoning, et al.
Published: (2026)
A Novel Multimodal System to Predict Agitation in People with Dementia Within Clinical Settings: A Proof of Concept
by: Badawi, Abeer, et al.
Published: (2024)
Leveraging Perceptual Scores for Dataset Pruning in Computer Vision Tasks
by: Singh, Raghavendra
Published: (2024)
RecruitView: A Multimodal Dataset for Predicting Personality and Interview Performance for Human Resources Applications
by: Gupta, Amit Kumar, et al.
Published: (2025)
IJmond Industrial Smoke Segmentation Dataset
by: Hsu, Yen-Chia, et al.
Published: (2026)
Learning to Weigh Waste: A Physics-Informed Multimodal Fusion Framework and Large-Scale Dataset for Commercial and Industrial Applications
by: Islam, Md. Adnanul, et al.
Published: (2026)
Historical Printed Ornaments: Dataset and Tasks
by: Chaki, Sayan Kumar, et al.
Published: (2024)
Harnessing Shared Relations via Multimodal Mixup Contrastive Learning for Multimodal Classification
by: Kumar, Raja, et al.
Published: (2024)
Optimizing Skin Lesion Classification via Multimodal Data and Auxiliary Task Integration
by: Khurshid, Mahapara, et al.
Published: (2024)
Multimodal Event Detection: Current Approaches and Defining the New Playground through LLMs and VLMs
by: Dey, Abhishek, et al.
Published: (2025)
State-Change Learning for Prediction of Future Events in Endoscopic Videos
by: Sharma, Saurav, et al.
Published: (2025)
ADAS-TO: A Large-Scale Multimodal Naturalistic Dataset and Empirical Characterization of Human Takeovers during ADAS Engagement
by: Wang, Yuhang, et al.
Published: (2026)