:: Library Catalog

Bibliographic Details
Main Authors:	Sarkar, Pritisha; Saha, Duranta Durbaar Vishal; Saha, Mousumi
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2501.03499

Similar Items

Real‐Time Air Quality Index Detection through Regression‐Based Convolutional Neural Network Model on Captured Images
by: Sarkar, Pritisha, et al.
Published: (2024)

An End-to-End Deep Learning Framework for Arsenicosis Diagnosis Using Mobile-Captured Skin Images
by: Newaz, Asif, et al.
Published: (2025)

Investigating Deep Learning Models for Ejection Fraction Estimation from Echocardiography Videos
by: Saranyan, Shravan, et al.
Published: (2025)

The Promise of Analog Deep Learning: Recent Advances, Challenges and Opportunities
by: Datar, Aditya, et al.
Published: (2024)

When Words Can't Capture It All: Towards Video-Based User Complaint Text Generation with Multimodal Video Complaint Dataset
by: Das, Sarmistha, et al.
Published: (2025)

VALUED -- Vision and Logical Understanding Evaluation Dataset
by: Saha, Soumadeep, et al.
Published: (2023)

Evaluation of State-of-the-Art Deep Learning Techniques for Plant Disease and Pest Detection
by: Banerjee, Saptarshi, et al.
Published: (2025)

Externally Validated Multi-Task Learning via Consistency Regularization Using Differentiable BI-RADS Features for Breast Ultrasound Tumor Segmentation
by: Zhang, Jingru, et al.
Published: (2025)

Finding Regions of Interest in Whole Slide Images Using Multiple Instance Learning
by: Afonso, Martim, et al.
Published: (2024)

A Critical Study on Tea Leaf Disease Detection using Deep Learning Techniques
by: Borah, Nabajyoti, et al.
Published: (2025)

Deep Unlearning: Fast and Efficient Gradient-free Approach to Class Forgetting
by: Kodge, Sangamesh, et al.
Published: (2023)

FedMRL: Data Heterogeneity Aware Federated Multi-agent Deep Reinforcement Learning for Medical Imaging
by: Sahoo, Pranab, et al.
Published: (2024)

ToDo: Token Downsampling for Efficient Generation of High-Resolution Images
by: Smith, Ethan, et al.
Published: (2024)

Personalized Image Generation from an Author Writing Style
by: Gandhi, Sagar, et al.
Published: (2025)

Toward Accessible Dermatology: Skin Lesion Classification Using Deep Learning Models on Mobile-Acquired Images
by: Newaz, Asif, et al.
Published: (2025)

Comparing and Integrating Different Notions of Representational Correspondence in Neural Systems
by: Wu, Jialin, et al.
Published: (2025)

Predicting Road Crossing Behaviour using Pose Detection and Sequence Modelling
by: Dasgupta, Subhasis, et al.
Published: (2025)

Multi-Attention Stacked Ensemble for Lung Cancer Detection in CT Scans
by: Saha, Uzzal, et al.
Published: (2025)

ADP-FL-MedSeg: Adaptive Differential Privacy for Federated Medical Segmentation Across Diverse Modalities
by: Saha, Puja, et al.
Published: (2026)

Dynamic Content Moderation in Livestreams: Combining Supervised Classification with MLLM-Boosted Similarity Matching
by: Yew, Wei Chee, et al.
Published: (2025)

RestoreVAR: Visual Autoregressive Generation for All-in-One Image Restoration
by: Rajagopalan, Sudarshan, et al.
Published: (2025)

ToxVidLM: A Multimodal Framework for Toxicity Detection in Code-Mixed Videos
by: Maity, Krishanu, et al.
Published: (2024)

Unlocking Financial Insights: An advanced Multimodal Summarization with Multimodal Output Framework for Financial Advisory Videos
by: Das, Sarmistha, et al.
Published: (2025)

CLARIFY: A Specialist-Generalist Framework for Accurate and Lightweight Dermatological Visual Question Answering
by: Saha, Aranya, et al.
Published: (2025)

Panoptic Segmentation and Labelling of Lumbar Spine Vertebrae using Modified Attention Unet
by: Pal, Rikathi, et al.
Published: (2024)

Exploring Explainability in Video Action Recognition
by: Saha, Avinab, et al.
Published: (2024)

Exploring Curriculum Learning for Vision-Language Tasks: A Study on Small-Scale Multimodal Training
by: Saha, Rohan, et al.
Published: (2024)

Deep Neural Networks Can Learn Generalizable Same-Different Visual Relations
by: Tartaglini, Alexa R., et al.
Published: (2023)

Naïve PAINE: Lightweight Text-to-Image Generation Improvement with Prompt Evaluation
by: Kim, Joong Ho, et al.
Published: (2026)

TruthLens: Visual Grounding for Universal DeepFake Reasoning
by: Kundu, Rohit, et al.
Published: (2025)

Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation
by: Mansour, Elham Amin, et al.
Published: (2024)

Learning Novel View Synthesis from Heterogeneous Low-light Captures
by: Zheng, Quan, et al.
Published: (2024)

DeepMerge: Deep-Learning-Based Region-Merging for Image Segmentation
by: Lv, Xianwei, et al.
Published: (2023)

Generative Quanta Color Imaging
by: Purohit, Vishal, et al.
Published: (2024)

Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions
by: Ghosh, Akash, et al.
Published: (2024)

RL-I2IT: Image-to-Image Translation with Deep Reinforcement Learning
by: Hu, Jing, et al.
Published: (2023)

Can Diffusion Models Learn Hidden Inter-Feature Rules Behind Images?
by: Han, Yujin, et al.
Published: (2025)

FedPIA -- Permuting and Integrating Adapters leveraging Wasserstein Barycenters for Finetuning Foundation Models in Multi-Modal Federated Learning
by: Saha, Pramit, et al.
Published: (2024)

Artery-Vein Segmentation from Fundus Images using Deep Learning
by: SK, Sharan, et al.
Published: (2025)

Recognition of Harmful Phytoplankton from Microscopic Images using Deep Learning
by: Khaldi, Aymane, et al.
Published: (2024)