:: Library Catalog

Bibliographic Details
Main Authors:	Khan, Omer Jauhar; Khan, Sudais; Anwar, Hafeez; Khan, Shahzeb; Arifeen, Shams Ul; Ullah, Farman
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Computer Vision and Pattern Recognition; 65M70 (Primary), 68T07 (Secondary); I.2.6; I.4.8; G.1.8
Online Access:	https://arxiv.org/abs/2510.23117

Similar Items

CLIP-Joint-Detect: End-to-End Joint Training of Object Detectors with Contrastive Vision-Language Supervision
by: Raoufi, Behnam, et al.
Published: (2025)

Using Deep Learning to Generate Semantically Correct Hindi Captions
by: Khan, Wasim Akram, et al.
Published: (2026)

TAG-Head: Time-Aligned Graph Head for Plug-and-Play Fine-grained Action Recognition
by: Hassan, Imtiaz Ul, et al.
Published: (2026)

Botany Meets Robotics in Alpine Scree Monitoring
by: De Benedittis, Davide, et al.
Published: (2025)

Have We Mastered Scale in Deep Monocular Visual SLAM? The ScaleMaster Dataset and Benchmark
by: Ju, Hyoseok, et al.
Published: (2026)

From eye to AI: studying rodent social behavior in the era of machine Learning
by: Chindemi, Giuseppe, et al.
Published: (2025)

From Dead Pixels to Editable Slides: Infographic Reconstruction into Native Google Slides via Vision-Language Region Understanding
by: Gonzalez, Leonardo
Published: (2026)

Lifelong Learning in Vision-Language Models: Enhanced EWC with Cross-Modal Knowledge Retention
by: Durrani, Hamza Ahmed, et al.
Published: (2026)

Semi supervised GAN for smart microscopy, fast and data efficient cell cycle classification
by: Manick, Rajeev, et al.
Published: (2026)

Infrastructure-Centric World Models: Bridging Temporal Depth and Spatial Breadth for Roadside Perception
by: Meng, Siyuan, et al.
Published: (2026)

Akasha 2: Hamiltonian State Space Duality and Visual-Language Joint Embedding Predictive Architecture
by: Meziani, Yani
Published: (2026)

Sign Language Recognition and Translation for Low-Resource Languages: Challenges and Pathways Forward
by: Alishzade, Nigar, et al.
Published: (2026)

Intermitotic timing and motility patterns in the cell division of the diatom Seminavis robusta
by: Ziebarth, Jonas, et al.
Published: (2026)

Seeing The Words: Evaluating AI-generated Biblical Art
by: Makimei, Hidde, et al.
Published: (2025)

Digital analysis of early color photographs taken using regular color screen processes
by: Hubička, Jan, et al.
Published: (2023)

OmniAcc: Personalized Accessibility Assistant Using Generative AI
by: Karki, Siddhant, et al.
Published: (2025)

Caption-Driven Explainability: Probing CNNs for Bias via CLIP
by: Koller, Patrick, et al.
Published: (2025)

Estimating optical vegetation indices and biophysical variables for temperate forests with Sentinel-1 SAR data using machine learning techniques: A case study for Czechia
by: Paluba, Daniel, et al.
Published: (2023)

SVGS-DSGAT: An IoT-Enabled Innovation in Underwater Robotic Object Detection Technology
by: Wu, Dongli, et al.
Published: (2025)

Explaining What Machines See: XAI Strategies in Deep Object Detection Models
by: Seyedmomeni, FatemehSadat, et al.
Published: (2025)

From Gaze to Insight: Bridging Human Visual Attention and Vision Language Model Explanation for Weakly-Supervised Medical Image Segmentation
by: Chen, Jingkun, et al.
Published: (2025)

Domain-Adaptive Pretraining Improves Primate Behavior Recognition
by: Mueller, Felix B., et al.
Published: (2025)

Visible Iris Area as a Quality Metric for Reliable Iris Recognition Under Pupil Dilation and Eyelid Occlusion
by: Pessaud, Jack, et al.
Published: (2025)

Leum-VL Technical Report
by: He, Yuxuan, et al.
Published: (2026)

DNRSelect: Active Best View Selection for Deferred Neural Rendering
by: Wu, Dongli, et al.
Published: (2025)

FCBV-Net: Category-Level Robotic Garment Smoothing via Feature-Conditioned Bimanual Value Prediction
by: Daba, Mohammed, et al.
Published: (2025)

Context in object detection: a systematic literature review
by: Jamali, Mahtab, et al.
Published: (2025)

Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection
by: Zhang, Chuhan, et al.
Published: (2025)

A Sensorimotor Vision Transformer
by: Gadzicki, Konrad, et al.
Published: (2025)

Mask-Conditioned Voxel Diffusion for Joint Geometry and Color Inpainting
by: Sumuk, Aarya
Published: (2026)

Physical Knot Classification Beyond Accuracy: A Benchmark and Diagnostic Study
by: Nie, Shiheng, et al.
Published: (2026)

Pedestrian Detection in Low-Light Conditions: A Comprehensive Survey
by: Ghari, Bahareh, et al.
Published: (2024)

Human-Centric Perception for Child Sexual Abuse Imagery
by: Laranjeira, Camila, et al.
Published: (2026)

PhysVideoGenerator: Towards Physically Aware Video Generation via Latent Physics Guidance
by: Satish, Siddarth Nilol Kundur, et al.
Published: (2026)

FlowIBR: Leveraging Pre-Training for Efficient Neural Image-Based Rendering of Dynamic Scenes
by: Büsching, Marcel, et al.
Published: (2023)

IMASHRIMP: Automatic White Shrimp (Penaeus vannamei) Biometrical Analysis from Laboratory Images Using Computer Vision and Deep Learning
by: González, Abiam Remache, et al.
Published: (2025)

OCC-MLLM-CoT-Alpha: Towards Multi-stage Occlusion Recognition Based on Large Language Models via 3D-Aware Supervision and Chain-of-Thoughts Guidance
by: Wang, Chaoyi, et al.
Published: (2025)

NOAH: Benchmarking Narrative Prior driven Hallucination and Omission in Video Large Language Models
by: Lee, Kyuho, et al.
Published: (2025)

Reducing Object Hallucination in LVLMs via Emphasizing Image-negative Tokens
by: Shen, Meng, et al.
Published: (2026)

MAPS: A Synthetic Dataset for Probing Vision Models in a Controlled 3D Scene Space
by: Galella, Santiago, et al.
Published: (2026)