Library Catalog

Bibliographic Details
Main Author:	Mandalika, Sriram
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Computation and Language; Emerging Technologies
Online Access:	https://arxiv.org/abs/2605.25708

Similar Items

X-Driver: Explainable Autonomous Driving with Vision-Language Models
by: Liu, Wei, et al.
Published: (2025)

Evaluating and Enhancing Trustworthiness of LLMs in Perception Tasks
by: Dona, Malsha Ashani Mahawatta, et al.
Published: (2024)

EgoPoseVR: Spatiotemporal Multi-Modal Reasoning for Egocentric Full-Body Pose in Virtual Reality
by: Cheng, Haojie, et al.
Published: (2026)

SpecTrack: Learned Multi-Rotation Tracking via Speckle Imaging
by: Chen, Ziyang, et al.
Published: (2024)

Built Environment Reasoning from Remote Sensing Imagery Using Large Vision-Language Models
by: Wang, Dongdong, et al.
Published: (2026)

TimeSpot: Benchmarking Geo-Temporal Understanding in Vision-Language Models in Real-World Settings
by: Wasi, Azmine Toushik, et al.
Published: (2026)

Towards Railway Domain Adaptation for LiDAR-based 3D Detection: Road-to-Rail and Sim-to-Real via SynDRA-BBox
by: Diaz, Xavier, et al.
Published: (2025)

Multi-Image Super Resolution Framework for Detection and Analysis of Plant Roots
by: Agarwal, Shubham, et al.
Published: (2026)

Learned Display Radiance Fields with Lensless Cameras
by: Chen, Ziyang, et al.
Published: (2025)

PhytoSynth: Leveraging Multi-modal Generative Models for Crop Disease Data Generation with Novel Benchmarking and Prompt Engineering Approach
by: Rai, Nitin, et al.
Published: (2025)

Assistive Image Annotation Systems with Deep Learning and Natural Language Capabilities: A Review
by: Mots'oehli, Moseli
Published: (2024)

Attention-based Generative Latent Replay: A Continual Learning Approach for WSI Analysis
by: Kumari, Pratibha, et al.
Published: (2025)

Enhancing Autism Spectrum Disorder Early Detection with the Parent-Child Dyads Block-Play Protocol and an Attention-enhanced GCN-xLSTM Hybrid Deep Learning Framework
by: Li, Xiang, et al.
Published: (2024)

SAM-SP: Self-Prompting Makes SAM Great Again
by: Zhou, Chunpeng, et al.
Published: (2024)

Class-Adaptive Cooperative Perception for Multi-Class LiDAR-based 3D Object Detection in V2X Systems
by: Kyem, Blessing Agyei, et al.
Published: (2026)

DATTA: Domain-Adversarial Test-Time Adaptation for Cross-Domain WiFi-Based Human Activity Recognition
by: Strohmayer, Julian, et al.
Published: (2024)

Prompt to Protection: A Comparative Study of Multimodal LLMs in Construction Hazard Recognition
by: Chaudhary, Nishi, et al.
Published: (2025)

Semi-Supervised Multimodal Multi-Instance Learning for Aortic Stenosis Diagnosis
by: Huang, Zhe, et al.
Published: (2024)

Generalizable Vision-Language Few-Shot Adaptation with Predictive Prompts and Negative Learning
by: Mandalika, Sriram
Published: (2025)

VOLMO: Versatile and Open Large Models for Ophthalmology
by: Qin, Zhenyue, et al.
Published: (2026)

Self-evolving Embodied AI
by: Feng, Tongtong, et al.
Published: (2026)

From Pixels to Nucleotides: End-to-End Token-Based Video Compression for DNA Storage
by: Ruan, Cihan, et al.
Published: (2026)

INSIGHT: Indoor Scene Intelligence from Geometric-Semantic Hierarchy Transfer for Public Safety
by: Dimopoulos, Alexander Nikitas, et al.
Published: (2026)

All-Optical Segmentation via Diffractive Neural Networks for Autonomous Driving
by: Li, Yingjie, et al.
Published: (2026)

Can Foundation Models Revolutionize Mobile AR Sparse Sensing?
by: Zhao, Yiqin, et al.
Published: (2025)

Introducing Nylon Face Mask Attacks: A Dataset for Evaluating Generalised Face Presentation Attack Detection
by: Manasa, et al.
Published: (2025)

SynSpill: Improved Industrial Spill Detection With Synthetic Data
by: Baranwal, Aaditya, et al.
Published: (2025)

Scrutinizing Data from Sky: An Examination of Its Veracity in Area Based Traffic Contexts
by: Ali, Yawar, et al.
Published: (2024)

Fourier-based Action Recognition for Wildlife Behavior Quantification with Event Cameras
by: Hamann, Friedhelm, et al.
Published: (2024)

x-RAGE: eXtended Reality -- Action & Gesture Events Dataset
by: Parmar, Vivek, et al.
Published: (2024)

Fast Quantum Convolutional Neural Networks for Low-Complexity Object Detection in Autonomous Driving Applications
by: Baek, Hankyul, et al.
Published: (2023)

DashCam Video: A complementary low-cost data stream for on-demand forest-infrastructure system monitoring
by: Joshi, Durga, et al.
Published: (2025)

Extracting Object Heights From LiDAR & Aerial Imagery
by: Guerrero, Jesus
Published: (2024)

Unlocking Comics: The AI4VA Dataset for Visual Understanding
by: Grönquist, Peter, et al.
Published: (2024)

Diff-GNSS: Diffusion-based Pseudorange Error Estimation
by: Zhu, Jiaqi, et al.
Published: (2025)

PI-HMR: Towards Robust In-bed Temporal Human Shape Reconstruction with Contact Pressure Sensing
by: Wu, Ziyu, et al.
Published: (2025)

EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
by: Zhang, Jiangning, et al.
Published: (2022)

Probabilistic Online Event Downsampling
by: Girbau-Xalabarder, Andreu, et al.
Published: (2025)

A Manually Annotated Image-Caption Dataset for Detecting Children in the Wild
by: Kireev, Klim, et al.
Published: (2025)

Hyperspectral Sensors and Autonomous Driving: Technologies, Limitations, and Opportunities
by: Shah, Imad Ali, et al.
Published: (2025)