| Main Author: | Mandalika, Sriram |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.25708 |
Similar Items
X-Driver: Explainable Autonomous Driving with Vision-Language Models
by: Liu, Wei, et al.
Published: (2025)
Evaluating and Enhancing Trustworthiness of LLMs in Perception Tasks
by: Dona, Malsha Ashani Mahawatta, et al.
Published: (2024)
EgoPoseVR: Spatiotemporal Multi-Modal Reasoning for Egocentric Full-Body Pose in Virtual Reality
by: Cheng, Haojie, et al.
Published: (2026)
SpecTrack: Learned Multi-Rotation Tracking via Speckle Imaging
by: Chen, Ziyang, et al.
Published: (2024)
Built Environment Reasoning from Remote Sensing Imagery Using Large Vision-Language Models
by: Wang, Dongdong, et al.
Published: (2026)
TimeSpot: Benchmarking Geo-Temporal Understanding in Vision-Language Models in Real-World Settings
by: Wasi, Azmine Toushik, et al.
Published: (2026)
Towards Railway Domain Adaptation for LiDAR-based 3D Detection: Road-to-Rail and Sim-to-Real via SynDRA-BBox
by: Diaz, Xavier, et al.
Published: (2025)
Multi-Image Super Resolution Framework for Detection and Analysis of Plant Roots
by: Agarwal, Shubham, et al.
Published: (2026)
Learned Display Radiance Fields with Lensless Cameras
by: Chen, Ziyang, et al.
Published: (2025)
PhytoSynth: Leveraging Multi-modal Generative Models for Crop Disease Data Generation with Novel Benchmarking and Prompt Engineering Approach
by: Rai, Nitin, et al.
Published: (2025)
Assistive Image Annotation Systems with Deep Learning and Natural Language Capabilities: A Review
by: Mots'oehli, Moseli
Published: (2024)
Attention-based Generative Latent Replay: A Continual Learning Approach for WSI Analysis
by: Kumari, Pratibha, et al.
Published: (2025)
Enhancing Autism Spectrum Disorder Early Detection with the Parent-Child Dyads Block-Play Protocol and an Attention-enhanced GCN-xLSTM Hybrid Deep Learning Framework
by: Li, Xiang, et al.
Published: (2024)
SAM-SP: Self-Prompting Makes SAM Great Again
by: Zhou, Chunpeng, et al.
Published: (2024)
Class-Adaptive Cooperative Perception for Multi-Class LiDAR-based 3D Object Detection in V2X Systems
by: Kyem, Blessing Agyei, et al.
Published: (2026)
DATTA: Domain-Adversarial Test-Time Adaptation for Cross-Domain WiFi-Based Human Activity Recognition
by: Strohmayer, Julian, et al.
Published: (2024)
Prompt to Protection: A Comparative Study of Multimodal LLMs in Construction Hazard Recognition
by: Chaudhary, Nishi, et al.
Published: (2025)
Semi-Supervised Multimodal Multi-Instance Learning for Aortic Stenosis Diagnosis
by: Huang, Zhe, et al.
Published: (2024)
Generalizable Vision-Language Few-Shot Adaptation with Predictive Prompts and Negative Learning
by: Mandalika, Sriram
Published: (2025)
VOLMO: Versatile and Open Large Models for Ophthalmology
by: Qin, Zhenyue, et al.
Published: (2026)
Self-evolving Embodied AI
by: Feng, Tongtong, et al.
Published: (2026)
From Pixels to Nucleotides: End-to-End Token-Based Video Compression for DNA Storage
by: Ruan, Cihan, et al.
Published: (2026)
INSIGHT: Indoor Scene Intelligence from Geometric-Semantic Hierarchy Transfer for Public Safety
by: Dimopoulos, Alexander Nikitas, et al.
Published: (2026)
All-Optical Segmentation via Diffractive Neural Networks for Autonomous Driving
by: Li, Yingjie, et al.
Published: (2026)
Can Foundation Models Revolutionize Mobile AR Sparse Sensing?
by: Zhao, Yiqin, et al.
Published: (2025)
Introducing Nylon Face Mask Attacks: A Dataset for Evaluating Generalised Face Presentation Attack Detection
by: Manasa, et al.
Published: (2025)
SynSpill: Improved Industrial Spill Detection With Synthetic Data
by: Baranwal, Aaditya, et al.
Published: (2025)
Scrutinizing Data from Sky: An Examination of Its Veracity in Area Based Traffic Contexts
by: Ali, Yawar, et al.
Published: (2024)
Fourier-based Action Recognition for Wildlife Behavior Quantification with Event Cameras
by: Hamann, Friedhelm, et al.
Published: (2024)
x-RAGE: eXtended Reality -- Action & Gesture Events Dataset
by: Parmar, Vivek, et al.
Published: (2024)
Fast Quantum Convolutional Neural Networks for Low-Complexity Object Detection in Autonomous Driving Applications
by: Baek, Hankyul, et al.
Published: (2023)
DashCam Video: A complementary low-cost data stream for on-demand forest-infrastructure system monitoring
by: Joshi, Durga, et al.
Published: (2025)
Extracting Object Heights From LiDAR & Aerial Imagery
by: Guerrero, Jesus
Published: (2024)
Unlocking Comics: The AI4VA Dataset for Visual Understanding
by: Grönquist, Peter, et al.
Published: (2024)
Diff-GNSS: Diffusion-based Pseudorange Error Estimation
by: Zhu, Jiaqi, et al.
Published: (2025)
PI-HMR: Towards Robust In-bed Temporal Human Shape Reconstruction with Contact Pressure Sensing
by: Wu, Ziyu, et al.
Published: (2025)
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
by: Zhang, Jiangning, et al.
Published: (2022)
Probabilistic Online Event Downsampling
by: Girbau-Xalabarder, Andreu, et al.
Published: (2025)
A Manually Annotated Image-Caption Dataset for Detecting Children in the Wild
by: Kireev, Klim, et al.
Published: (2025)
Hyperspectral Sensors and Autonomous Driving: Technologies, Limitations, and Opportunities
by: Shah, Imad Ali, et al.
Published: (2025)