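Staff View: ParkSense: Where Should a Delivery Driver Park? Leveraging Idle AV Compute and Vision-Language Models :: Library Catalog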

Bibliographic Details
Main Authors:	Hu, Die; Li, Henan
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Robotics; 90B06 (Transportation, logistics); I.2.10; J.1
Online Access:	https://arxiv.org/abs/2604.07912

_version_	1866913018361675776
author	Hu, Die; Li, Henan
author_facet	Hu, Die; Li, Henan
contents	Finding parking consumes a disproportionate share of food delivery time, yet no system addresses precise parking-spot selection relative to merchant entrances. We propose ParkSense, a framework that repurposes idle compute during low-risk AV states -- queuing at red lights, traffic congestion, parking-lot crawl -- to run a Vision-Language Model (VLM) on pre-cached satellite and street view imagery, identifying entrances and legal parking zones. We formalize the Delivery-Aware Precision Parking (DAPP) problem, show that a quantized 7B VLM completes inference in 4-8 seconds on HW4-class hardware, and estimate annual per-driver income gains of 3,000-8,000 USD in the U.S. Five open research directions are identified at this unexplored intersection of autonomous driving, computer vision, and last-mile logistics.
format	Preprint
id	arxiv_https___arxiv_org_abs_2604_07912
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	ParkSense: Where Should a Delivery Driver Park? Leveraging Idle AV Compute and Vision-Language Models Hu, Die Li, Henan Computer Vision and Pattern Recognition Robotics 90B06 (Transportation, logistics) I.2.10; J.1 Finding parking consumes a disproportionate share of food delivery time, yet no system addresses precise parking-spot selection relative to merchant entrances. We propose ParkSense, a framework that repurposes idle compute during low-risk AV states -- queuing at red lights, traffic congestion, parking-lot crawl -- to run a Vision-Language Model (VLM) on pre-cached satellite and street view imagery, identifying entrances and legal parking zones. We formalize the Delivery-Aware Precision Parking (DAPP) problem, show that a quantized 7B VLM completes inference in 4-8 seconds on HW4-class hardware, and estimate annual per-driver income gains of 3,000-8,000 USD in the U.S. Five open research directions are identified at this unexplored intersection of autonomous driving, computer vision, and last-mile logistics.
title	ParkSense: Where Should a Delivery Driver Park? Leveraging Idle AV Compute and Vision-Language Models
topic	Computer Vision and Pattern Recognition; Robotics; 90B06 (Transportation, logistics); I.2.10; J.1
url	https://arxiv.org/abs/2604.07912

Similar Items