| Main Authors: | Hu, Die; Li, Henan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Computer Vision and Pattern Recognition; Robotics; 90B06 (Transportation, logistics); I.2.10; J.1 |
| Online Access: | https://arxiv.org/abs/2604.07912 |
| _version_ | 1866913018361675776 |
|---|---|
| author | Hu, Die; Li, Henan |
| author_facet | Hu, Die; Li, Henan |
| contents | Finding parking consumes a disproportionate share of food delivery time, yet no system addresses precise parking-spot selection relative to merchant entrances. We propose ParkSense, a framework that repurposes idle compute during low-risk AV states -- queuing at red lights, traffic congestion, parking-lot crawl -- to run a Vision-Language Model (VLM) on pre-cached satellite and street view imagery, identifying entrances and legal parking zones. We formalize the Delivery-Aware Precision Parking (DAPP) problem, show that a quantized 7B VLM completes inference in 4-8 seconds on HW4-class hardware, and estimate annual per-driver income gains of 3,000-8,000 USD in the U.S. Five open research directions are identified at this unexplored intersection of autonomous driving, computer vision, and last-mile logistics. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2604_07912 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | ParkSense: Where Should a Delivery Driver Park? Leveraging Idle AV Compute and Vision-Language Models |
| topic | Computer Vision and Pattern Recognition; Robotics; 90B06 (Transportation, logistics); I.2.10; J.1 |
| url | https://arxiv.org/abs/2604.07912 |