Bibliographic Details
Main Authors: Qian, Kangan, Luo, Ziang, Jiang, Sicong, Huang, Zilin, Miao, Jinyu, Ma, Zhikun, Zhu, Tianze, Li, Jiayin, He, Yangfan, Fu, Zheng, Shi, Yining, Wang, Boyue, Lin, Hezhe, Chen, Ziyu, Yu, Jiangbo, Jiao, Xinyu, Yang, Mengmeng, Jiang, Kun, Yang, Diange
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2503.08162
Abstract:
  • Ensuring safe, comfortable, and efficient planning is crucial for autonomous driving systems. While end-to-end models trained on large datasets perform well in standard driving scenarios, they struggle with complex, low-frequency events. Recent advances in Large Language Models (LLMs) and Vision Language Models (VLMs) offer enhanced reasoning but suffer from computational inefficiency. Inspired by the dual-process cognitive model of "Thinking, Fast and Slow", we propose $\textbf{FASIONAD}$, a novel dual-system framework that combines a fast end-to-end planner with a VLM-based reasoning module. The fast system uses end-to-end learning to generate trajectories in real time in common scenarios, while the slow system is activated by uncertainty estimation to perform contextual analysis and resolve complex scenarios. Our architecture introduces three key innovations: (1) a dynamic switching mechanism that triggers slow-system intervention based on real-time uncertainty assessment; (2) an information bottleneck with high-level plan feedback that optimizes the slow system's guidance capability; (3) a bidirectional knowledge exchange in which visual prompts enhance the slow system's reasoning while its feedback refines the fast planner's decision-making. To strengthen VLM reasoning, we develop a question-answering mechanism coupled with a reward-instruct training strategy. In open-loop experiments, FASIONAD achieves a $6.7\%$ reduction in average $L2$ trajectory error and a $28.1\%$ lower collision rate.
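
The uncertainty-gated switching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how uncertainty is estimated, so the spread-based metric, the threshold value, and every name below (`trajectory_uncertainty`, `plan`, `slow_planner`) are assumptions introduced purely for illustration.

```python
import statistics
from typing import Callable, List, Sequence, Tuple

Trajectory = List[Tuple[float, float]]  # (x, y) waypoints


def trajectory_uncertainty(samples: Sequence[Trajectory]) -> float:
    """Hypothetical uncertainty score: the mean per-waypoint spread
    (population std-dev in x plus y) across sampled fast-planner outputs."""
    n_points = len(samples[0])
    spreads = []
    for i in range(n_points):
        xs = [s[i][0] for s in samples]
        ys = [s[i][1] for s in samples]
        spreads.append(statistics.pstdev(xs) + statistics.pstdev(ys))
    return sum(spreads) / n_points


def mean_trajectory(samples: Sequence[Trajectory]) -> Trajectory:
    """Average the sampled trajectories waypoint by waypoint."""
    n = len(samples)
    return [
        (sum(s[i][0] for s in samples) / n, sum(s[i][1] for s in samples) / n)
        for i in range(len(samples[0]))
    ]


def plan(
    samples: Sequence[Trajectory],
    slow_planner: Callable[[Trajectory], Trajectory],
    threshold: float = 0.5,  # assumed value, not taken from the paper
) -> Tuple[Trajectory, str]:
    """Return the fast plan when uncertainty is low; otherwise hand the
    fast plan to the slow (reasoning) system for refinement."""
    fast_plan = mean_trajectory(samples)
    if trajectory_uncertainty(samples) <= threshold:
        return fast_plan, "fast"
    return slow_planner(fast_plan), "slow"


# Stand-in for the VLM-based slow system: stay at the origin.
def cautious_stop(traj: Trajectory) -> Trajectory:
    return [(0.0, 0.0) for _ in traj]


agreeing = [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.0), (1.0, 0.0)]]
diverging = [[(0.0, 0.0), (1.0, 2.0)], [(0.0, 0.0), (1.0, -2.0)]]

print(plan(agreeing, cautious_stop)[1])
print(plan(diverging, cautious_stop)[1])
```

In this sketch the slow path runs only when the sampled fast-planner outputs disagree, which mirrors the stated motivation of reserving the computationally expensive reasoning module for uncommon scenarios.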