Bibliographic Details
Main Authors: Yang, Le; Chen, Ruoyu; Liu, Haijun; Liang, Jiawei; Sun, ShangQuan; Cao, Xiaochun
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2605.06264
author Yang, Le
Chen, Ruoyu
Liu, Haijun
Liang, Jiawei
Sun, ShangQuan
Cao, Xiaochun
contents End-to-end autonomous driving models generate future trajectories from multi-view inputs, improving system integration but introducing opaque decisions and hard-to-localize risks. Existing methods either rely on auxiliary monitoring models or generate textual explanations, but are decoupled from the planning process and fail to reveal the visual evidence underlying trajectory generation. While attribution offers a direct alternative, planning differs from image classification by taking six-view camera images as input and predicting continuous multi-step trajectories, requiring attribution to capture both critical views and regions and their influence on outputs. Moreover, whether attribution maps can support risk identification remains underexplored. To address this, we propose a hierarchical attribution framework for end-to-end planning. Specifically, using L2 consistency with the original trajectory as the objective, we design a coarse-to-fine region attribution strategy that searches candidate regions across the full six-view input and refines attribution within them. We further extract three attribution statistics as predictive signals for planning risk, including attribution entropy to measure how concentrated the planner's reliance is over the joint visual space, within-camera spatial variance to characterize how spread out the attribution is within each view, and cross-camera Gini coefficient to quantify how unevenly attribution is distributed across the six cameras. Experiments on BridgeAD, UniAD, and GenAD show that these statistics correlate with planning risk, achieving Spearman correlations of $0.30 \pm 0.07$ with trajectory error and AUROC of $0.77 \pm 0.04$ for collision detection. The signal generalizes to held-out scenes with negligible degradation and remains stable under an alternative attribution baseline.
format Preprint
id arxiv_https___arxiv_org_abs_2605_06264
institution arXiv
publishDate 2026
record_format arxiv
title Can Attribution Predict Risk? From Multi-View Attribution to Planning Risk Signals in End-to-End Autonomous Driving
topic Machine Learning
url https://arxiv.org/abs/2605.06264
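
The three attribution statistics named in the abstract (attribution entropy, within-camera spatial variance, and cross-camera Gini coefficient) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name attribution_statistics, the input layout (one non-negative attribution map per camera, shape (cameras, H, W)), the normalisation of pixel coordinates to [0, 1], and the averaging of the per-camera variances are all assumptions made here for illustration.

```python
import numpy as np


def attribution_statistics(attr):
    """Summarise a stack of per-camera attribution maps.

    attr: non-negative array of shape (num_cameras, H, W).
    This layout is an assumption for the sketch, not taken from the paper.
    """
    attr = np.asarray(attr, dtype=float)
    if attr.ndim != 3:
        raise ValueError("expected shape (num_cameras, H, W)")
    if (attr < 0).any():
        raise ValueError("attribution values must be non-negative")
    total = attr.sum()
    if total <= 0:
        raise ValueError("attribution maps carry no mass")

    # Treat all pixels of all cameras as one joint probability distribution.
    p = attr / total

    # Attribution entropy (Shannon, in nats): low when reliance is concentrated.
    nonzero = p[p > 0]
    entropy = float(-(nonzero * np.log(nonzero)).sum())

    # Within-camera spatial variance: attribution-weighted spread of pixel
    # coordinates around the attribution centroid, averaged over cameras.
    _, height, width = attr.shape
    ys, xs = np.mgrid[0:height, 0:width]
    ys = ys / max(height - 1, 1)  # normalise coordinates to [0, 1]
    xs = xs / max(width - 1, 1)
    per_camera = []
    for cam in attr:
        mass = cam.sum()
        if mass == 0:
            per_camera.append(0.0)  # a camera with no attribution has no spread
            continue
        q = cam / mass
        cy, cx = (q * ys).sum(), (q * xs).sum()
        per_camera.append(float((q * ((ys - cy) ** 2 + (xs - cx) ** 2)).sum()))
    spatial_variance = float(np.mean(per_camera))

    # Cross-camera Gini coefficient over the per-camera attribution mass:
    # 0 when all cameras contribute equally, (n - 1) / n when one camera
    # carries everything.
    camera_mass = np.sort(p.sum(axis=(1, 2)))
    n = camera_mass.size
    rank = np.arange(1, n + 1)
    gini = float(((2 * rank - n - 1) * camera_mass).sum() / (n * camera_mass.sum()))

    return {"entropy": entropy, "spatial_variance": spatial_variance, "gini": gini}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    maps = rng.random((6, 8, 8))  # synthetic six-view attribution maps
    print(attribution_statistics(maps))
```

With six cameras, a perfectly uniform map gives a Gini coefficient of 0 and an entropy equal to the logarithm of the total pixel count, while a map whose attribution sits on a single pixel of one camera gives an entropy of 0 and a Gini coefficient of 5/6.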