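Staff View: GSDrive: Reinforcing Driving Policies by Multi-mode Future Trajectory Probing with 3D Gaussian Splatting Environment :: Library Catalog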

Bibliographic Details
Main Authors:	Guo, Ziang; Min, Chen; Zhang, Xuefeng; Zhou, Yixiao; Wang, Shuo; Zheng, Sifa; Tsetserukou, Dzmitry; Zhang, Zufeng
Format:	Preprint
Published:	2026
Subjects:	Robotics
Online Access:	https://arxiv.org/abs/2604.28111

_version_	1866914568619425792
author	Guo, Ziang; Min, Chen; Zhang, Xuefeng; Zhou, Yixiao; Wang, Shuo; Zheng, Sifa; Tsetserukou, Dzmitry; Zhang, Zufeng
author_facet	Guo, Ziang; Min, Chen; Zhang, Xuefeng; Zhou, Yixiao; Wang, Shuo; Zheng, Sifa; Tsetserukou, Dzmitry; Zhang, Zufeng
contents	End-to-end (E2E) autonomous driving aims to directly map sensory observations to driving actions, but its real-world deployment is hindered by evolving data distributions and the high cost of continual annotation. While combining imitation learning (IL) and reinforcement learning (RL) is a common strategy for policy improvement, conventional RL training relies on delayed, event-based rewards, where policies learn only from catastrophic outcomes such as collisions, leading to premature convergence to suboptimal behaviors. To address these limitations, we propose GSDrive, a framework that uses a differentiable 3D Gaussian Splatting (3DGS) environment for future-aware trajectory probing and reward shaping in E2E driving. GSDrive first learns a multi-mode trajectory probe via IL and then uses RL to evaluate multiple candidate futures in the 3DGS environment, converting their simulated returns into dense shaping rewards for policy optimization. This yields a cyclic hybrid IL-RL training loop, where IL supplies structured future priors and RL provides interactive feedback for iterative refinement. Evaluated on the reconstructed nuScenes dataset, our method outperforms other simulation-based RL approaches in closed-loop experiments. Code is available at https://github.com/ZionGo6/GSDrive.
format	Preprint
id	arxiv_https___arxiv_org_abs_2604_28111
institution	arXiv
publishDate	2026
record_format	arxiv
title	GSDrive: Reinforcing Driving Policies by Multi-mode Future Trajectory Probing with 3D Gaussian Splatting Environment
topic	Robotics
url	https://arxiv.org/abs/2604.28111

Similar Items