:: Library Catalog

Bibliographic Details
Main Authors:	Xu, Huaiyuan, Chen, Junliang, Meng, Shiyu, Wang, Yi, Chau, Lap-Pui
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Robotics
Online Access:	https://arxiv.org/abs/2405.05173

Similar Items

OccProphet: Pushing Efficiency Frontier of Camera-Only 4D Occupancy Forecasting with Observer-Forecaster-Refiner Framework
by: Chen, Junliang, et al.
Published: (2025)

A Survey of Embodied Learning for Object-Centric Robotic Manipulation
by: Zheng, Ying, et al.
Published: (2024)

EVA02-AT: Egocentric Video-Language Understanding with Spatial-Temporal Rotary Positional Embeddings and Symmetric Optimization
by: Wang, Xiaoqi, et al.
Published: (2025)

Drive-P2D: A Progressive Perception-to-Decision Benchmark for VLMs in Autonomous Driving
by: Tang, Zecong, et al.
Published: (2026)

A Unified Perception-Language-Action Framework for Adaptive Autonomous Driving
by: Zhang, Yi, et al.
Published: (2025)

SWA-SOP: Spatially-aware Window Attention for Semantic Occupancy Prediction in Autonomous Driving
by: Cao, Helin, et al.
Published: (2025)

M2DA: Multi-Modal Fusion Transformer Incorporating Driver Attention for Autonomous Driving
by: Xu, Dongyang, et al.
Published: (2024)

Humanoid Occupancy: Enabling A Generalized Multimodal Occupancy Perception System on Humanoid Robots
by: Cui, Wei, et al.
Published: (2025)

Interruption-Aware Cooperative Perception for V2X Communication-Aided Autonomous Driving
by: Ren, Shunli, et al.
Published: (2023)

Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey
by: Fu, Ao, et al.
Published: (2024)

A Survey on Vision-Language-Action Models for Autonomous Driving
by: Jiang, Sicong, et al.
Published: (2025)

SQS: Enhancing Sparse Perception Models via Query-based Splatting in Autonomous Driving
by: Zhang, Haiming, et al.
Published: (2025)

GaussianCross: Cross-modal Self-supervised 3D Representation Learning via Gaussian Splatting
by: Yao, Lei, et al.
Published: (2025)

Data Shift of Object Detection in Autonomous Driving
by: Xu, Lida
Published: (2025)

UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving
by: Wang, Yuping, et al.
Published: (2025)

CoT-Drive: Efficient Motion Forecasting for Autonomous Driving with LLMs and Chain-of-Thought Prompting
by: Liao, Haicheng, et al.
Published: (2025)

EARL: Towards a Unified Analysis-Guided Reinforcement Learning Framework for Egocentric Interaction Reasoning and Pixel Grounding
by: Su, Yuejiao, et al.
Published: (2026)

STELLAR: Scaling 3D Perception Large Models for Autonomous Driving
by: Li, Yingwei, et al.
Published: (2026)

EDVD-LLaMA: Explainable Deepfake Video Detection via Multimodal Large Language Model Reasoning
by: Sun, Haoran, et al.
Published: (2025)

HSNet: Heterogeneous Subgraph Network for Single Image Super-resolution
by: Hu, Qiongyang, et al.
Published: (2025)

S$^3$M-Net: Joint Learning of Semantic Segmentation and Stereo Matching for Autonomous Driving
by: Wu, Zhiyuan, et al.
Published: (2024)

UnO: Unsupervised Occupancy Fields for Perception and Forecasting
by: Agro, Ben, et al.
Published: (2024)

Generative AI for Autonomous Driving: Frontiers and Opportunities
by: Wang, Yuping, et al.
Published: (2025)

DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale
by: Zuo, Sicheng, et al.
Published: (2026)

KnowVal: A Knowledge-Augmented and Value-Guided Autonomous Driving System
by: Xia, Zhongyu, et al.
Published: (2025)

Less is More: Lean yet Powerful Vision-Language Model for Autonomous Driving
by: Yang, Sheng, et al.
Published: (2025)

DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving
by: Zhou, Yang, et al.
Published: (2026)

CLOVER: Closed-Loop Value Estimation and Ranking for End-to-End Autonomous Driving Planning
by: Ang, Sining, et al.
Published: (2026)

Transformer-Based Autonomous Driving Models and Deployment-Oriented Compression: A Survey
by: Zhong, Juan, et al.
Published: (2023)

SonarSweep: Fusing Sonar and Vision for Robust 3D Reconstruction via Plane Sweeping
by: Chen, Lingpeng, et al.
Published: (2025)

HiLO: High-Level Object Fusion for Autonomous Driving using Transformers
by: Osterburg, Timo, et al.
Published: (2025)

TFusionOcc: T-Primitive Based Object-Centric Multi-Sensor Fusion Framework for 3D Occupancy Prediction
by: Ming, Zhenxing, et al.
Published: (2026)

LingoQA: Visual Question Answering for Autonomous Driving
by: Marcu, Ana-Maria, et al.
Published: (2023)

DriveArena: A Closed-loop Generative Simulation Platform for Autonomous Driving
by: Yang, Xuemeng, et al.
Published: (2024)

VERDI: VLM-Embedded Reasoning for Autonomous Driving
by: Feng, Bowen, et al.
Published: (2025)

Generative AI for Autonomous Driving: A Review
by: Winter, Katharina, et al.
Published: (2025)

DriveCritic: Towards Context-Aware, Human-Aligned Evaluation for Autonomous Driving with Vision-Language Models
by: Song, Jingyu, et al.
Published: (2025)

DriveVLM-RL: Neuroscience-Inspired Reinforcement Learning with Vision-Language Models for Safe and Deployable Autonomous Driving
by: Huang, Zilin, et al.
Published: (2026)

TaPD: Temporal-adaptive Progressive Distillation for Observation-Adaptive Trajectory Forecasting in Autonomous Driving
by: Fan, Mingyu, et al.
Published: (2026)

DiffVLA: Vision-Language Guided Diffusion Planning for Autonomous Driving
by: Jiang, Anqing, et al.
Published: (2025)