| Main Authors: | Wang, Dingrui, Lai, Zheyuan, Li, Yuda, Wu, Yi, Ma, Yuexin, Betz, Johannes, Yang, Ruigang, Li, Wei |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2405.04100 |
Similar Items
DualAD: Dual-Layer Planning for Reasoning in Autonomous Driving
by: Wang, Dingrui, et al.
Published: (2024)
SocialMirror: Reconstructing 3D Human Interaction Behaviors from Monocular Videos with Semantic and Geometric Guidance
by: Xia, Qi, et al.
Published: (2026)
DRIP: Discriminative Rotation-Invariant Pole Landmark Descriptor for 3D LiDAR Localization
by: Li, Dingrui, et al.
Published: (2024)
EgoDyn-Bench: Evaluating Ego-Motion Understanding in Vision-Centric Foundation Models for Autonomous Driving
by: Schäfer, Finn Rasmus, et al.
Published: (2026)
NPC: Neural Predictive Control for Fuel-Efficient Autonomous Trucks
by: Ren, Jiaping, et al.
Published: (2024)
One Model, Two Minds: Task-Conditioned Reasoning for Unified Image Quality and Aesthetic Assessment
by: Yin, Wen, et al.
Published: (2026)
State-space Decomposition Model for Video Prediction Considering Long-term Motion Trend
by: Cui, Fei, et al.
Published: (2024)
Fusion of Short-term and Long-term Attention for Video Mirror Detection
by: Xu, Mingchen, et al.
Published: (2024)
GaussianFusionOcc: A Seamless Sensor Fusion Approach for 3D Occupancy Prediction Using 3D Gaussians
by: Pavković, Tomislav, et al.
Published: (2025)
Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation
by: Yang, Xiuyu, et al.
Published: (2025)
Beyond Flat Unknown Labels in Open-World Object Detection
by: Zhang, Yuchen, et al.
Published: (2025)
Beyond Known Objects: A Novel Framework for Open-Set Object Detection using Negative-Aware Norm
by: Zhang, Yuchen, et al.
Published: (2026)
Coherent Online Road Topology Estimation and Reasoning with Standard-Definition Maps
by: Pham, Khanh Son, et al.
Published: (2025)
From Shadows to Safety: Occlusion Tracking and Risk Mitigation for Urban Autonomous Driving
by: Moller, Korbinian, et al.
Published: (2025)
Target-Bench: Can Video World Models Achieve Mapless Path Planning with Semantic Targets?
by: Wang, Dingrui, et al.
Published: (2025)
OccMamba: Semantic Occupancy Prediction with State Space Models
by: Li, Heng, et al.
Published: (2024)
OctreeOcc: Efficient and Multi-Granularity Occupancy Prediction Using Octree Queries
by: Lu, Yuhang, et al.
Published: (2023)
Learned Ranking Function: From Short-term Behavior Predictions to Long-term User Satisfaction
by: Wu, Yi, et al.
Published: (2024)
Can VLMs Unlock Semantic Anomaly Detection? A Framework for Structured Reasoning
by: Brusnicki, Roberto, et al.
Published: (2025)
IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection
by: Yin, Junbo, et al.
Published: (2024)
LongTail Driving Scenarios with Reasoning Traces: The KITScenes LongTail Dataset
by: Wagner, Royden, et al.
Published: (2026)
How Well Do Vision-Language Models Understand Sequential Driving Scenes? A Sensitivity Study
by: Brusnicki, Roberto, et al.
Published: (2026)
OccLE: Label-Efficient 3D Semantic Occupancy Prediction
by: Fang, Naiyu, et al.
Published: (2025)
GM-DF: Generalized Multi-Scenario Deepfake Detection
by: Lai, Yingxin, et al.
Published: (2024)
Towards Practical Human Motion Prediction with LiDAR Point Clouds
by: Han, Xiao, et al.
Published: (2024)
Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation
by: Liang, Tianming, et al.
Published: (2025)
AffordGrasp: Cross-Modal Diffusion for Affordance-Aware Grasp Synthesis
by: Wu, Xiaofei, et al.
Published: (2026)
SALI: Short-term Alignment and Long-term Interaction Network for Colonoscopy Video Polyp Segmentation
by: Hu, Qiang, et al.
Published: (2024)
SceneTracker: Long-term Scene Flow Estimation Network
by: Wang, Bo, et al.
Published: (2024)
VideoZoomer: Reinforcement-Learned Temporal Focusing for Long Video Reasoning
by: Ding, Yang, et al.
Published: (2025)
STAGE: A Stream-Centric Generative World Model for Long-Horizon Driving-Scene Simulation
by: Wang, Jiamin, et al.
Published: (2025)
FastGrasp: Efficient Grasp Synthesis with Diffusion
by: Wu, Xiaofei, et al.
Published: (2024)
Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis
by: Gao, Yuan, et al.
Published: (2025)
EXPLORE-Bench: Egocentric Scene Prediction with Long-Horizon Reasoning
by: Yu, Chengjun, et al.
Published: (2026)
SWAG: Long-term Surgical Workflow Prediction with Generative-based Anticipation
by: Boels, Maxence, et al.
Published: (2024)
MeanFlow Transformers with Representation Autoencoders
by: Hu, Zheyuan, et al.
Published: (2025)
Registration between Point Cloud Streams and Sequential Bounding Boxes via Gradient Descent
by: Li, Xuesong, et al.
Published: (2024)
TAE: Target-aware enhancer for nighttime UAV tracking
by: Chen, Yanyan, et al.
Published: (2026)
BehaviorVLM: Unified Finetuning-Free Behavioral Understanding with Vision-Language Reasoning
by: Ke, Jingyang, et al.
Published: (2026)
Lighten CARAFE: Dynamic Lightweight Upsampling with Guided Reassemble Kernels
by: Fu, Ruigang, et al.
Published: (2024)