| Main Authors: | Zhang, Pengcheng, Bai, Xiao, Zheng, Jin, Ning, Xin |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2309.04967 |
Similar Items
Prompting Continual Person Search
by: Zhang, Pengcheng, et al.
Published: (2024)
FoundationSLAM: Unleashing the Power of Depth Foundation Models for End-to-End Dense Visual SLAM
by: Wu, Yuchen, et al.
Published: (2025)
Fully Unified Motion Planning for End-to-End Autonomous Driving
by: Liu, Lin, et al.
Published: (2025)
Drive-JEPA: Video JEPA Meets Multimodal Trajectory Distillation for End-to-End Driving
by: Wang, Linhan, et al.
Published: (2026)
Bridging the Gap Between End-to-End and Two-Step Text Spotting
by: Huang, Mingxin, et al.
Published: (2024)
Bridging Past and Future: End-to-End Autonomous Driving with Historical Prediction and Planning
by: Zhang, Bozhou, et al.
Published: (2025)
InterMesh: Explicit Interaction-Aware End-to-End Multi-Person Human Mesh Recovery
by: Zheng, Kaili, et al.
Published: (2026)
An End-to-End Framework for Video Multi-Person Pose Estimation
by: Wei, Zhihong
Published: (2025)
DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
by: Ma, Zehong, et al.
Published: (2025)
HAD: Combining Hierarchical Diffusion with Metric-Decoupled RL for End-to-End Driving
by: Yao, Wenhao, et al.
Published: (2026)
OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models
by: Zhong, Yufeng, et al.
Published: (2026)
DeepFRC: An End-to-End Deep Learning Model for Functional Registration and Classification
by: Jiang, Siyuan, et al.
Published: (2025)
End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning
by: Zheng, Qiaoyu, et al.
Published: (2025)
GenAD: Generative End-to-End Autonomous Driving
by: Zheng, Wenzhao, et al.
Published: (2024)
Driving with A Thousand Faces: A Benchmark for Closed-Loop Personalized End-to-End Autonomous Driving
by: Dong, Xiaoru, et al.
Published: (2026)
OneVision: An End-to-End Generative Framework for Multi-view E-commerce Vision Search
by: Zheng, Zexin, et al.
Published: (2025)
End-to-End Multi-Person Pose Estimation with Pose-Aware Video Transformer
by: Yu, Yonghui, et al.
Published: (2025)
A Paradigm Shift: Fully End-to-End Training for Temporal Sentence Grounding in Videos
by: He, Allen, et al.
Published: (2026)
E2E-GMNER: End-to-End Generative Grounded Multimodal Named Entity Recognition
by: Zhang, Meng, et al.
Published: (2026)
RankE: End-to-End Post-Training for Discrete Text-to-Image Generation with Decoder Co-Evolution
by: Jian, Siyong, et al.
Published: (2026)
Exploring Disentangled and Controllable Human Image Synthesis: From End-to-End to Stage-by-Stage
by: Sun, Zhengwentai, et al.
Published: (2025)
Towards Collaborative Autonomous Driving: Simulation Platform and End-to-End System
by: Liu, Genjia, et al.
Published: (2024)
Towards Efficient and Effective Multi-Camera Encoding for End-to-End Driving
by: Yang, Jiawei, et al.
Published: (2025)
E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection
by: Zhang, Jiaqing, et al.
Published: (2024)
Manipulation Facing Threats: Evaluating Physical Vulnerabilities in End-to-End Vision Language Action Models
by: Cheng, Hao, et al.
Published: (2024)
VisionSelector: End-to-End Learnable Visual Token Compression for Efficient Multimodal LLMs
by: Zhu, Jiaying, et al.
Published: (2025)
Referring Expression Instance Retrieval and A Strong End-to-End Baseline
by: Hao, Xiangzhao, et al.
Published: (2025)
SimFlow: Simplified and End-to-End Training of Latent Normalizing Flows
by: Zhao, Qinyu, et al.
Published: (2025)
OED: Towards One-stage End-to-End Dynamic Scene Graph Generation
by: Wang, Guan, et al.
Published: (2024)
Leveraging Image Matching Toward End-to-End Relative Camera Pose Regression
by: Khatib, Fadi, et al.
Published: (2022)
EVE: Towards End-to-End Video Subtitle Extraction with Vision-Language Models
by: Yu, Haiyang, et al.
Published: (2025)
SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation
by: Sun, Wenchao, et al.
Published: (2024)
End4: End-to-end Denoising Diffusion for Diffusion-Based Inpainting Detection
by: Wang, Fei, et al.
Published: (2025)
Decoupling Scene Perception and Ego Status: A Multi-Context Fusion Approach for Enhanced Generalization in End-to-End Autonomous Driving
by: Tang, Jiacheng, et al.
Published: (2025)
LenslessFace: An End-to-End Optimized Lensless System for Privacy-Preserving Face Verification
by: Cai, Xin, et al.
Published: (2024)
End-to-End Vision Tokenizer Tuning
by: Wang, Wenxuan, et al.
Published: (2025)
MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons
by: Gong, Kehong, et al.
Published: (2026)
From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos
by: Qiao, Tanqiu, et al.
Published: (2024)
End-to-End HOI Reconstruction Transformer with Graph-based Encoding
by: Wang, Zhenrong, et al.
Published: (2025)
End-to-End Spatial-Temporal Transformer for Real-time 4D HOI Reconstruction
by: Zhang, Haoyu, et al.
Published: (2026)