Library Catalog

Bibliographic Details
Main Authors:	Zhang, Pengcheng; Bai, Xiao; Zheng, Jin; Ning, Xin
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2309.04967

Similar Items

Prompting Continual Person Search
by: Zhang, Pengcheng, et al.
Published: (2024)

FoundationSLAM: Unleashing the Power of Depth Foundation Models for End-to-End Dense Visual SLAM
by: Wu, Yuchen, et al.
Published: (2025)

Fully Unified Motion Planning for End-to-End Autonomous Driving
by: Liu, Lin, et al.
Published: (2025)

Drive-JEPA: Video JEPA Meets Multimodal Trajectory Distillation for End-to-End Driving
by: Wang, Linhan, et al.
Published: (2026)

Bridging the Gap Between End-to-End and Two-Step Text Spotting
by: Huang, Mingxin, et al.
Published: (2024)

Bridging Past and Future: End-to-End Autonomous Driving with Historical Prediction and Planning
by: Zhang, Bozhou, et al.
Published: (2025)

InterMesh: Explicit Interaction-Aware End-to-End Multi-Person Human Mesh Recovery
by: Zheng, Kaili, et al.
Published: (2026)

An End-to-End Framework for Video Multi-Person Pose Estimation
by: Wei, Zhihong
Published: (2025)

DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
by: Ma, Zehong, et al.
Published: (2025)

HAD: Combining Hierarchical Diffusion with Metric-Decoupled RL for End-to-End Driving
by: Yao, Wenhao, et al.
Published: (2026)

OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models
by: Zhong, Yufeng, et al.
Published: (2026)

DeepFRC: An End-to-End Deep Learning Model for Functional Registration and Classification
by: Jiang, Siyuan, et al.
Published: (2025)

End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning
by: Zheng, Qiaoyu, et al.
Published: (2025)

GenAD: Generative End-to-End Autonomous Driving
by: Zheng, Wenzhao, et al.
Published: (2024)

Driving with A Thousand Faces: A Benchmark for Closed-Loop Personalized End-to-End Autonomous Driving
by: Dong, Xiaoru, et al.
Published: (2026)

OneVision: An End-to-End Generative Framework for Multi-view E-commerce Vision Search
by: Zheng, Zexin, et al.
Published: (2025)

End-to-End Multi-Person Pose Estimation with Pose-Aware Video Transformer
by: Yu, Yonghui, et al.
Published: (2025)

A Paradigm Shift: Fully End-to-End Training for Temporal Sentence Grounding in Videos
by: He, Allen, et al.
Published: (2026)

E2E-GMNER: End-to-End Generative Grounded Multimodal Named Entity Recognition
by: Zhang, Meng, et al.
Published: (2026)

RankE: End-to-End Post-Training for Discrete Text-to-Image Generation with Decoder Co-Evolution
by: Jian, Siyong, et al.
Published: (2026)

Exploring Disentangled and Controllable Human Image Synthesis: From End-to-End to Stage-by-Stage
by: Sun, Zhengwentai, et al.
Published: (2025)

Towards Collaborative Autonomous Driving: Simulation Platform and End-to-End System
by: Liu, Genjia, et al.
Published: (2024)

Towards Efficient and Effective Multi-Camera Encoding for End-to-End Driving
by: Yang, Jiawei, et al.
Published: (2025)

E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection
by: Zhang, Jiaqing, et al.
Published: (2024)

Manipulation Facing Threats: Evaluating Physical Vulnerabilities in End-to-End Vision Language Action Models
by: Cheng, Hao, et al.
Published: (2024)

VisionSelector: End-to-End Learnable Visual Token Compression for Efficient Multimodal LLMs
by: Zhu, Jiaying, et al.
Published: (2025)

Referring Expression Instance Retrieval and A Strong End-to-End Baseline
by: Hao, Xiangzhao, et al.
Published: (2025)

SimFlow: Simplified and End-to-End Training of Latent Normalizing Flows
by: Zhao, Qinyu, et al.
Published: (2025)

OED: Towards One-stage End-to-End Dynamic Scene Graph Generation
by: Wang, Guan, et al.
Published: (2024)

Leveraging Image Matching Toward End-to-End Relative Camera Pose Regression
by: Khatib, Fadi, et al.
Published: (2022)

EVE: Towards End-to-End Video Subtitle Extraction with Vision-Language Models
by: Yu, Haiyang, et al.
Published: (2025)

SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation
by: Sun, Wenchao, et al.
Published: (2024)

End4: End-to-end Denoising Diffusion for Diffusion-Based Inpainting Detection
by: Wang, Fei, et al.
Published: (2025)

Decoupling Scene Perception and Ego Status: A Multi-Context Fusion Approach for Enhanced Generalization in End-to-End Autonomous Driving
by: Tang, Jiacheng, et al.
Published: (2025)

LenslessFace: An End-to-End Optimized Lensless System for Privacy-Preserving Face Verification
by: Cai, Xin, et al.
Published: (2024)

End-to-End Vision Tokenizer Tuning
by: Wang, Wenxuan, et al.
Published: (2025)

MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons
by: Gong, Kehong, et al.
Published: (2026)

From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos
by: Qiao, Tanqiu, et al.
Published: (2024)

End-to-End HOI Reconstruction Transformer with Graph-based Encoding
by: Wang, Zhenrong, et al.
Published: (2025)

End-to-End Spatial-Temporal Transformer for Real-time 4D HOI Reconstruction
by: Zhang, Haoyu, et al.
Published: (2026)