:: Library Catalog

Bibliographic Details
Main Authors:	Zhao, Guoyang; Ma, Fulong; Qi, Weiqing; Zhang, Chenguang; Liu, Yuxuan; Liu, Ming; Ma, Jun
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2409.15077

Similar Items

Traffic Sign Recognition in Autonomous Driving: Dataset, Benchmark, and Field Experiment
by: Zhao, Guoyang, et al.
Published: (2026)

CurbNet: Curb Detection Framework Based on LiDAR Point Cloud Segmentation
by: Zhao, Guoyang, et al.
Published: (2024)

FisheyeDepth: A Real Scale Self-Supervised Depth Estimation Model for Fisheye Camera
by: Zhao, Guoyang, et al.
Published: (2024)

Erase, then Redraw: A Novel Data Augmentation Approach for Free Space Detection Using Diffusion Model
by: Ma, Fulong, et al.
Published: (2024)

Task-Oriented Pre-Training for Drivable Area Detection
by: Ma, Fulong, et al.
Published: (2024)

CLRKDNet: Speeding up Lane Detection with Knowledge Distillation
by: Qi, Weiqing, et al.
Published: (2024)

Monocular 3D lane detection for Autonomous Driving: Recent Achievements, Challenges, and Outlooks
by: Ma, Fulong, et al.
Published: (2024)

Every Dataset Counts: Scaling up Monocular 3D Object Detection with Joint Datasets Training
by: Ma, Fulong, et al.
Published: (2023)

Annotation-Free Curb Detection Leveraging Altitude Difference Image
by: Ma, Fulong, et al.
Published: (2024)

Decision-Driven Semantic Object Exploration for Legged Robots via Confidence-Calibrated Perception and Topological Subgoal Selection
by: Zhao, Guoyang, et al.
Published: (2025)

Annotation-Free Detection of Drivable Areas and Curbs Leveraging LiDAR Point Cloud Maps
by: Ma, Fulong, et al.
Published: (2026)

Structured Observation Language for Efficient and Generalizable Vision-Language Navigation
by: Peng, Daojie, et al.
Published: (2026)

CLIP-SLA: Parameter-Efficient CLIP Adaptation for Continuous Sign Language Recognition
by: Alyami, Sarah, et al.
Published: (2025)

Cross-domain Multi-step Thinking: Zero-shot Fine-grained Traffic Sign Recognition in the Wild
by: Gan, Yaozong, et al.
Published: (2024)

Meta CLIP 2: A Worldwide Scaling Recipe
by: Chuang, Yung-Sung, et al.
Published: (2025)

Hierarchically Decoupled Mixture-of-Experts for Robust Traffic Sign Recognition in Complex Driving Scenarios
by: Wang, Mingxiao, et al.
Published: (2026)

IAF-Net: Illumination-Adaptive Fusion for Low-Light Urban Road Segmentation
by: Wang, Bingtao, et al.
Published: (2026)

DragTraffic: Interactive and Controllable Traffic Scene Generation for Autonomous Driving
by: Wang, Sheng, et al.
Published: (2024)

T2I-Based Physical-World Appearance Attack against Traffic Sign Recognition Systems in Autonomous Driving
by: Ma, Chen, et al.
Published: (2025)

CIBR: Cross-modal Information Bottleneck Regularization for Robust CLIP Generalization
by: Ji, Yingrui, et al.
Published: (2025)

LiteViLNet: Lightweight Vision-LiDAR Fusion Network for Efficient Road Segmentation
by: Peng, Daojie, et al.
Published: (2026)

Multi-Stream Keypoint Attention Network for Sign Language Recognition and Translation
by: Guan, Mo, et al.
Published: (2024)

MoCLIP: Motion-Aware Fine-Tuning and Distillation of CLIP for Human Motion Generation
by: Maldonado, Gabriel, et al.
Published: (2025)

Cross-domain Few-shot In-context Learning for Enhancing Traffic Sign Recognition
by: Gan, Yaozong, et al.
Published: (2024)

Revolutionizing Traffic Sign Recognition: Unveiling the Potential of Vision Transformers
by: Mingwin, Susano, et al.
Published: (2024)

Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition
by: Lin, Kun-Yu, et al.
Published: (2024)

Towards an Incremental Unified Multimodal Anomaly Detection: Augmenting Multimodal Denoising From an Information Bottleneck Perspective
by: Long, Kaifang, et al.
Published: (2026)

PE-CLIP: A Parameter-Efficient Fine-Tuning of Vision Language Models for Dynamic Facial Expression Recognition
by: Saadi, Ibtissam, et al.
Published: (2025)

On the Suitability of Reinforcement Fine-Tuning to Visual Tasks
by: Chen, Xiaxu, et al.
Published: (2025)

Enhancing Traffic Sign Recognition On The Performance Based On Yolov8
by: Ibrahim, Baba, et al.
Published: (2025)

GazeCLIP: Gaze-Guided CLIP with Adaptive-Enhanced Fine-Grained Language Prompt for Deepfake Attribution and Detection
by: Zhang, Yaning, et al.
Published: (2026)

Cross-Block Fine-Grained Semantic Cascade for Skeleton-Based Sports Action Recognition
by: Liu, Zhendong, et al.
Published: (2024)

FIGhost: Fluorescent Ink-based Stealthy and Flexible Backdoor Attacks on Physical Traffic Sign Recognition
by: Yuan, Shuai, et al.
Published: (2025)

Advancing Compositional Awareness in CLIP with Efficient Fine-Tuning
by: Peleg, Amit, et al.
Published: (2025)

Revisiting Multimodal Fusion for 3D Anomaly Detection from an Architectural Perspective
by: Long, Kaifang, et al.
Published: (2024)

Road Traffic Sign Recognition method using Siamese network Combining Efficient-CNN based Encoder
by: Xi, Zhenghao, et al.
Published: (2025)

The CLIP Model is Secretly an Image-to-Prompt Converter
by: Ding, Yuxuan, et al.
Published: (2023)

MV-CLIP: Multi-View CLIP for Zero-shot 3D Shape Recognition
by: Song, Dan, et al.
Published: (2023)

Attention-driven GUI Grounding: Leveraging Pretrained Multimodal Large Language Models without Fine-Tuning
by: Xu, Hai-Ming, et al.
Published: (2024)

Enhancing LLM-based Autonomous Driving with Modular Traffic Light and Sign Recognition
by: Schmidt, Fabian, et al.
Published: (2025)