| Main Authors: | Cheng, Yihua, Zhu, Yaning, Wang, Zongji, Hao, Hongquan, Liu, Yongwei, Cheng, Shiqing, Wang, Xi, Chang, Hyung Jin |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2403.15664 |
Similar Items
VL4Gaze: Unleashing Vision-Language Models for Gaze Following
by: Wang, Shijing, et al.
Published: (2025)
TextGaze: Gaze-Controllable Face Generation with Natural Language
by: Wang, Hengfei, et al.
Published: (2024)
Enhancing Gaze Reasoning in Vision Foundation Models for Gaze Following
by: Wang, Shijing, et al.
Published: (2026)
3D Prior is All You Need: Cross-Task Few-shot 2D Gaze Estimation
by: Cheng, Yihua, et al.
Published: (2025)
RTGaze: Real-Time 3D-Aware Gaze Redirection from a Single Image
by: Wang, Hengfei, et al.
Published: (2025)
Multi-Modal Gaze Following in Conversational Scenarios
by: Hou, Yuqi, et al.
Published: (2023)
Appearance-based Gaze Estimation With Deep Learning: A Review and Benchmark
by: Cheng, Yihua, et al.
Published: (2021)
Learning to See What You Need: Gaze Attention for Multimodal Large Language Models
by: Song, Junha, et al.
Published: (2026)
NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model
by: Zhang, Zhongqun, et al.
Published: (2024)
Multi-task Gaze Estimation Via Unidirectional Convolution
by: Cheng, Zhang, et al.
Published: (2024)
Lightweight Gaze Estimation Model Via Fusion Global Information
by: Cheng, Zhang, et al.
Published: (2024)
See Through the Noise: Improving Domain Generalization in Gaze Estimation
by: Peng, Yanming, et al.
Published: (2026)
Differential Contrastive Training for Gaze Estimation
by: Zhang, Lin, et al.
Published: (2025)
EM-Net: Gaze Estimation with Expectation Maximization Algorithm
by: Cheng, Zhang, et al.
Published: (2024)
Distributed Real-Time Vehicle Control for Emergency Vehicle Transit: A Scalable Cooperative Method
by: Wang, WenXi, et al.
Published: (2026)
Bidirectional Regression for Monocular 6DoF Head Pose Estimation and Reference System Alignment
by: Chun, Sungho, et al.
Published: (2024)
Roll Your Eyes: Gaze Redirection via Explicit 3D Eyeball Rotation
by: Choi, YoungChan, et al.
Published: (2025)
What You See Is What Matters: A Novel Visual and Physics-Based Metric for Evaluating Video Generation Quality
by: Wang, Zihan, et al.
Published: (2024)
GazeCLIP: Enhancing Gaze Estimation Through Text-Guided Multimodal Learning
by: Wang, Jun, et al.
Published: (2023)
What You See is What You Ask: Evaluating Audio Descriptions
by: Kala, Divy, et al.
Published: (2025)
What You See is What You Classify: Black Box Attributions
by: Stalder, Steven, et al.
Published: (2022)
Efficient Vision-based Vehicle Speed Estimation
by: Macko, Andrej, et al.
Published: (2025)
Neuro-Cognitive Reward Modeling for Human-Centered Autonomous Vehicle Control
by: Zhuang, Zhuoli, et al.
Published: (2026)
GazeCLIP: Gaze-Guided CLIP with Adaptive-Enhanced Fine-Grained Language Prompt for Deepfake Attribution and Detection
by: Zhang, Yaning, et al.
Published: (2026)
Gaze Label Alignment: Alleviating Domain Shift for Gaze Estimation
by: Zeng, Guanzhong, et al.
Published: (2024)
Do You See What I Am Pointing At? Gesture-Based Egocentric Video Question Answering
by: Choi, Yura, et al.
Published: (2026)
Do You See What I Say? Generalizable Deepfake Detection based on Visual Speech Recognition
by: Bora, Maheswar, et al.
Published: (2025)
What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models
by: Abdelhamed, Abdelrahman, et al.
Published: (2024)
CVVLSNet: Vehicle Location and Speed Estimation Using Partial Connected Vehicle Trajectory Data
by: Ye, Jiachen, et al.
Published: (2024)
See Where You Read with Eye Gaze Tracking and Large Language Model
by: Yang, Sikai, et al.
Published: (2024)
LG-Gaze: Learning Geometry-aware Continuous Prompts for Language-Guided Gaze Estimation
by: Yin, Pengwei, et al.
Published: (2024)
What Do You See in Common? Learning Hierarchical Prototypes over Tree-of-Life to Discover Evolutionary Traits
by: Manogaran, Harish Babu, et al.
Published: (2024)
Do You See What I See? A Qualitative Study Eliciting High-Level Visualization Comprehension
by: Quadri, Ghulam Jilani, et al.
Published: (2024)
Cross-Paradigm Evaluation of Gaze-Based Semantic Object Identification for Intelligent Vehicles
by: Deng, Penghao, et al.
Published: (2026)
Spatially Selective Imaging in Color: What You See is What You Want
by: John You En Chan, et al.
Published: (2024)
Trajectory Mamba: Efficient Attention-Mamba Forecasting Model Based on Selective SSM
by: Huang, Yizhou, et al.
Published: (2025)
Instruction-Grounded Visual Projectors for Continual Learning of Generative Vision-Language Models
by: Jin, Hyundong, et al.
Published: (2025)
Cross-Vehicle 3D Geometric Consistency for Self-Supervised Surround Depth Estimation on Articulated Vehicles
by: Liu, Weimin, et al.
Published: (2026)
Revisit What You See: Revealing Visual Semantics in Vision Tokens to Guide LVLM Decoding
by: Cho, Beomsik, et al.
Published: (2025)
What You See is (Usually) What You Get: Multimodal Prototype Networks that Abstain from Expensive Modalities
by: Bahng, Muchang, et al.
Published: (2025)