:: Library Catalog

Bibliographic Details
Main Authors:	Aranjuelo, Nerea; Huang, Siyu; Arganda-Carreras, Ignacio; Unzueta, Luis; Otaegui, Oihana; Pfister, Hanspeter; Wei, Donglai
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2405.20643

Similar Items

A Fully Interpretable Statistical Approach for Roadside LiDAR Background Subtraction
by: Iglesias, Aitor, et al.
Published: (2025)

S$^3$-TTA: Scale-Style Selection for Test-Time Augmentation in Biomedical Image Segmentation
by: Xie, Kangxian, et al.
Published: (2023)

Tree of Attributes Prompt Learning for Vision-Language Models
by: Ding, Tong, et al.
Published: (2024)

Joint-Task Regularization for Partially Labeled Multi-Task Learning
by: Nishi, Kento, et al.
Published: (2024)

Improving generalization by mimicking the human visual diet
by: Madan, Spandan, et al.
Published: (2022)

When Visuals Aren't the Problem: Evaluating Vision-Language Models on Misleading Data Visualizations
by: Lalai, Harsh Nishant, et al.
Published: (2026)

DualEdit: Dual Editing for Knowledge Updating in Vision-Language Models
by: Shi, Zhiyi, et al.
Published: (2025)

Understanding Graphical Perception in Data Visualization through Zero-shot Prompting of Vision-Language Models
by: Guo, Grace, et al.
Published: (2024)

AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance
by: Xu, Tianling, et al.
Published: (2025)

Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion
by: He, Jixuan, et al.
Published: (2024)

SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization
by: Li, Wanhua, et al.
Published: (2024)

GazeMoE: Perception of Gaze Target with Mixture-of-Experts
by: Dai, Zhuangzhuang, et al.
Published: (2026)

$R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding
by: Liu, Ye, et al.
Published: (2024)

Exploration of VLMs for Driver Monitoring Systems Applications
by: Cañas, Paola Natalia, et al.
Published: (2025)

Controlling Face's Frame generation in StyleGAN's latent space operations: Modifying faces to deceive our memory
by: Roca, Agustín, et al.
Published: (2024)

Seeing Like Radiologists: Context- and Gaze-Guided Vision-Language Pretraining for Chest X-rays
by: Liu, Kang, et al.
Published: (2026)

Visual Acoustic Fields
by: Li, Yuelei, et al.
Published: (2025)

Generalization of CNNs on Relational Reasoning with Bar Charts
by: Cui, Zhenxing, et al.
Published: (2025)

Gaze-VLM: Bridging Gaze and VLMs through Attention Regularization for Egocentric Understanding
by: Pani, Anupam, et al.
Published: (2025)

GazeVLM: A Vision-Language Model for Multi-Task Gaze Understanding
by: Mathew, Athul M., et al.
Published: (2025)

GazeQwen: Lightweight Gaze-Conditioned LLM Modulation for Streaming Video Understanding
by: Pham, Trong Thang, et al.
Published: (2026)

See Through the Noise: Improving Domain Generalization in Gaze Estimation
by: Peng, Yanming, et al.
Published: (2026)

TPP-Gaze: Modelling Gaze Dynamics in Space and Time with Neural Temporal Point Processes
by: D'Amelio, Alessandro, et al.
Published: (2024)

PGcGAN: Pathological Gait-Conditioned GAN for Human Gait Synthesis
by: Chandrasekaran, Mritula, et al.
Published: (2026)

Deformation-aware GAN for Medical Image Synthesis with Substantially Misaligned Pairs
by: Xin, Bowen, et al.
Published: (2024)

RDFC-GAN: RGB-Depth Fusion CycleGAN for Indoor Depth Completion
by: Wang, Haowen, et al.
Published: (2023)

Toddlers' Active Gaze Behavior Supports Self-Supervised Object Learning
by: Yu, Zhengyang, et al.
Published: (2024)

GazeFormer-MoE: Context-Aware Gaze Estimation via CLIP and MoE Transformer
by: Zhao, Xinyuan, et al.
Published: (2026)

Glance-or-Gaze: Incentivizing LMMs to Adaptively Focus Search via Reinforcement Learning
by: Bai, Hongbo, et al.
Published: (2026)

CTRL-GS: Cascaded Temporal Residue Learning for 4D Gaussian Splatting
by: Hou, Karly, et al.
Published: (2025)

RiGS: Rigid-aware 4D Gaussian Splatting from a Single Monocular Video
by: Wu, Chenyu, et al.
Published: (2026)

GazeSearch: Radiology Findings Search Benchmark
by: Pham, Trong Thang, et al.
Published: (2024)

Inference-based GAN Video Generation
by: Yang, Jingbo, et al.
Published: (2025)

Gaze on the Prize: Shaping Visual Attention with Return-Guided Contrastive Learning
by: Lee, Andrew, et al.
Published: (2025)

From Scene to Object: Text-Guided Dual-Gaze Prediction
by: Ke, Zehong, et al.
Published: (2026)

Watch and Learn: Learning to Use Computers from Online Videos
by: Song, Chan Hee, et al.
Published: (2025)

RGBD Gaze Tracking Using Transformer for Feature Fusion
by: Bauer, Tobias J.
Published: (2025)

Weakly-supervised Medical Image Segmentation with Gaze Annotations
by: Zhong, Yuan, et al.
Published: (2024)

3D Gaussian and Diffusion-Based Gaze Redirection
by: Panchalingam, Abiram, et al.
Published: (2025)

Is Geometry Enough? An Evaluation of Landmark-Based Gaze Estimation
by: Agostinelli, Daniele, et al.
Published: (2026)