Library Catalog

Bibliographic Details
Main Authors:	Cai, Yusen; Lin, Qing; Nunna, Bhargava Satya; Zhang, Mengmi
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2511.14440

Similar Items

Learning to Perceive "Where": Spatial Pretext Tasks for Robust Self-Supervised Learning
by: Shen, Yang, et al.
Published: (2026)

Seeing Through Uncertainty: A Free-Energy Approach for Real-Time Perceptual Adaptation in Robust Visual Navigation
by: Piriyajitakonkij, Maytus, et al.
Published: (2024)

Make Me Happier: Evoking Emotions Through Image Diffusion Models
by: Lin, Qing, et al.
Published: (2024)

Adaptive Visual Scene Understanding: Incremental Scene Graph Generation
by: Khandelwal, Naitik, et al.
Published: (2023)

Gazing at Rewards: Eye Movements as a Lens into Human and AI Decision-Making in Hybrid Visual Foraging
by: Wang, Bo, et al.
Published: (2024)

Learning to See the Elephant in the Room: Self-Supervised Context Reasoning in Humans and AI
by: Liu, Xiao, et al.
Published: (2022)

Seeing Through Their Eyes: Evaluating Visual Perspective Taking in Vision Language Models
by: Góral, Gracjan, et al.
Published: (2024)

Seeing Through Smoke: Surgical Desmoking for Improved Visual Perception
by: Lu, Jingpei, et al.
Published: (2026)

Pose Prior Learner: Unsupervised Categorical Prior Learning for Pose Estimation
by: Wang, Ziyu, et al.
Published: (2024)

BabyVision: Visual Reasoning Beyond Language
by: Chen, Liang, et al.
Published: (2026)

Effective-LDAM: An Effective Loss Function To Mitigate Data Imbalance for Robust Chest X-Ray Disease Classification
by: S, Sree Rama Vamsidhar, et al.
Published: (2024)

Seeing Through the Brain: New Insights from Decoding Visual Stimuli with fMRI
by: Huang, Zheng, et al.
Published: (2025)

When Eyes and Ears Disagree: Can MLLMs Discern Audio-Visual Confusion?
by: Ye, Qilang, et al.
Published: (2025)

Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs
by: Ranasinghe, Kanchana, et al.
Published: (2024)

Seeing Through Words: Controlling Visual Retrieval Quality with Language Models
by: Lu, Jianglin, et al.
Published: (2026)

Seeing Through Touch: Tactile-Driven Visual Localization of Material Regions
by: Kim, Seongyu, et al.
Published: (2026)

Eye-See-You: Reverse Pass-Through VR and Head Avatars
by: Dash, Ankan, et al.
Published: (2025)

VisualActBench: Can VLMs See and Act like a Human?
by: Zhang, Daoan, et al.
Published: (2025)

Seeing What Matters: Visual Preference Policy Optimization for Visual Generation
by: Ni, Ziqi, et al.
Published: (2025)

Seeing the World through Your Eyes
by: Alzayer, Hadi, et al.
Published: (2023)

PRISM: Progressive Reasoning through Iterative Slot Memory for Vision
by: Wang, Ziyu, et al.
Published: (2026)

Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception
by: Han, Shuangpeng, et al.
Published: (2024)

MedEyes: Learning Dynamic Visual Focus for Medical Progressive Diagnosis
by: Zhu, Chunzheng, et al.
Published: (2025)

Machine Intelligence that Understands Visual and Linguistic Information and Interacts with Humans and Environments
by: Nguyen, Van Quang
Published: (2026)

Mimicking Human Visual Development for Learning Robust Image Representations
by: Raj, Ankita, et al.
Published: (2025)

Seeing Eye to AI: Comparing Human Gaze and Model Attention in Video Memorability
by: Kumar, Prajneya, et al.
Published: (2023)

Peering into the Unknown: Active View Selection with Neural Uncertainty Maps for 3D Reconstruction
by: Zhang, Zhengquan, et al.
Published: (2025)

Seeing Eye to AI? Applying Deep-Feature-Based Similarity Metrics to Information Visualization
by: Long, Sheng, et al.
Published: (2025)

Seeing Without Eyes: 4D Human-Scene Understanding from Wearable IMUs
by: Hsu, Hao-Yu, et al.
Published: (2026)

Semi-Supervised Learning for Visual Bird's Eye View Semantic Segmentation
by: Zhu, Junyu, et al.
Published: (2023)

AvatarShield: Visual Reinforcement Learning for Human-Centric Synthetic Video Detection
by: Xu, Zhipei, et al.
Published: (2025)

Unveiling the Tapestry: the Interplay of Generalization and Forgetting in Continual Learning
by: Shi, Zenglin, et al.
Published: (2022)

Seeing Sound, Hearing Sight: Uncovering Modality Bias and Conflict of AI models in Sound Localization
by: Jia, Yanhao, et al.
Published: (2025)

Seeing the Unseen: Visual Common Sense for Semantic Placement
by: Ramrakhya, Ram, et al.
Published: (2024)

Seeing to Ground: Visual Attention for Hallucination-Resilient MDLLMs
by: Narnaware, Vishal, et al.
Published: (2026)

Compression Tells Intelligence: Visual Coding, Visual Token Technology, and the Unification
by: Jin, Xin, et al.
Published: (2026)

Learning Only with Images: Visual Reinforcement Learning with Reasoning, Rendering, and Visual Feedback
by: Chen, Yang, et al.
Published: (2025)

Foraging with the Eyes: Dynamics in Human Visual Gaze and Deep Predictive Modeling
by: Panchagnula, Tejaswi V.
Published: (2025)

Seeing is Believing? Enhancing Vision-Language Navigation using Visual Perturbations
by: Zhang, Xuesong, et al.
Published: (2024)

See Different, Think Better: Visual Variations Mitigating Hallucinations in LVLMs
by: Dai, Ziyun, et al.
Published: (2025)