| Main Authors: | Cai, Yusen, Lin, Qing, Nunna, Bhargava Satya, Zhang, Mengmi |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2511.14440 |
Similar Items
Learning to Perceive "Where": Spatial Pretext Tasks for Robust Self-Supervised Learning
by: Shen, Yang, et al.
Published: (2026)
Seeing Through Uncertainty: A Free-Energy Approach for Real-Time Perceptual Adaptation in Robust Visual Navigation
by: Piriyajitakonkij, Maytus, et al.
Published: (2024)
Make Me Happier: Evoking Emotions Through Image Diffusion Models
by: Lin, Qing, et al.
Published: (2024)
Adaptive Visual Scene Understanding: Incremental Scene Graph Generation
by: Khandelwal, Naitik, et al.
Published: (2023)
Gazing at Rewards: Eye Movements as a Lens into Human and AI Decision-Making in Hybrid Visual Foraging
by: Wang, Bo, et al.
Published: (2024)
Learning to See the Elephant in the Room: Self-Supervised Context Reasoning in Humans and AI
by: Liu, Xiao, et al.
Published: (2022)
Seeing Through Their Eyes: Evaluating Visual Perspective Taking in Vision Language Models
by: Góral, Gracjan, et al.
Published: (2024)
Seeing Through Smoke: Surgical Desmoking for Improved Visual Perception
by: Lu, Jingpei, et al.
Published: (2026)
Pose Prior Learner: Unsupervised Categorical Prior Learning for Pose Estimation
by: Wang, Ziyu, et al.
Published: (2024)
BabyVision: Visual Reasoning Beyond Language
by: Chen, Liang, et al.
Published: (2026)
Effective-LDAM: An Effective Loss Function To Mitigate Data Imbalance for Robust Chest X-Ray Disease Classification
by: S, Sree Rama Vamsidhar, et al.
Published: (2024)
Seeing Through the Brain: New Insights from Decoding Visual Stimuli with fMRI
by: Huang, Zheng, et al.
Published: (2025)
When Eyes and Ears Disagree: Can MLLMs Discern Audio-Visual Confusion?
by: Ye, Qilang, et al.
Published: (2025)
Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs
by: Ranasinghe, Kanchana, et al.
Published: (2024)
Seeing Through Words: Controlling Visual Retrieval Quality with Language Models
by: Lu, Jianglin, et al.
Published: (2026)
Seeing Through Touch: Tactile-Driven Visual Localization of Material Regions
by: Kim, Seongyu, et al.
Published: (2026)
Eye-See-You: Reverse Pass-Through VR and Head Avatars
by: Dash, Ankan, et al.
Published: (2025)
VisualActBench: Can VLMs See and Act like a Human?
by: Zhang, Daoan, et al.
Published: (2025)
Seeing What Matters: Visual Preference Policy Optimization for Visual Generation
by: Ni, Ziqi, et al.
Published: (2025)
Seeing the World through Your Eyes
by: Alzayer, Hadi, et al.
Published: (2023)
PRISM: Progressive Reasoning through Iterative Slot Memory for Vision
by: Wang, Ziyu, et al.
Published: (2026)
Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception
by: Han, Shuangpeng, et al.
Published: (2024)
MedEyes: Learning Dynamic Visual Focus for Medical Progressive Diagnosis
by: Zhu, Chunzheng, et al.
Published: (2025)
Machine Intelligence that Understands Visual and Linguistic Information and Interacts with Humans and Environments
by: Nguyen, Van Quang
Published: (2026)
Mimicking Human Visual Development for Learning Robust Image Representations
by: Raj, Ankita, et al.
Published: (2025)
Seeing Eye to AI: Comparing Human Gaze and Model Attention in Video Memorability
by: Kumar, Prajneya, et al.
Published: (2023)
Peering into the Unknown: Active View Selection with Neural Uncertainty Maps for 3D Reconstruction
by: Zhang, Zhengquan, et al.
Published: (2025)
Seeing Eye to AI? Applying Deep-Feature-Based Similarity Metrics to Information Visualization
by: Long, Sheng, et al.
Published: (2025)
Seeing Without Eyes: 4D Human-Scene Understanding from Wearable IMUs
by: Hsu, Hao-Yu, et al.
Published: (2026)
Semi-Supervised Learning for Visual Bird's Eye View Semantic Segmentation
by: Zhu, Junyu, et al.
Published: (2023)
AvatarShield: Visual Reinforcement Learning for Human-Centric Synthetic Video Detection
by: Xu, Zhipei, et al.
Published: (2025)
Unveiling the Tapestry: the Interplay of Generalization and Forgetting in Continual Learning
by: Shi, Zenglin, et al.
Published: (2022)
Seeing Sound, Hearing Sight: Uncovering Modality Bias and Conflict of AI models in Sound Localization
by: Jia, Yanhao, et al.
Published: (2025)
Seeing the Unseen: Visual Common Sense for Semantic Placement
by: Ramrakhya, Ram, et al.
Published: (2024)
Seeing to Ground: Visual Attention for Hallucination-Resilient MDLLMs
by: Narnaware, Vishal, et al.
Published: (2026)
Compression Tells Intelligence: Visual Coding, Visual Token Technology, and the Unification
by: Jin, Xin, et al.
Published: (2026)
Learning Only with Images: Visual Reinforcement Learning with Reasoning, Rendering, and Visual Feedback
by: Chen, Yang, et al.
Published: (2025)
Foraging with the Eyes: Dynamics in Human Visual Gaze and Deep Predictive Modeling
by: Panchagnula, Tejaswi V.
Published: (2025)
Seeing is Believing? Enhancing Vision-Language Navigation using Visual Perturbations
by: Zhang, Xuesong, et al.
Published: (2024)
See Different, Think Better: Visual Variations Mitigating Hallucinations in LVLMs
by: Dai, Ziyun, et al.
Published: (2025)