Library Catalog

Bibliographic Details
Main Authors:	Wang, Maijunxian, Wang, Ruisi, Lin, Juyi, Ji, Ran, Wiedemer, Thaddäus, Gao, Qingying, Luo, Dezhi, Qian, Yaoyao, Huang, Lianyu, Hong, Zelong, Ge, Jiahui, Ma, Qianli, He, Hang, Zhou, Yifan, Guo, Lingzi, Mei, Lantao, Li, Jiachen, Xing, Hanwen, Zhao, Tianqi, Yu, Fengyuan, Xiao, Weihang, Jiao, Yizheng, Hou, Jianheng, Zhang, Danyang, Xu, Pengcheng, Zhong, Boyang, Zhao, Zehong, Fang, Gaoyun, Kitaoka, John, Xu, Yile, Xu, Hua, Blacutt, Kenton, Nguyen, Tin, Song, Siyuan, Sun, Haoran, Wen, Shaoyue, He, Linyang, Wang, Runming, Wang, Yanzhi, Yang, Mengyue, Ma, Ziqiao, Millière, Raphaël, Shi, Freda, Vasconcelos, Nuno, Khashabi, Daniel, Yuille, Alan, Du, Yilun, Liu, Ziming, Li, Bo, Lin, Dahua, Liu, Ziwei, Kumar, Vikash, Li, Yijiang, Yang, Lei, Cai, Zhongang, Deng, Hokin
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning; Multimedia; Robotics
Online Access:	https://arxiv.org/abs/2602.20159

Similar Items

Probing Perceptual Constancy in Large Vision-Language Models
by: Sun, Haoran, et al.
Published: (2025)

Egocentric Bias in Vision-Language Models
by: Wang, Maijunxian, et al.
Published: (2026)

Demystifying Video Reasoning
by: Wang, Ruisi, et al.
Published: (2026)

MATH-Beyond: A Benchmark for RL to Expand Beyond the Base Model
by: Mayilvahanan, Prasanna, et al.
Published: (2025)

The Quantified Body: Identity, Empowerment, and Control in Smart Wearables
by: Wang, Maijunxian
Published: (2025)

From Understanding the World to Intervening in It: A Unified Multi-Scale Framework for Embodied Cognition
by: Wang, Maijunxian
Published: (2025)

Error Norm Truncation: Robust Training in the Presence of Data Noise for Text Generation Models
by: Li, Tianjian, et al.
Published: (2023)

Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets
by: Li, Tianjian, et al.
Published: (2024)

Rethinking the Simulation vs. Rendering Dichotomy: No Free Lunch in Spatial World Modelling
by: Luo, Dezhi, et al.
Published: (2025)

Does CLIP's Generalization Performance Mainly Stem from High Train-Test Similarity?
by: Mayilvahanan, Prasanna, et al.
Published: (2023)

LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws
by: Mayilvahanan, Prasanna, et al.
Published: (2025)

Pretraining Frequency Predicts Compositional Generalization of CLIP on Real-World Tasks
by: Wiedemer, Thaddäus, et al.
Published: (2025)

Vision Language Models Cannot Reason About Physical Transformation
by: Luo, Dezhi, et al.
Published: (2026)

Increasing Computation Resolves Conflicts in Vision Language Models
by: Wang, Bingyang, et al.
Published: (2025)

Provable Compositional Generalization for Object-Centric Learning
by: Wiedemer, Thaddäus, et al.
Published: (2023)

AGI as Second Being: The Structural-Generative Ontology of Intelligence
by: Wang, Maijunxian, et al.
Published: (2025)

Study on Improving Microwave Heating Uniformity Based on Phase-Frequency Simultaneous Modulation Technique
by: Zhu, Xu, et al.
Published: (2025)

High-Efficiency Isolator-Free Magnetron Power Combining Method Based on H-Plane Tee Coupling and Peer-to-Peer Locking
by: Wang, Shaoyue, et al.
Published: (2025)

La administración de las organizaciones de profesionales: una perspectiva neoclásica. A la memoria de Peter F. Drucker
by: Jorge Alejandro Blacutt
Published: (2010)

VGGSounder: Audio-Visual Evaluations for Foundation Models
by: Zverev, Daniil, et al.
Published: (2025)

Vision Language Models See What You Want but not What You See
by: Gao, Qingying, et al.
Published: (2024)

Vision Language Models Know Law of Conservation without Understanding More-or-Less
by: Luo, Dezhi, et al.
Published: (2024)

Probing Mechanical Reasoning in Large Vision Language Models
by: Sun, Haoran, et al.
Published: (2024)

In Search of Forgotten Domain Generalization
by: Mayilvahanan, Prasanna, et al.
Published: (2024)

Trustworthy Evaluation of Robotic Manipulation: A New Benchmark and AutoEval Methods
by: Liu, Mengyuan, et al.
Published: (2026)

MentisOculi: Revealing the Limits of Reasoning with Mental Imagery
by: Zeller, Jana, et al.
Published: (2026)

Video models are zero-shot learners and reasoners
by: Wiedemer, Thaddäus, et al.
Published: (2025)

Normative Conflicts and Shallow AI Alignment
by: Millière, Raphaël
Published: (2025)

Philosophy of Cognitive Science in the Age of Deep Learning
by: Millière, Raphaël
Published: (2024)

Language Models as Models of Language
by: Millière, Raphaël
Published: (2024)

Philosophy of cognitive science in the age of deep learning
by: Raphaël Millière
Published: (2024)

MP1: MeanFlow Tames Policy Learning in 1-step for Robotic Manipulation
by: Sheng, Juyi, et al.
Published: (2025)

Vision-Language Models Mistake Head Orientation for Gaze Direction: Nonverbal Conversation Cues
by: Zhang, Zory, et al.
Published: (2025)

An End-to-End Multi-objective Ensemble Ranking Framework for Video Recommendation
by: He, Tiantian, et al.
Published: (2025)

Video Models Start to Solve Chess, Maze, Sudoku, Mental Rotation, and Raven' Matrices
by: Deng, Hokin
Published: (2025)

Logical forms complement probability in understanding language model (and human) performance
by: Wang, Yixuan, et al.
Published: (2025)

Layer-wise Minimal Pair Probing Reveals Contextual Grammatical-Conceptual Hierarchy in Speech Representations
by: He, Linyang, et al.
Published: (2025)

Canonicity for Cost-Aware Logical Framework via Synthetic Tait Computability
by: Li, Runming, et al.
Published: (2025)

Anisotropic Raman Mapping and Strain‐Enhanced Optoelectronic Performance for Few‐Layer Suspended Black Phosphorus (Advanced Optical Materials 3/2026)
by: Yuanqiang He, et al.
Published: (2026)