| Main Authors: | Wang, Maijunxian, Wang, Ruisi, Lin, Juyi, Ji, Ran, Wiedemer, Thaddäus, Gao, Qingying, Luo, Dezhi, Qian, Yaoyao, Huang, Lianyu, Hong, Zelong, Ge, Jiahui, Ma, Qianli, He, Hang, Zhou, Yifan, Guo, Lingzi, Mei, Lantao, Li, Jiachen, Xing, Hanwen, Zhao, Tianqi, Yu, Fengyuan, Xiao, Weihang, Jiao, Yizheng, Hou, Jianheng, Zhang, Danyang, Xu, Pengcheng, Zhong, Boyang, Zhao, Zehong, Fang, Gaoyun, Kitaoka, John, Xu, Yile, Xu, Hua, Blacutt, Kenton, Nguyen, Tin, Song, Siyuan, Sun, Haoran, Wen, Shaoyue, He, Linyang, Wang, Runming, Wang, Yanzhi, Yang, Mengyue, Ma, Ziqiao, Millière, Raphaël, Shi, Freda, Vasconcelos, Nuno, Khashabi, Daniel, Yuille, Alan, Du, Yilun, Liu, Ziming, Li, Bo, Lin, Dahua, Liu, Ziwei, Kumar, Vikash, Li, Yijiang, Yang, Lei, Cai, Zhongang, Deng, Hokin |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.20159 |
Similar Items
Probing Perceptual Constancy in Large Vision-Language Models
by: Sun, Haoran, et al.
Published: (2025)
Egocentric Bias in Vision-Language Models
by: Wang, Maijunxian, et al.
Published: (2026)
Demystifying Video Reasoning
by: Wang, Ruisi, et al.
Published: (2026)
MATH-Beyond: A Benchmark for RL to Expand Beyond the Base Model
by: Mayilvahanan, Prasanna, et al.
Published: (2025)
The Quantified Body: Identity, Empowerment, and Control in Smart Wearables
by: Wang, Maijunxian
Published: (2025)
From Understanding the World to Intervening in It: A Unified Multi-Scale Framework for Embodied Cognition
by: Wang, Maijunxian
Published: (2025)
Error Norm Truncation: Robust Training in the Presence of Data Noise for Text Generation Models
by: Li, Tianjian, et al.
Published: (2023)
Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets
by: Li, Tianjian, et al.
Published: (2024)
Rethinking the Simulation vs. Rendering Dichotomy: No Free Lunch in Spatial World Modelling
by: Luo, Dezhi, et al.
Published: (2025)
Does CLIP's Generalization Performance Mainly Stem from High Train-Test Similarity?
by: Mayilvahanan, Prasanna, et al.
Published: (2023)
LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws
by: Mayilvahanan, Prasanna, et al.
Published: (2025)
Pretraining Frequency Predicts Compositional Generalization of CLIP on Real-World Tasks
by: Wiedemer, Thaddäus, et al.
Published: (2025)
Vision Language Models Cannot Reason About Physical Transformation
by: Luo, Dezhi, et al.
Published: (2026)
Increasing Computation Resolves Conflicts in Vision Language Models
by: Wang, Bingyang, et al.
Published: (2025)
Provable Compositional Generalization for Object-Centric Learning
by: Wiedemer, Thaddäus, et al.
Published: (2023)
AGI as Second Being: The Structural-Generative Ontology of Intelligence
by: Wang, Maijunxian, et al.
Published: (2025)
Study on Improving Microwave Heating Uniformity Based on Phase-Frequency Simultaneous Modulation Technique
by: Zhu, Xu, et al.
Published: (2025)
High-Efficiency Isolator-Free Magnetron Power Combining Method Based on H-Plane Tee Coupling and Peer-to-Peer Locking
by: Wang, Shaoyue, et al.
Published: (2025)
La administración de las organizaciones de profesionales: una perspectiva neoclásica. A la memoria de Peter F. Drucker
by: Jorge Alejandro Blacutt
Published: (2010)
VGGSounder: Audio-Visual Evaluations for Foundation Models
by: Zverev, Daniil, et al.
Published: (2025)
Vision Language Models See What You Want but not What You See
by: Gao, Qingying, et al.
Published: (2024)
Vision Language Models Know Law of Conservation without Understanding More-or-Less
by: Luo, Dezhi, et al.
Published: (2024)
Probing Mechanical Reasoning in Large Vision Language Models
by: Sun, Haoran, et al.
Published: (2024)
In Search of Forgotten Domain Generalization
by: Mayilvahanan, Prasanna, et al.
Published: (2024)
Trustworthy Evaluation of Robotic Manipulation: A New Benchmark and AutoEval Methods
by: Liu, Mengyuan, et al.
Published: (2026)
MentisOculi: Revealing the Limits of Reasoning with Mental Imagery
by: Zeller, Jana, et al.
Published: (2026)
Video models are zero-shot learners and reasoners
by: Wiedemer, Thaddäus, et al.
Published: (2025)
Normative Conflicts and Shallow AI Alignment
by: Millière, Raphaël
Published: (2025)
Philosophy of Cognitive Science in the Age of Deep Learning
by: Millière, Raphaël
Published: (2024)
Language Models as Models of Language
by: Millière, Raphaël
Published: (2024)
Philosophy of cognitive science in the age of deep learning
by: Raphaël Millière
Published: (2024)
MP1: MeanFlow Tames Policy Learning in 1-step for Robotic Manipulation
by: Sheng, Juyi, et al.
Published: (2025)
Vision-Language Models Mistake Head Orientation for Gaze Direction: Nonverbal Conversation Cues
by: Zhang, Zory, et al.
Published: (2025)
An End-to-End Multi-objective Ensemble Ranking Framework for Video Recommendation
by: He, Tiantian, et al.
Published: (2025)
Video Models Start to Solve Chess, Maze, Sudoku, Mental Rotation, and Raven' Matrices
by: Deng, Hokin
Published: (2025)
Logical forms complement probability in understanding language model (and human) performance
by: Wang, Yixuan, et al.
Published: (2025)
Layer-wise Minimal Pair Probing Reveals Contextual Grammatical-Conceptual Hierarchy in Speech Representations
by: He, Linyang, et al.
Published: (2025)
Canonicity for Cost-Aware Logical Framework via Synthetic Tait Computability
by: Li, Runming, et al.
Published: (2025)
Anisotropic Raman Mapping and Strain‐Enhanced Optoelectronic Performance for Few‐Layer Suspended Black Phosphorus (Advanced Optical Materials 3/2026)
by: Yuanqiang He, et al.
Published: (2026)