Library Catalog

Bibliographic Details
Main Authors:	Basoc, Nicoleta-Nina; Cosma, Adrian; Radoi, Emilian
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.06141

Similar Items

Database-Agnostic Gait Enrollment using SetTransformers
by: Basoc, Nicoleta, et al.
Published: (2025)

On Model and Data Scaling for Skeleton-based Self-Supervised Gait Recognition
by: Cosma, Adrian, et al.
Published: (2025)

The Paradox of Motion: Evidence for Spurious Correlations in Skeleton-based Gait Recognition Models
by: Cătrună, Andy, et al.
Published: (2024)

MoME: Estimating Psychological Traits from Gait with Multi-Stage Mixture of Movement Experts
by: Cătrună, Andy, et al.
Published: (2025)

CrossGaze: A Strong Method for 3D Gaze Estimation in the Wild
by: Cătrună, Andy, et al.
Published: (2024)

GaitPT: Skeletons Are All You Need For Gait Recognition
by: Catruna, Andy, et al.
Published: (2023)

Aligning Actions and Walking to LLM-Generated Textual Descriptions
by: Chivereanu, Radu, et al.
Published: (2024)

Gait Recognition from Highly Compressed Videos
by: Niculae, Andrei, et al.
Published: (2024)

What Makes a Good Doctor Response? A Study on Text-Based Telemedicine
by: Cosma, Adrian, et al.
Published: (2026)

A Retrieval-Based Approach to Medical Procedure Matching in Romanian
by: Niculae, Andrei, et al.
Published: (2025)

Training Language Models with homotokens Leads to Delayed Overfitting
by: Cosma, Adrian, et al.
Published: (2026)

The Strawberry Problem: Emergence of Character-level Understanding in Tokenized Language Models
by: Cosma, Adrian, et al.
Published: (2025)

The Illusion-Illusion: Vision Language Models See Illusions Where There are None
by: Ullman, Tomer
Published: (2024)

Illusion-Aware Visual Preprocessing and Anti-Illusion Prompting for Classic Illusion Understanding in Vision-Language Models
by: Zha, Junli, et al.
Published: (2026)

IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models
by: Shahgir, Haz Sameen, et al.
Published: (2024)

IllusionBench+: A Large-scale and Comprehensive Benchmark for Visual Illusion Understanding in Vision-Language Models
by: Zhang, Yiming, et al.
Published: (2025)

RoMath: A Mathematical Reasoning Benchmark in Romanian
by: Cosma, Adrian, et al.
Published: (2024)

The Illusion of Progress? A Critical Look at Test-Time Adaptation for Vision-Language Models
by: Sheng, Lijun, et al.
Published: (2025)

Evaluating Model Perception of Color Illusions in Photorealistic Scenes
by: Mao, Lingjun, et al.
Published: (2024)

Abstract 3D Perception for Spatial Intelligence in Vision-Language Models
by: Liu, Yifan, et al.
Published: (2025)

Dr.Copilot: A Multi-Agent Prompt Optimized Assistant for Improving Patient-Doctor Communication in Romanian
by: Niculae, Andrei, et al.
Published: (2025)

Seeing the Evidence, Missing the Answer: Tool-Guided Vision-Language Models on Visual Illusions
by: Wang, Xuesong, et al.
Published: (2026)

Image Complexity-Aware Adaptive Retrieval for Efficient Vision-Language Models
by: Williams-Lekuona, Mikel, et al.
Published: (2025)

Do Large Vision-Language Models Distinguish between the Actual and Apparent Features of Illusions?
by: Shinozaki, Taiga, et al.
Published: (2025)

SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning
by: Liu, Yang, et al.
Published: (2025)

Illusions in Humans and AI: How Visual Perception Aligns and Diverges
by: Yang, Jianyi, et al.
Published: (2025)

HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models
by: Guan, Tianrui, et al.
Published: (2023)

Perceptio: Perception Enhanced Vision Language Models via Spatial Token Generation
by: Li, Yuchen, et al.
Published: (2026)

The Spatial Blindspot of Vision-Language Models
by: Alam, Nahid, et al.
Published: (2026)

SpatialBot: Precise Spatial Understanding with Vision Language Models
by: Cai, Wenxiao, et al.
Published: (2024)

SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models
by: Cheng, An-Chieh, et al.
Published: (2024)

From Illusion to Intention: Visual Rationale Learning for Vision-Language Reasoning
by: Wang, Changpeng, et al.
Published: (2025)

Spatial-VLN: Zero-Shot Vision-and-Language Navigation With Explicit Spatial Perception and Exploration
by: Yue, Lu, et al.
Published: (2026)

Same or Not? Enhancing Visual Perception in Vision-Language Models
by: Marsili, Damiano, et al.
Published: (2025)

SpatiO: Adaptive Test-Time Orchestration of Vision-Language Agents for Spatial Reasoning
by: Hwang, Chan Yeong, et al.
Published: (2026)

GS-Bias: Global-Spatial Bias Learner for Single-Image Test-Time Adaptation of Vision-Language Models
by: Huang, Zhaohong, et al.
Published: (2025)

Test-Time Consistency in Vision Language Models
by: Chou, Shih-Han, et al.
Published: (2025)

CS-Mixer: A Cross-Scale Vision MLP Model with Spatial-Channel Mixing
by: Cui, Jonathan, et al.
Published: (2023)

The Illusion of Clinical Reasoning: A Benchmark Reveals the Pervasive Gap in Vision-Language Models for Clinical Competency
by: Wang, Dingyu, et al.
Published: (2025)

Spatial-aware Vision Language Model for Autonomous Driving
by: Wei, Weijie, et al.
Published: (2025)