| Main Authors: | Li, Dingming, Zhao, Yingxiu, Cheng, Xinrui, Lin, Kangheng, Peng, Hongbo, Li, Hongxing, Wang, Zixuan, Dai, Yuhong, Li, Haodong, Wang, Jia, Shi, Yukang, Zhao, Liang, Sun, Jianjian, Ge, Zheng, Zhang, Xiangyu, Lu, Weiming, Xiao, Jun, Zhuang, Yueting, Shen, Yongliang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.14144 |
Similar Items
SpatialLadder: Progressive Training for Spatial Reasoning in Vision-Language Models
by: Li, Hongxing, et al.
Published: (2025)
ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models
by: Li, Dingming, et al.
Published: (2025)
Milestone-Guided Policy Learning for Long-Horizon Language Agents
by: Wang, Zixuan, et al.
Published: (2026)
GroundAct: Can LLM Agents Ground Actions in Environmental States?
by: Wang, Zixuan, et al.
Published: (2025)
Code-A1: Adversarial Evolving of Code LLM and Test LLM via Reinforcement Learning
by: Wang, Aozhe, et al.
Published: (2026)
Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization
by: Zhang, Wenqi, et al.
Published: (2024)
SpatialFusion: Endowing Unified Image Generation with Intrinsic 3D Geometric Awareness
by: Qiu, Haiyi, et al.
Published: (2026)
WebVR: Benchmarking Multimodal LLMs for WebPage Recreation from Videos via Human-Aligned Visual Rubrics
by: Dai, Yuhong, et al.
Published: (2026)
Seeing but Not Thinking: Routing Distraction in Multimodal Mixture-of-Experts
by: Xu, Haolei, et al.
Published: (2026)
Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow
by: Zhang, Wenqi, et al.
Published: (2023)
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
by: Zhang, Wenqi, et al.
Published: (2025)
Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives
by: Zhang, Wenqi, et al.
Published: (2024)
EvoEmpirBench: Dynamic Spatial Reasoning with Agent-ExpVer
by: Zhao, Pukun, et al.
Published: (2025)
Evo-0: Vision-Language-Action Model with Implicit Spatial Understanding
by: Lin, Tao, et al.
Published: (2025)
Self-Evolving Spatial Reasoning in Vision Language Models via Geometric Logic Consistency
by: Liu, Junming, et al.
Published: (2026)
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents
by: Tang, Fei, et al.
Published: (2026)
EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific Evaluations
by: Li, Jia, et al.
Published: (2024)
Automatic Instruction Evolving for Large Language Models
by: Zeng, Weihao, et al.
Published: (2024)
EvoTSE: Evolving Enrollment for Target Speaker Extraction
by: Liu, Zikai, et al.
Published: (2026)
Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning
by: Liu, Yuhong, et al.
Published: (2025)
TaskBench: Benchmarking Large Language Models for Task Automation
by: Shen, Yongliang, et al.
Published: (2023)
Slow Perception: Let's Perceive Geometric Figures Step-by-step
by: Wei, Haoran, et al.
Published: (2024)
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
by: Yuan, Yuqian, et al.
Published: (2024)
EvoWiki: Evaluating LLMs on Evolving Knowledge
by: Tang, Wei, et al.
Published: (2024)
RAFT-UP: Robust Alignment for Spatial Transcriptomics with Explicit Control of Spatial Distortion
by: Wu, Yaqi, et al.
Published: (2026)
Unhackable Temporal Rewarding for Scalable Video MLLMs
by: Yu, En, et al.
Published: (2025)
PerPO: Perceptual Preference Optimization via Discriminative Rewarding
by: Zhu, Zining, et al.
Published: (2025)
EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering
by: Xu, Haolei, et al.
Published: (2025)
Geometrically-Constrained Agent for Spatial Reasoning
by: Chen, Zeren, et al.
Published: (2025)
Structural-Temporal Coupling Anomaly Detection with Dynamic Graph Transformer
by: Zong, Chang, et al.
Published: (2025)
Let LRMs Break Free from Overthinking via Self-Braking Tuning
by: Zhao, Haoran, et al.
Published: (2025)
Mixed‐Mode Fracturing Characteristics of Asphalt Concrete at Low‐Temperature Considering Random Spatial Combinations of Aggregates and Voids
by: Mengzhang Chen, et al.
Published: (2025)
Hierarchical Budget Policy Optimization for Adaptive Reasoning
by: Lyu, Shangke, et al.
Published: (2025)
Neural Network-Assisted RIS Weight Optimization for Spatial Nulling in Distorted Reflector Antenna Systems
by: Li, Xinrui, et al.
Published: (2025)
TraceTrans: Translation and Spatial Tracing for Surgical Prediction
by: Luo, Xiyu, et al.
Published: (2025)
Reconstructing 4D Spatial Intelligence: A Survey
by: Cao, Yukang, et al.
Published: (2025)
Q-GeoMem: Question-Guided Geometric Memory for Video Spatial Reasoning
by: Gao, Xianqiang, et al.
Published: (2026)
EvoScene-VLA: Evolving Scene Beliefs Inside the Action Decoder for Chunked Robot Control
by: Zhang, Chushan, et al.
Published: (2026)
Spatial Blindness in Whole-Slide Multiple Instance Learning
by: Li, Xiangyu, et al.
Published: (2026)
EvoCodeBench: A Human-Performance Benchmark for Self-Evolving LLM-Driven Coding Systems
by: Zhang, Wentao, et al.
Published: (2026)