| Main Authors: | Zhang, Jiahui, Chen, Yurui, Zhou, Yanpeng, Xu, Yueming, Huang, Ze, Mei, Jilin, Chen, Junhui, Yuan, Yu-Jie, Cai, Xinyue, Huang, Guowei, Quan, Xingyue, Xu, Hang, Zhang, Li |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.22976 |
Similar Items
4D-VLA: Spatiotemporal Vision-Language-Action Pretraining with Cross-Scene Calibration
by: Zhang, Jiahui, et al.
Published: (2025)
UniUGG: Unified 3D Understanding and Generation via Geometric-Semantic Encoding
by: Xu, Yueming, et al.
Published: (2025)
GraphCoT-VLA: A 3D Spatial-Aware Reasoning Vision-Language-Action Model for Robotic Manipulation with Ambiguous Instructions
by: Huang, Helong, et al.
Published: (2025)
Whole-Body Inverse Kinematics with Graph Diffusion
by: Huang, Helong, et al.
Published: (2026)
RADAR: Revealing Asymmetric Development of Abilities in MLLM Pre-training
by: Nie, Yunshuang, et al.
Published: (2026)
Beyond Flatlands: Unlocking Spatial Intelligence by Decoupling 3D Reasoning from Numerical Regression
by: Guo, Zhongbin, et al.
Published: (2025)
A Computer Vision Problem in Flatland
by: Agarwal, Sameer, et al.
Published: (2025)
Self-CriTeach: LLM Self-Teaching and Self-Critiquing for Improving Robotic Planning via Automated Domain Generation
by: Huang, Jinbang, et al.
Published: (2025)
Scale‐Invariant Waveguiding in Flatland
by: Zhixia Xu, et al.
Published: (2026)
Circular Isoptics in Flatland
by: Thomas, Alexander
Published: (2025)
Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving
by: Nie, Ming, et al.
Published: (2023)
LaneCorrect: Self-supervised Lane Detection
by: Nie, Ming, et al.
Published: (2024)
Can Vision-Language Models be a Good Guesser? Exploring VLMs for Times and Location Reasoning
by: Zhang, Gengyuan, et al.
Published: (2023)
ForceFlow: Learning to Feel and Act via Contact-Driven Flow Matching
by: Zhang, Shuoheng, et al.
Published: (2026)
Ru Single Atoms Integrated into Cobalt Oxide Spinel Structure with Interstitial Carbon for Enhanced Electrocatalytic Water Oxidation
by: Guowei Wang, et al.
Published: (2024)
Observation of Analog Flatland Cherenkov Radiations on Metasurfaces (Laser Photonics Rev. 18(2)/2024)
by: Zhixia Xu, et al.
Published: (2024)
SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning
by: Liu, Yuecheng, et al.
Published: (2025)
Prompting CO₂ Electroreduction to Ethanol by Iron Group Metal Ion Dopants Induced Multi‐sites at the Interface of SnSe/SnSe₂ p–n Heterojunction
by: Xinyue Zheng, et al.
Published: (2024)
ANCoEF: Asynchronous Neuromorphic Algorithm/Hardware Co-Exploration Framework with a Fully Asynchronous Simulator
by: Zhang, Jian, et al.
Published: (2024)
Frontispiece: Scale‐Invariant Waveguiding in Flatland (EXP2 1/2026)
by: Zhixia Xu, et al.
Published: (2026)
TradeTrap: Are LLM-based Trading Agents Truly Reliable and Faithful?
by: Yan, Lewen, et al.
Published: (2025)
Copy-Move Forgery Detection and Question Answering for Remote Sensing Image
by: Zhang, Ze, et al.
Published: (2024)
DeflareMamba: Hierarchical Vision Mamba for Contextually Consistent Lens Flare Removal
by: Huang, Yihang, et al.
Published: (2025)
Learning Global Hypothesis Space for Enhancing Synergistic Reasoning Chain
by: Zhang, Jiaquan, et al.
Published: (2026)
Think in Safety: Unveiling and Mitigating Safety Alignment Collapse in Multimodal Large Reasoning Model
by: Lou, Xinyue, et al.
Published: (2025)
Reinforcing Action Policies by Prophesying
by: Zhang, Jiahui, et al.
Published: (2025)
WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation
by: Lu, Jiachen, et al.
Published: (2023)
OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation
by: Jiang, Haochen, et al.
Published: (2024)
Mem2Ego: Empowering Vision-Language Models with Global-to-Ego Memory for Long-Horizon Embodied Navigation
by: Zhang, Lingfeng, et al.
Published: (2025)
Case Law Grounding: Using Precedents to Align Decision-Making for Humans and AI
by: Chen, Quan Ze, et al.
Published: (2023)
UNIT: Unifying Image and Text Recognition in One Vision Encoder
by: Zhu, Yi, et al.
Published: (2024)
Enhanced DSP Architecture for Small Floating‐Point Based Deep Learning Accelerators on FPGAs
by: Kuiming Ma, et al.
Published: (2026)
The Relationships Between Contingent Reward Leadership, Perceived Organizational Support, Knowledge‐Sharing Intention, Innovative Culture, and Kindergarten Teachers' Creative Teaching Performance: A Mediated Moderation Model
by: Maoyong Huang, et al.
Published: (2025)
EVLP: Learning Unified Embodied Vision-Language Planner with Reinforced Supervised Fine-Tuning
by: Cai, Xinyan, et al.
Published: (2025)
Purely Quadratic Non-Gaussianity from Tachyonic Instability: Primordial Black Holes and Scalar-Induced Gravitational Waves
by: Zhang, He-Xu, et al.
Published: (2026)
DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models
by: Zhu, Jie, et al.
Published: (2025)
Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents
by: Chen, Jun, et al.
Published: (2024)
A Multimodal Fusion Framework for Early Non‐Invasive Screening of Cognitive Impairment Using Language Digital Biomarkers
by: Jiahui Xu, et al.
Published: (2025)
Theoretical Insights into Line Graph Transformation on Graph Learning
by: Yang, Fan, et al.
Published: (2024)
SeePhys: Does Seeing Help Thinking? -- Benchmarking Vision-Based Physics Reasoning
by: Xiang, Kun, et al.
Published: (2025)