| Main Authors: | Xu, Yueming, Zhang, Jiahui, Huang, Ze, Chen, Yurui, Zhou, Yanpeng, Chen, Zhenyu, Yuan, Yu-Jie, Xia, Pengxiang, Huang, Guowei, Cai, Xinyue, Qi, Zhongang, Quan, Xingyue, Hao, Jianye, Xu, Hang, Zhang, Li |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.11952 |
Similar Items
4D-VLA: Spatiotemporal Vision-Language-Action Pretraining with Cross-Scene Calibration
by: Zhang, Jiahui, et al.
Published: (2025)
From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D
by: Zhang, Jiahui, et al.
Published: (2025)
UGG: Unified Generative Grasping
by: Lu, Jiaxin, et al.
Published: (2023)
Whole-Body Inverse Kinematics with Graph Diffusion
by: Huang, Helong, et al.
Published: (2026)
GraphCoT-VLA: A 3D Spatial-Aware Reasoning Vision-Language-Action Model for Robotic Manipulation with Ambiguous Instructions
by: Huang, Helong, et al.
Published: (2025)
ForceFlow: Learning to Feel and Act via Contact-Driven Flow Matching
by: Zhang, Shuoheng, et al.
Published: (2026)
UniStitch: Unifying Semantic and Geometric Features for Image Stitching
by: Mei, Yuan, et al.
Published: (2026)
UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting
by: Li, Haoyuan, et al.
Published: (2025)
RADAR: Revealing Asymmetric Development of Abilities in MLLM Pre-training
by: Nie, Yunshuang, et al.
Published: (2026)
UniChange: Unifying Change Detection with Multimodal Large Language Model
by: Zhang, Xu, et al.
Published: (2025)
UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning
by: Liu, Ye, et al.
Published: (2025)
UniDex: Rethinking Search Inverted Indexing with Unified Semantic Modeling
by: Li, Zan, et al.
Published: (2025)
EVLP: Learning Unified Embodied Vision-Language Planner with Reinforced Supervised Fine-Tuning
by: Cai, Xinyan, et al.
Published: (2025)
UniMesh: Unifying 3D Mesh Understanding and Generation
by: Huang, Peng, et al.
Published: (2026)
UniShield: An Adaptive Multi-Agent Framework for Unified Forgery Image Detection and Localization
by: Huang, Qing, et al.
Published: (2025)
UniCal: Unified Neural Sensor Calibration
by: Yang, Ze, et al.
Published: (2024)
UniHash: Unifying Pointwise and Pairwise Hashing Paradigms
by: Ma, Xiaoxu, et al.
Published: (2026)
Confidence Contours: Uncertainty-Aware Annotation for Medical Semantic Segmentation
by: Ye, Andre, et al.
Published: (2023)
UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding
by: Jiao, Yang, et al.
Published: (2025)
UniCom: Unified Multimodal Modeling via Compressed Continuous Semantic Representations
by: Zhao, Yaqi, et al.
Published: (2026)
SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning
by: Liu, Yuecheng, et al.
Published: (2025)
UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models
by: Jiang, Hong, et al.
Published: (2026)
UniHGKR: Unified Instruction-aware Heterogeneous Knowledge Retrievers
by: Min, Dehai, et al.
Published: (2024)
Uni-Animator: Towards Unified Visual Colorization
by: Chen, Xinyuan, et al.
Published: (2026)
UniVBench: Towards Unified Evaluation for Video Foundation Models
by: Wei, Jianhui, et al.
Published: (2026)
SemHiTok: A Unified Image Tokenizer via Semantic-Guided Hierarchical Codebook for Multimodal Understanding and Generation
by: Chen, Zisheng, et al.
Published: (2025)
Ru Single Atoms Integrated into Cobalt Oxide Spinel Structure with Interstitial Carbon for Enhanced Electrocatalytic Water Oxidation
by: Guowei Wang, et al.
Published: (2024)
UniRec: Unified Multimodal Encoding for LLM-Based Recommendations
by: Lei, Zijie, et al.
Published: (2026)
Geometric Phase-Driven Scattering Evolutions
by: Wang, Pengxiang, et al.
Published: (2024)
LaneGraph2Seq: Lane Topology Extraction with Language Model via Vertex-Edge Encoding and Connectivity Enhancement
by: Peng, Renyuan, et al.
Published: (2024)
UniADC: A Unified Framework for Anomaly Detection and Classification
by: Zhang, Ximiao, et al.
Published: (2025)
UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE
by: Liu, Zhenyu, et al.
Published: (2025)
Prompting CO₂ Electroreduction to Ethanol by Iron Group Metal Ion Dopants Induced Multi‐sites at the Interface of SnSe/SnSe₂ p–n Heterojunction
by: Xinyue Zheng, et al.
Published: (2024)
TBAC-UniImage: Unified Understanding and Generation by Ladder-Side Diffusion Tuning
by: Xu, Junzhe, et al.
Published: (2025)
Extending General Covariance: A Unified Framework for Electromagnetic Interactions and Geometric Unification
by: Xiulin Huang, et al.
Published: (2025)
UniAPO: Unified Multimodal Automated Prompt Optimization
by: Zhu, Qipeng, et al.
Published: (2025)
UniSearch: Rethinking Search System with a Unified Generative Architecture
by: Chen, Jiahui, et al.
Published: (2025)
UniSurgSAM: A Unified Promptable Model for Reliable Surgical Video Segmentation
by: Liu, Haofeng, et al.
Published: (2026)
UniCompress: Token Compression for Unified Vision-Language Understanding and Generation
by: Wang, Ziyao, et al.
Published: (2026)
AutoLayout: Closed-Loop Layout Synthesis via Slow-Fast Collaborative Reasoning
by: Chen, Weixing, et al.
Published: (2025)