:: Library Catalog

Bibliographic Details
Main Authors:	Wu, Heran; Zhou, Zirun; Zhang, Jingfeng
Format:	Preprint
Published:	2025
Subjects:	Robotics
Online Access:	https://arxiv.org/abs/2508.06547

Similar Items

Mechanistic interpretability for steering vision-language-action models
by: Häon, Bear, et al.
Published: (2025)

FATE-VLA: Failure-aware test generation for vision-language-action models
by: Kanwal, Arusa, et al.
Published: (2026)

MoManipVLA: Transferring Vision-language-action Models for General Mobile Manipulation
by: Wu, Zhenyu, et al.
Published: (2025)

A vision-language model and platform for temporally mapping surgery from video
by: Kiyasseh, Dani
Published: (2026)

Purely vision-based collective movement of robots
by: Mezey, David, et al.
Published: (2024)

FlySearch: Exploring how vision-language models explore
by: Pardyl, Adam, et al.
Published: (2025)

Foundation models on the bridge: Semantic hazard detection and safety maneuvers for maritime autonomy with vision-language models
by: Christensen, Kim Alexander, et al.
Published: (2025)

Rethinking the Practicality of Vision-language-action Model: A Comprehensive Benchmark and An Improved Baseline
by: Song, Wenxuan, et al.
Published: (2026)

The active visual sensing methods for robotic welding: review, tutorial and prospect
by: Wang, ZhenZhou
Published: (2024)

Spatial Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model
by: Li, Fuhao, et al.
Published: (2025)

Openfly: A comprehensive platform for aerial vision-language navigation
by: Gao, Yunpeng, et al.
Published: (2025)

Memorized action chunking with Transformers: Imitation learning for vision-based tissue surface scanning
by: Yang, Bochen, et al.
Published: (2024)

Large language model-based task planning for service robots: A review
by: Bian, Shaohan, et al.
Published: (2025)

CottonSim: A vision-guided autonomous robotic system for cotton harvesting in Gazebo simulation
by: Thayananthan, Thevathayarajh, et al.
Published: (2025)

Assist-as-needed Hip Exoskeleton Control for Gait Asymmetry Correction via Human-in-the-loop Optimization
by: Qian, Yuepeng, et al.
Published: (2025)

One to rule them all: natural language to bind communication, perception and action
by: Colombani, Simone, et al.
Published: (2024)

Joint Moment Estimation for Hip Exoskeleton Control: A Generalized Moment Feature Generation Method
by: Zhang, Yuanwen, et al.
Published: (2024)

Concept-Based Dictionary Learning for Inference-Time Safety in Vision Language Action Models
by: Wen, Siqi, et al.
Published: (2026)

Using large language models for embodied planning introduces systematic safety risks
by: Zhang, Tao, et al.
Published: (2026)

Robots that learn to evaluate models of collective behavior
by: Hocke, Mathis, et al.
Published: (2026)

A vision-based robotic system for precision pollination of apples
by: Bhattarai, Uddhav, et al.
Published: (2024)

Training microrobots to swim by a large language model
by: Xu, Zhuoqun, et al.
Published: (2024)

Value-guided action planning with JEPA world models
by: Destrade, Matthieu, et al.
Published: (2025)

YOLOv10 with Kolmogorov-Arnold networks and vision-language foundation models for interpretable object detection and trustworthy multimodal AI in computer vision perception
by: Impraimakis, Marios, et al.
Published: (2026)

FlightBench: Benchmarking Learning-based Methods for Ego-vision-based Quadrotors Navigation
by: Yu, Shu-Ang, et al.
Published: (2024)

Sparsh: Self-supervised touch representations for vision-based tactile sensing
by: Higuera, Carolina, et al.
Published: (2024)

A transparency-based action model implemented in a robotic physical trainer for improved HRI
by: Naama, Aharony, et al.
Published: (2024)

The Better You Learn, The Smarter You Prune: Towards Efficient Vision-language-action Models via Differentiable Token Pruning
by: Jiang, Titong, et al.
Published: (2025)

Ontological grounding for sound and natural robot explanations via large language models
by: Olivares-Alarcos, Alberto, et al.
Published: (2026)

Two-stream network-driven vision-based tactile sensor for object feature extraction and fusion perception
by: Huang, Muxing, et al.
Published: (2025)

A physics-based sensor simulation environment for lunar ground operations
by: Batagoda, Nevindu M., et al.
Published: (2024)

A SysML-based language for evaluating the integrity of simulation and physical embodiments of Cyber-Physical systems
by: Dudek, Wojciech, et al.
Published: (2023)

PROSKILL: A formal skill language for acting in robotics
by: Ingrand, Félix
Published: (2024)

Biomechanically consistent real-time action recognition for human-robot interaction
by: Li, Wanchen, et al.
Published: (2025)

A physics-informed, vision-based method to reconstruct all deformation modes in slender bodies
by: Kim, Seung Hyun, et al.
Published: (2021)

Collision avoidance from monocular vision trained with novel view synthesis
by: Tordjman--Levavasseur, Valentin, et al.
Published: (2025)

CLUE: Crossmodal disambiguation via Language-vision Understanding with attEntion
by: Abrini, Mouad, et al.
Published: (2026)

Traversability analysis with vision and terrain probing for safe legged robot navigation
by: Haddeler, Garen, et al.
Published: (2022)

Bio-inspired reconfigurable stereo vision for robotics using omnidirectional cameras
by: Chen, Suchang, et al.
Published: (2024)

AIM: Intent-Aware Unified world action Modeling with Spatial Value Maps
by: Fan, Liaoyuan, et al.
Published: (2026)