Library Catalog

Bibliographic Details
Main Authors:	Chen, Yiye; Jian, Yanan; Dong, Xiaoyi; Cao, Shuxin; Wu, Jing; Vela, Patricio; Lundell, Benjamin E.; Chen, Dongdong
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning; Robotics
Online Access:	https://arxiv.org/abs/2602.05049

Similar Items

Schema-Guided Scene-Graph Reasoning based on Multi-Agent Large Language Model System
by: Chen, Yiye, et al.
Published: (2025)

Good Weights: Proactive, Adaptive Dead Reckoning Fusion for Continuous and Robust Visual SLAM
by: Du, Yanwei, et al.
Published: (2025)

VISTA: Generative Visual Imagination for Vision-and-Language Navigation
by: Huang, Yanjia, et al.
Published: (2025)

Efficient Iterative Proximal Variational Inference Motion Planning
by: Chang, Zinuo, et al.
Published: (2024)

GeomPrompt: Geometric Prompt Learning for RGB-D Semantic Segmentation Under Missing and Degraded Depth
by: Jaganathan, Krishna, et al.
Published: (2026)

VTLA: Vision-Tactile-Language-Action Model with Preference Learning for Insertion Manipulation
by: Zhang, Chaofan, et al.
Published: (2025)

VEGA: Visual Encoder Grounding Alignment for Spatially-Aware Vision-Language-Action Models
by: Wang, Hao, et al.
Published: (2026)

CAST: Counterfactual Labels Improve Instruction Following in Vision-Language-Action Models
by: Glossop, Catherine, et al.
Published: (2025)

QuadPiPS: A Perception-informed Footstep Planner for Quadrupeds With Semantic Affordance Prediction
by: Asselmeier, Max, et al.
Published: (2024)

Factor Graph-Based Shape Estimation for Continuum Robots via Magnus Expansion
by: Ticozzi, Lorenzo, et al.
Published: (2026)

VP-VLA: Visual Prompting as an Interface for Vision-Language-Action Models
by: Wang, Zixuan, et al.
Published: (2026)

DyGRO-VLA: Cross-Task Scaling of Vision-Language-Action Models via Dynamic Grouped Residual Optimization
by: Lin, Sixu, et al.
Published: (2026)

Reshaping Action Error Distributions for Reliable Vision-Language-Action Models
by: Bai, Shuanghao, et al.
Published: (2026)

HALO: A Unified Vision-Language-Action Model for Embodied Multimodal Chain-of-Thought Reasoning
by: Shou, Quanxin, et al.
Published: (2026)

DDGC: Generative Deep Dexterous Grasping in Clutter
by: Lundell, Jens, et al.
Published: (2021)

villa-X: Enhancing Latent Action Modeling in Vision-Language-Action Models
by: Chen, Xiaoyu, et al.
Published: (2025)

BagelVLA: Enhancing Long-Horizon Manipulation via Interleaved Vision-Language-Action Generation
by: Hu, Yucheng, et al.
Published: (2026)

VLMimic: Vision Language Models are Visual Imitation Learner for Fine-grained Actions
by: Chen, Guanyan, et al.
Published: (2024)

Point Tracking Improves World Action Models
by: Guan, Jiarui, et al.
Published: (2026)

Task-driven SLAM Benchmarking For Robot Navigation
by: Du, Yanwei, et al.
Published: (2024)

A Vision-Language-Action Model with Visual Prompt for OFF-Road Autonomous Driving
by: Zhang, Liangdong, et al.
Published: (2026)

Pushing Everything Everywhere All At Once: Probabilistic Prehensile Pushing
by: Perugini, Patrizio, et al.
Published: (2025)

CAPGrasp: An $\mathbb{R}^3\times \text{SO(2)-equivariant}$ Continuous Approach-Constrained Generative Grasp Sampler
by: Weng, Zehang, et al.
Published: (2023)

Preference-Conditioned Multi-Objective RL for Integrated Command Tracking and Force Compliance in Humanoid Locomotion
by: Leng, Tingxuan, et al.
Published: (2025)

OmniPose6D: Towards Short-Term Object Pose Tracking in Dynamic Scenes from Monocular RGB
by: Lin, Yunzhi, et al.
Published: (2024)

StereoVLA: Enhancing Vision-Language-Action Models with Stereo Vision
by: Deng, Shengliang, et al.
Published: (2025)

Safe Gap-based Planning in Dynamic Settings
by: Asselmeier, Max, et al.
Published: (2025)

Hierarchical Experience-informed Navigation for Multi-modal Quadrupedal Rebar Grid Traversal
by: Asselmeier, Max, et al.
Published: (2023)

AT-VLA: Adaptive Tactile Injection for Enhanced Feedback Reaction in Vision-Language-Action Models
by: Li, Xiaoqi, et al.
Published: (2026)

FlowVLA: Visual Chain of Thought-based Motion Reasoning for Vision-Language-Action Models
by: Zhong, Zhide, et al.
Published: (2025)

A Survey on Vision-Language-Action Models: An Action Tokenization Perspective
by: Zhong, Yifan, et al.
Published: (2025)

MMDVS-LF: Multi-Modal Dynamic Vision Sensor and Eye-Tracking Dataset for Line Following
by: Resch, Felix, et al.
Published: (2024)

LARY: A Latent Action Representation Yielding Benchmark for Generalizable Vision-to-Action Alignment
by: Nie, Dujun, et al.
Published: (2026)

Embodied Learning of Reward for Musculoskeletal Control with Vision Language Models
by: Soedarmadji, Saraswati, et al.
Published: (2025)

Health-Conditioned Vision-Language-Action Models for Malfunction-Aware Robot Control
by: Arslan, Hüseyin, et al.
Published: (2026)

Human-assisted Robotic Policy Refinement via Action Preference Optimization
by: Xia, Wenke, et al.
Published: (2025)

UAV-Track VLA: Embodied Aerial Tracking via Vision-Language-Action Models
by: Zhang, Qiyao, et al.
Published: (2026)

FLoRA: Sample-Efficient Preference-based RL via Low-Rank Style Adaptation of Reward Functions
by: Marta, Daniel, et al.
Published: (2025)

FocusVLA: Focused Visual Utilization for Vision-Language-Action Models
by: Zhang, Yichi, et al.
Published: (2026)

LaViRA: Language-Vision-Robot Actions Translation for Zero-Shot Vision Language Navigation in Continuous Environments
by: Ding, Hongyu, et al.
Published: (2025)