Bibliographic Details
Main Authors:	Rojas-Ordoñez, Sebastian; Segura Abarrategi, Mikel; Yarza, Irune; Zulueta, Ekaitz
Format:	Digital resource
Language:	English
Published:	Zenodo 2026
Subjects:	knowledge extraction; mobile robotics; edge AI; Large Language Models; robot navigation; prompt engineering; developer accessibility
Online Access:	https://doi.org/10.3390/make8020049
Table of Contents:

<p>Large Language Models are increasingly used for high-level robotic reasoning, yet their latency and stochasticity complicate their direct use in low-level control. Moreover, extracting actionable navigation cues from multimodal context incurs inference costs that are challenging for embedded platforms. We present a plug-and-play framework that augments a finite-state machine with asynchronous velocity suggestions generated by a Large Language Model, using an off-the-shelf DistilGPT-2 model running on-device on a Jetson AGX Orin. The system extracts task-relevant cues from the current context and applies a suggestion only if it passes deadline, schema, and kinematic validation, thereby preserving a deterministic 50 Hz control loop with a fallback path that completes in under 5 ms. We compare multiple Large Language Models for embedded robot control and quantify trade-offs among model size, inference time, and output validity. To assess whether the Large Language Models add value beyond signal processing, we include an ablation against a standard smoothing baseline; the results indicate that the Large Language Models contribute anticipatory, context-dependent adjustments that are not captured by filtering alone. In experiments in Gazebo and on a real TurtleBot3, the framework reduces the final position error from 0.246 m to 0.159 m and improves trajectory efficiency from 0.821 to 0.901 without increasing control-loop latency. Approximately 80% of the Large Language Model outputs pass validation and are applied. Overall, the framework reduces developer effort by enabling behavioral changes at the prompt level while maintaining interpretable, robust edge-based navigation.</p>
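
The validation gate described in the abstract (deadline, schema, and kinematic checks, with a deterministic fallback) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The JSON schema with `linear` and `angular` keys, the 0.5 s deadline, the velocity limits (chosen to resemble a TurtleBot3 Burger), and the names `Velocity`, `validate_suggestion`, and `select_command` are all assumptions made for illustration. The sketch also assumes that a valid suggestion simply replaces the finite-state machine command, which the abstract does not specify.

```python
import json
import math
from dataclasses import dataclass
from typing import Optional

# Assumed values for illustration only; the abstract does not state them.
MAX_LINEAR = 0.22          # m/s, assumed linear velocity limit
MAX_ANGULAR = 2.84         # rad/s, assumed angular velocity limit
DEFAULT_DEADLINE_S = 0.5   # hypothetical maximum age of a suggestion


@dataclass(frozen=True)
class Velocity:
    linear: float
    angular: float


def validate_suggestion(raw: str, age_s: float,
                        deadline_s: float = DEFAULT_DEADLINE_S) -> Optional[Velocity]:
    """Return a Velocity if the suggestion passes all three checks, else None."""
    # 1. Deadline check: discard suggestions that arrived too late.
    if age_s > deadline_s:
        return None

    # 2. Schema check: exactly two finite numeric fields (assumed schema).
    try:
        data = json.loads(raw)
    except (ValueError, TypeError):
        return None
    if not isinstance(data, dict) or set(data) != {"linear", "angular"}:
        return None
    values = []
    for key in ("linear", "angular"):
        value = data[key]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return None
        try:
            value = float(value)
        except OverflowError:
            return None
        if not math.isfinite(value):
            return None
        values.append(value)
    linear, angular = values

    # 3. Kinematic check: reject commands outside the assumed limits.
    if abs(linear) > MAX_LINEAR or abs(angular) > MAX_ANGULAR:
        return None
    return Velocity(linear, angular)


def select_command(fsm_cmd: Velocity, raw: Optional[str], age_s: float) -> Velocity:
    """Called once per control tick; never waits on the language model."""
    if raw is None:
        return fsm_cmd  # no suggestion available: deterministic fallback
    suggestion = validate_suggestion(raw, age_s)
    return suggestion if suggestion is not None else fsm_cmd
```

In this sketch the control loop only reads the most recent suggestion and its age, so a slow or malformed model output costs nothing more than a fallback to the finite-state machine command on that tick.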

Similar Items