Staff View: ALAS: Adaptive Long-Horizon Action Synthesis via Async-pathway Stream Disentanglement :: Library Catalog

Bibliographic Details
Main Authors:	Shen, Yutong; Liu, Hangxu; Zhang, Lei; Liu, Penghui; Liu, Yinqi; Yang, Liuxiang; Feng, Tongtong
Format:	Preprint
Published:	2026
Subjects:	Robotics
Online Access:	https://arxiv.org/abs/2604.20721

_version_	1866915949529006080
author	Shen, Yutong; Liu, Hangxu; Zhang, Lei; Liu, Penghui; Liu, Yinqi; Yang, Liuxiang; Feng, Tongtong
author_facet	Shen, Yutong; Liu, Hangxu; Zhang, Lei; Liu, Penghui; Liu, Yinqi; Yang, Liuxiang; Feng, Tongtong
contents	Long-Horizon (LH) tasks in Human-Scene Interaction (HSI) are complex multi-step tasks that require continuous planning, sequential decision-making, and extended execution across domains to achieve the final goal. However, existing methods rely heavily on skill chaining, which concatenates pre-trained subtasks while keeping environment observations and self-state tightly coupled. As a result, they cannot generalize to new combinations of environments and skills and fail to complete varied LH tasks across domains. To address this problem, this paper presents ALAS, a cross-domain learning framework for LH tasks via biologically inspired dual-stream disentanglement. Inspired by the brain's "where-what" dual pathway mechanism, ALAS comprises two core modules: i) an environment learning module for spatial understanding, which captures object functions, spatial relationships, and scene semantics, achieving cross-domain transfer through complete environment-self disentanglement; ii) a skill learning module for task execution, which processes self-state information including joint degrees of freedom and motor patterns, enabling cross-skill transfer through independent motor pattern encoding. We conducted extensive experiments on various LH tasks in HSI scenes. Compared with existing methods, ALAS improves the average subtask success rate by 23% and the average execution efficiency by 29%.
format	Preprint
id	arxiv_https___arxiv_org_abs_2604_20721
institution	arXiv
publishDate	2026
record_format	arxiv
title	ALAS: Adaptive Long-Horizon Action Synthesis via Async-pathway Stream Disentanglement
topic	Robotics
url	https://arxiv.org/abs/2604.20721

Similar Items