| Main Authors: | Zhang, Antong; Qi, Han; Yang, Heng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Robotics |
| Online Access: | https://arxiv.org/abs/2605.08571 |
| _version_ | 1866917482972839936 |
|---|---|
| author | Zhang, Antong; Qi, Han; Yang, Heng |
| author_facet | Zhang, Antong; Qi, Han; Yang, Heng |
| contents | We introduce BEACON--Best-Effort Adaptation for Cross-Domain Co-Training--a theory-driven framework for training generative robot policies with abundant source demonstrations and limited target demonstrations. BEACON casts cross-domain co-training as a discrepancy-aware importance-reweighting problem, jointly learning a diffusion-based visuomotor policy and per-sample source weights that minimize an objective informed by target-domain generalization guarantees. To make best-effort adaptation practical for high-dimensional sequence policies, we develop scalable instance-level discrepancy estimators, stochastic alternating updates for policy and weights, and a multi-source extension that balances heterogeneous source domains. Across sim-to-sim, sim-to-real, and multi-source manipulation settings, BEACON improves robustness and data efficiency over target-only, fixed-ratio co-training, and feature-alignment baselines. Importantly, even without an explicit alignment objective, BEACON achieves feature alignment as an implicit result of discrepancy-aware cross-domain co-training. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2605_08571 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | BEACON: Cross-Domain Co-Training of Generative Robot Policies via Best-Effort Adaptation |
| topic | Robotics |
| url | https://arxiv.org/abs/2605.08571 |
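
The abstract in this record describes jointly learning a policy and per-sample source weights through alternating updates. The toy sketch below illustrates that general pattern only; it is not the BEACON objective or the authors' code. Everything in it is an assumption made for illustration: a scalar model `y = theta * x` with squared error stands in for the diffusion policy, the per-sample discrepancy values `disc` are supplied by hand rather than estimated, and the weight step uses an entropy-regularized softmax (temperature `tau`), a choice the record does not specify.

```python
import math


def alternating_reweighted_fit(src, tgt, disc, steps=1000, lr=0.05, lam=1.0, tau=0.1):
    """Toy alternating updates for a scalar model y = theta * x.

    src, tgt: lists of (x, y) pairs from the source and target domains.
    disc: hypothetical per-source-sample discrepancy estimates (given, not learned).
    Returns the fitted theta and the source weights q (non-negative, summing to 1).
    """
    n = len(src)
    theta = 0.0
    q = [1.0 / n] * n  # start from uniform source weights
    for _ in range(steps):
        # Model step: one gradient step on the q-weighted source loss
        # plus the mean target loss.
        g_src = sum(qi * 2.0 * (theta * x - y) * x for qi, (x, y) in zip(q, src))
        g_tgt = sum(2.0 * (theta * x - y) * x for x, y in tgt) / len(tgt)
        theta -= lr * (g_src + g_tgt)

        # Weight step: closed-form minimizer over the probability simplex of
        #   sum_i q_i * cost_i + tau * sum_i q_i * log(q_i),
        # which is a softmax of the negative per-sample cost.
        cost = [(theta * x - y) ** 2 + lam * d for (x, y), d in zip(src, disc)]
        c_min = min(cost)  # subtract the minimum for numerical stability
        w = [math.exp(-(c - c_min) / tau) for c in cost]
        total = sum(w)
        q = [wi / total for wi in w]
    return theta, q


# Illustrative data: the target follows y = 2x; half of the source matches it,
# the other half follows y = -2x and is flagged with a larger discrepancy.
grid = [i / 10 for i in range(-10, 11)]
src = [(x, 2.0 * x) for x in grid] + [(x, -2.0 * x) for x in grid]
disc = [0.0] * len(grid) + [1.0] * len(grid)
tgt = [(x, 2.0 * x) for x in (-1.0, -0.5, 0.5, 1.0)]

theta, q = alternating_reweighted_fit(src, tgt, disc)
print(f"theta = {theta:.3f}")
print(f"weight on mismatched source samples = {sum(q[len(grid):]):.2e}")
```

In this toy setup the mismatched source samples end up with almost no weight, so the fit is driven by the target data and the matching half of the source data. A real discrepancy estimator and policy loss would replace the hand-set `disc` and the squared error.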