Bibliographic Details
Main Authors: Jia, Zhe, Zhang, Xiaotian, Li, Junpeng
Format: Preprint
Published: 2026
Subjects: Machine Learning, Geophysics
Online Access: https://arxiv.org/abs/2601.06320
_version_ 1866909018935525376
author Jia, Zhe
Zhang, Xiaotian
Li, Junpeng
contents Inferring high-dimensional physical states from sparse, ad-hoc sensor arrays is a fundamental challenge across AI for Science and industrial IoT. Standard machine learning architectures struggle in these domains due to irregular, variable-cardinality sensor geometries and the profound sim-to-real distribution shift caused by unmodeled physical heterogeneities. To address these challenges, we propose Sensoformer, a set-attention framework integrated with Physics-Structured Domain Randomization (PSDR). By explicitly randomizing the underlying physical dynamics (e.g., propagation media, extreme noise, and network availability dropout) rather than just visual features, PSDR enforces the learning of domain-invariant physical operators. Using seismic source inversion as a rigorous real-world testbed, Sensoformer is pre-trained on 100,000 synthetic examples and evaluated on a highly complex real-world catalog. We demonstrate that Sensoformer achieves state-of-the-art precision and outperforms Message Passing Neural Networks (MPNNs) and Neural Operators (e.g., DeepONet), which struggle with extreme spatial sparsity and mixed-modality inputs. Furthermore, interpretability analysis reveals that the attention mechanism autonomously discovers optimal experimental design principles, dynamically prioritizing sparse orthogonal sensors to overcome information bottlenecks.
format Preprint
id arxiv_https___arxiv_org_abs_2601_06320
institution arXiv
publishDate 2026
record_format arxiv
title Sensoformer: Robust Sim-to-Real Inference on Variable-Geometry Sensor Sets via Physics-Structured Randomization
topic Machine Learning
Geophysics
url https://arxiv.org/abs/2601.06320
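
Illustrative note: the abstract describes attention over variable-cardinality sensor sets and randomized "network availability dropout". The sketch below is not the authors' architecture; it only illustrates, under simple assumptions, how a padded sensor set can be pooled with masked attention so that the result does not depend on sensor order or on padding, and how sensors might be dropped at random during training. The function names (masked_attention_pool, random_sensor_dropout), the single query vector, and the keep_prob parameter are all hypothetical.

```python
import numpy as np

def masked_attention_pool(features, mask, query):
    """Pool a zero-padded set of sensor features into a single vector.

    features: (n_max, d) array, one row per sensor slot (padding rows are zeros)
    mask:     (n_max,) boolean array, True where a real sensor is present
    query:    (d,) vector scoring each sensor (hypothetical learned parameter)
    """
    d = features.shape[1]
    scores = features @ query / np.sqrt(d)
    # Padding slots get -inf so they receive exactly zero attention weight.
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores[mask].max()  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()
    return weights @ features, weights

def random_sensor_dropout(mask, keep_prob, rng):
    """Randomly hide available sensors, always keeping at least one."""
    keep = rng.random(mask.shape) < keep_prob
    new_mask = mask & keep
    if not new_mask.any():
        # Assumption: an event with zero sensors is not a useful sample.
        new_mask[rng.choice(np.flatnonzero(mask))] = True
    return new_mask

rng = np.random.default_rng(0)
feats = np.zeros((5, 4))
feats[:3] = rng.normal(size=(3, 4))   # three real sensors, two padding slots
mask = np.array([True, True, True, False, False])
query = rng.normal(size=4)

pooled, weights = masked_attention_pool(feats, mask, query)
train_mask = random_sensor_dropout(mask, keep_prob=0.5, rng=rng)
```

Because the pooled vector is a weighted sum over the set, reordering the sensors (together with their mask entries) leaves it unchanged, which is the property that lets one model handle ad-hoc sensor geometries of varying size.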