Bibliographic Details
Main Authors: Liang, Chaohua; Matsushima, Jun
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2601.14607
Summary:
  • This letter proposes a physics-aware multi-modal contrastive learning framework designed to transform complex seismic wavefields into human-readable physical representations. Traditional data-driven inversion methods often focus on pixel-wise mapping, which lacks physical grounding and interpretability. To address this, we introduce a novel framework that jointly aligns seismic shot gathers, subsurface velocity models, and explicit physical descriptors (e.g., mean velocity and gradients) in a shared latent space. By introducing these descriptors as a third modality, our approach encourages the learned embeddings to capture intrinsic geological semantics rather than superficial signal correlations. Experiments on the OpenFWI dataset demonstrate that the proposed method not only achieves robust seismic-to-velocity retrieval but also preserves meaningful physical semantics, enabling cross-modal inference of interpretable attributes. This representation-centric perspective provides a flexible foundation for expert-guided subsurface characterization.
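The three-way alignment described in the summary can be illustrated with a small sketch. The record does not give the authors' loss function, so the code below assumes a CLIP-style symmetric InfoNCE objective summed with equal weight over the three modality pairs; the function names (`physical_descriptors`, `info_nce`, `tri_modal_loss`), the temperature of 0.07, and the choice of descriptors (mean velocity plus mean vertical and horizontal gradients) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np


def physical_descriptors(velocity):
    """Summarize a 2-D velocity model (depth x offset) as a small vector.

    Hypothetical descriptor set: mean velocity, mean vertical gradient,
    mean horizontal gradient (all in grid units).
    """
    dz, dx = np.gradient(velocity)  # finite differences along axis 0, then axis 1
    return np.array([velocity.mean(), dz.mean(), dx.mean()])


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def info_nce(a, b, temperature=0.07):
    """Symmetric InfoNCE loss; row i of `a` is the positive for row i of `b`."""
    a, b = l2_normalize(a), l2_normalize(b)
    logits = a @ b.T / temperature

    def cross_entropy_on_diagonal(l):
        l = l - l.max(axis=1, keepdims=True)  # subtract row max for stability
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))


def tri_modal_loss(z_seis, z_vel, z_phys, temperature=0.07):
    """Sum of pairwise losses over seismic, velocity and descriptor embeddings.

    Equal weighting of the three pairs is an assumption of this sketch.
    """
    return (info_nce(z_seis, z_vel, temperature)
            + info_nce(z_seis, z_phys, temperature)
            + info_nce(z_vel, z_phys, temperature))


# Toy usage: a velocity model that increases by 10 per depth sample.
depth = np.arange(5, dtype=float)[:, None]
velocity = 1500.0 + 10.0 * depth + np.zeros((5, 4))
desc = physical_descriptors(velocity)

# Toy embeddings: matched rows should give a lower loss than mismatched rows.
z = np.eye(4)
aligned = tri_modal_loss(z, z, z)
shuffled = tri_modal_loss(z, np.roll(z, 1, axis=0), z)
```

In a real pipeline the three embedding batches would come from separate encoders for shot gathers, velocity models, and descriptor vectors; here they are toy arrays used only to show that matched rows score a lower loss than mismatched ones.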