Bibliographic Details
Main Authors: Li, Wenbin; Wu, Jingling; Chen, Xiaoyong Lin. Jing; Chen, Cong
Format: Preprint
Published: 2026
Subjects: Artificial Intelligence; Computation and Language; Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2601.09105
_version_ 1866911378591186944
author Li, Wenbin
Wu, Jingling
Chen, Xiaoyong Lin. Jing
Chen, Cong
contents Civil aviation is a cornerstone of global transportation and commerce, and ensuring its safety, efficiency and customer satisfaction is paramount. Yet conventional Artificial Intelligence (AI) solutions in aviation remain siloed and narrow, focusing on isolated tasks or single modalities. They struggle to integrate heterogeneous data such as voice communications, radar tracks, sensor streams and textual reports, which limits situational awareness, adaptability, and real-time decision support. This paper introduces the vision of AviationLMM, a Large Multimodal foundation Model for civil aviation, designed to unify these heterogeneous data streams and enable understanding, reasoning, generation and agentic applications. We first identify the gaps between existing AI solutions and the requirements of civil aviation. We then describe a model architecture that ingests multimodal inputs such as air-ground voice, surveillance, on-board telemetry, video and structured texts, performs cross-modal alignment and fusion, and produces flexible outputs ranging from situation summaries and risk alerts to predictive diagnostics and multimodal incident reconstructions. To fully realize this vision, we identify key research opportunities, including data acquisition, alignment and fusion, pretraining, reasoning, trustworthiness, privacy, robustness to missing modalities, and synthetic scenario generation. By articulating the design and challenges of AviationLMM, we aim to advance civil aviation foundation models and catalyze coordinated research efforts toward an integrated, trustworthy and privacy-preserving aviation AI ecosystem.
format Preprint
id arxiv_https___arxiv_org_abs_2601_09105
institution arXiv
publishDate 2026
record_format arxiv
title AviationLMM: A Large Multimodal Foundation Model for Civil Aviation
topic Artificial Intelligence
Computation and Language
Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2601.09105