Bibliographic Details
Main Author: Tan, Kaizhen
Format: Preprint
Published: 2025
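Subjects: Machine Learning; Artificial Intelligence; Computer Vision and Pattern Recognition; Audio and Speech Processing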
Online Access: https://arxiv.org/abs/2509.10522
_version_ 1866911260674621440
author Tan, Kaizhen
author_facet Tan, Kaizhen
contents Air traffic controllers (ATCOs) issue high-intensity voice commands in dense airspace, where accurate workload modeling is critical for safety and efficiency. This paper proposes a multimodal deep learning framework that integrates structured data, trajectory sequences, and image features to estimate two key parameters in the ATCO command lifecycle: the time offset between a command and the resulting aircraft maneuver, and the command duration. A high-quality dataset was constructed, with maneuver points detected using sliding window and histogram-based methods. A CNN-Transformer ensemble model was developed for accurate, generalizable, and interpretable predictions. By linking trajectories to voice commands, this work offers the first model of its kind to support intelligent command generation and provides practical value for workload assessment, staffing, and scheduling.
format Preprint
id arxiv_https___arxiv_org_abs_2509_10522
institution arXiv
publishDate 2025
record_format arxiv
title Multimodal Deep Learning for ATCO Command Lifecycle Modeling and Workload Prediction
topic Machine Learning
Artificial Intelligence
Computer Vision and Pattern Recognition
Audio and Speech Processing
url https://arxiv.org/abs/2509.10522