Date of Award
Fall 2023
Access Type
Thesis - Open Access
Degree Name
Master of Science in Electrical & Computer Engineering
Department
Electrical Engineering and Computer Science
Committee Chair
Jianhua Liu
First Committee Member
Andrew Schneider
Second Committee Member
Prashant Shekhar
College Dean
James Gregory
Abstract
With recent advances in machine learning and deep learning technologies and the creation of larger aviation-specific corpora, applying natural language processing technologies, especially those based on transformer neural networks, to aviation communications is becoming increasingly feasible. Previous work has focused on machine learning applications to natural language processing, such as N-grams and word lattices. This thesis experiments with a process for pretraining transformer-based language models on aviation English corpora and compares the effectiveness and performance of language models fine-tuned from pretrained checkpoints (transfer learning) with those trained from their base weight initializations (trained from scratch). The results suggest that transformer language models trained from scratch outperform models fine-tuned from pretrained checkpoints. The thesis concludes with recommendations for future work to improve pretraining performance and suggestions for downstream, in-domain tasks such as semantic extraction, named entity recognition (callsign identification), speaker role identification, and speech recognition.
Scholarly Commons Citation
Van De Brook, Aaron, "Spoken Language Processing and Modeling for Aviation Communications" (2023). Doctoral Dissertations and Master's Theses. 788.
https://commons.erau.edu/edt/788
Included in
Artificial Intelligence and Robotics Commons, Data Science Commons, Signal Processing Commons, Systems and Communications Commons