NTRS - NASA Technical Reports Server

2022 Spring Internship Exit Presentation
As the National Aeronautics and Space Administration (NASA) and the Federal Aviation Administration (FAA) continue to digitize the air traffic management (ATM) domain, there is a recurring need for downstream natural language processing (NLP) tasks such as named entity recognition, text summarization, and classification. Although the NLP field offers a plethora of open-source pre-trained transformer models such as BERT, RoBERTa, XLNet, and GPT-3, these models are trained on general corpora and perform poorly on the domain-specific terminology and phraseology seen in ATM documents such as Notices to Airmen (NOTAMs) and Letters of Agreement (LoAs). Our proposed research objective is, first, to gather a large corpus of ATM-related documents, orders, notices, books, technical papers, conference papers, articles, and other miscellaneous sources of text data from the FAA, NASA, and accredited conference and publication societies. Second, we will collate and preprocess this data into a format understandable by our test transformer models. Third, we will set up training pipelines to train the RoBERTa model on its unsupervised training task, masked language modeling (MLM), using resources provided by the NASA Advanced Supercomputing (NAS) facility. Finally, these fine-tuned transformer models will be evaluated on the downstream NLP tasks mentioned above to show whether they are effective when working with ATM-related data. Once complete, the model could be released as open source on the Hugging Face website, where the rest of the ATM community can access and use it.
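For readers unfamiliar with the masked language modeling objective named in the abstract, the following is a minimal sketch of how one training example might be prepared. It is not the authors' pipeline. It assumes the masking scheme described in the BERT paper (about 15% of tokens selected; of those, 80% replaced by a mask token, 10% by a random token, 10% left unchanged) and uses whitespace tokenization for brevity, whereas RoBERTa itself uses byte-level BPE. The function name `mask_tokens`, the sample NOTAM-like string, and the toy vocabulary are all hypothetical.

```python
import random

MASK_TOKEN = "<mask>"  # mask token string used by RoBERTa tokenizers


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (masked_tokens, labels) for one MLM training example.

    labels[i] holds the original token at positions selected for
    prediction and None everywhere else.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected; the model is not scored here
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK_TOKEN          # 80%: replace with the mask token
        elif roll < 0.9:
            masked[i] = rng.choice(vocab)   # 10%: replace with a random token
        # remaining 10%: leave the original token in place
    return masked, labels


# Hypothetical NOTAM-like text, split on whitespace (an assumption; a real
# pipeline would use the model's own subword tokenizer).
notam = "RWY 10L/28R CLSD DUE TO CONSTRUCTION".split()
vocab = sorted(set(notam))
masked, labels = mask_tokens(notam, vocab, seed=0)
print(masked)
print(labels)
```

During training, the model would see `masked` as input and be scored only on positions where `labels` is not None. In practice, a library data collator would typically handle this step on token IDs rather than strings.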
Document ID
20220008130
Acquisition Source
Ames Research Center
Document Type
Presentation
Authors
Olivia He
(Universities Space Research Association Columbia, Maryland, United States)
Shreya Anand
(Universities Space Research Association Columbia, Maryland, United States)
Date Acquired
May 24, 2022
Subject Category
Aeronautics (General)
Documentation And Information Science
Meeting Information
Meeting: Intern Exit Presentation
Location: Moffett Field, CA
Country: US
Start Date: May 12, 2022
Sponsors: Ames Research Center
Funding Number(s)
WBS: 090265
CONTRACT_GRANT: NNX13AJ38A
Distribution Limits
Public
Copyright
Public Use Permitted.
Technical Review
NASA Technical Management
Keywords
NLP
ATM
AI/ML
AI
ML
Machine Learning
Artificial Intelligence
Natural Language Processing
BERT