NTRS - NASA Technical Reports Server
Towards an Aviation Large Language Model by Fine-tuning and Evaluating Transformers

In the aviation domain, there are many applications for machine learning and artificial intelligence tools that utilize natural language. For example, there is a desire to identify commonalities in written safety reports, such as voluntary post-incident reports, or to create more accurate transcripts of air traffic management conversations. Another use case is the extraction of airspace procedures and constraints currently written in documents such as Letters of Agreement (LOAs), which is used as the evaluation case in this paper. These applications can benefit from state-of-the-art Natural Language Processing (NLP) techniques when adapted to the language and phraseology specific to the aviation domain. This paper evaluates the viability of transferring pre-trained large language models to the aviation domain by adapting transformer-based models using aviation datasets.
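The unsupervised adaptation described here relies on RoBERTa's masked language modeling (MLM) objective, in which tokens are randomly corrupted and the model learns to reconstruct them from domain text. The paper does not give its implementation details, so the sketch below only illustrates the standard RoBERTa-style dynamic masking scheme (15% of tokens selected; of those, 80% replaced with `<mask>`, 10% with a random token, 10% left unchanged). The token list, vocabulary, and example sentence are hypothetical.

```python
import random

MASK = "<mask>"
# Hypothetical stand-in vocabulary for the 10% random-replacement case.
VOCAB = ["clearance", "handoff", "altitude", "sector", "runway"]

def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """RoBERTa-style dynamic masking: each token is selected with
    probability mask_prob; a selected token becomes <mask> 80% of the
    time, a random vocabulary token 10% of the time, and stays
    unchanged 10% of the time. Returns the corrupted sequence and the
    per-position prediction targets (original token, or None when the
    position does not contribute to the MLM loss)."""
    rng = rng or random.Random(0)
    corrupted, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            targets.append(tok)  # model must predict the original token
            roll = rng.random()
            if roll < 0.8:
                corrupted.append(MASK)
            elif roll < 0.9:
                corrupted.append(rng.choice(VOCAB))
            else:
                corrupted.append(tok)
        else:
            targets.append(None)
            corrupted.append(tok)
    return corrupted, targets

# Hypothetical 'aviation English' line of the kind found in an LOA.
line = "aircraft entering sector 32 shall be at or below flight level 240".split()
masked, targets = mask_tokens(line, rng=random.Random(42))
```

Because the masking is re-sampled every epoch ("dynamic"), the model sees different corruptions of the same LOA text across passes, which is one of RoBERTa's departures from the original BERT recipe.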

This paper utilized two datasets to adapt a 'Robustly Optimized Bidirectional Encoder Representations from Transformers Approach' (RoBERTa) model and two downstream classification tasks to assess its performance. These datasets are all built upon Letters of Agreement, which are Federal Aviation Administration (FAA) documents that formalize airspace operations across the national airspace system. The first two datasets, used for the adaptation of RoBERTa to the aviation domain, were of different sizes in order to assess how many documents are needed for domain adaptation. They contain many examples of 'aviation English' with domain-specific terminology and phrasing, which serves as a representative basis for the unsupervised adaptation. A third dataset is a separate set of LOA documents with two sets of classification labels used for evaluation: one at the document level and one at the line level. These downstream evaluations allowed measurement of the improvement gained by adapting RoBERTa: accuracy increased by 4-6% on both tasks, and the F1 score on the class of interest increased by 4-8%.
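The evaluation reports overall accuracy and the F1 score on a single class of interest. As a reference for how those two metrics are computed, here is a minimal self-contained sketch; the label names ("constraint" vs. "other" for a line-level task) are hypothetical and not taken from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_for_class(y_true, y_pred, positive):
    """F1 on one class of interest: the harmonic mean of precision
    and recall computed for that label only."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical line-level labels: does a line state an airspace constraint?
y_true = ["constraint", "other", "constraint", "other", "constraint", "other"]
y_pred = ["constraint", "other", "other", "other", "constraint", "constraint"]
```

Reporting F1 on the class of interest, rather than accuracy alone, matters here because LOA lines that state constraints are typically a minority class, and accuracy can look high even when that class is poorly recovered.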
Document ID
20240007390
Acquisition Source
Ames Research Center
Document Type
Conference Paper
Authors
David Nielsen
(KBR (United States) Houston, Texas, United States)
Stephen S B Clarke
(Universities Space Research Association Columbia, United States)
Krishna M Kalyanam
(Ames Research Center Mountain View, United States)
Date Acquired
June 10, 2024
Subject Category
Air Transportation and Safety
Meeting Information
Meeting: 43rd Digital Avionics Systems Conference (DASC)
Location: San Diego, CA
Country: US
Start Date: September 29, 2024
End Date: October 3, 2024
Sponsors: Institute of Electrical and Electronics Engineers, American Institute of Aeronautics and Astronautics
Funding Number(s)
PROJECT: 031102
CONTRACT_GRANT: 80ARC018D0008
CONTRACT_GRANT: 80ARC020D0010
Distribution Limits
Public
Copyright
Public Use Permitted.
Technical Review
NASA Technical Management
Keywords
Air Traffic Management
ATM
Natural Language Processing
Large Language Models
RoBERTa
Fine-Tune