Aviation-Specific Large Language Model Fine-Tuning and LLM-as-a-Judge Evaluation

This paper presents a scalable approach to improve domain-specific understanding by developing an aviation-focused large language model (LLM). While recent LLM advancements, such as supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF), have improved general conversational abilities, these methods are costly and difficult to scale in aviation due to limited labeled data. To address these challenges, we applied a self-supervised fine-tuning approach to generate responses to aviation-related queries without requiring human-labeled question-answer pairs. During fine-tuning, we incorporated the LLM-as-a-judge framework to automatically identify the optimal combination of training parameters. Additionally, to evaluate model performance, we leveraged the same LLM-as-a-judge framework, using a larger LLM to assess the responses generated by both the base and fine-tuned models on aviation-specific questions. This methodology provides a scalable, automated, and high-quality solution for domain-specific language modeling in aviation.
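As a rough illustration of the LLM-as-a-judge evaluation step described above, the Python sketch below shows how a larger judge model might be prompted to compare responses from the base and fine-tuned models on an aviation question. It is a minimal sketch under stated assumptions: the model names and the query_model() helper are hypothetical stand-ins, not code from the paper.

# A minimal sketch of LLM-as-a-judge pairwise evaluation. The model names
# ("base-llm", "aviation-tuned-llm", "larger-judge-llm") and the query_model()
# helper are illustrative placeholders, not the paper's implementation.

def query_model(model_name: str, prompt: str) -> str:
    """Stub for an LLM call; replace with a real local or hosted inference call."""
    # Canned reply so the sketch runs end to end without any model access.
    return "B" if "grading" in prompt else f"[{model_name} response to: {prompt[:40]}...]"

JUDGE_PROMPT = (
    "You are grading answers to an aviation question.\n"
    "Question: {question}\n"
    "Answer A: {answer_a}\n"
    "Answer B: {answer_b}\n"
    "Reply with only 'A' or 'B' for the more accurate and complete answer."
)

def judge_pair(question: str, base_model: str, tuned_model: str, judge_model: str) -> str:
    """Ask a larger judge model which of two candidate answers is better."""
    answer_a = query_model(base_model, question)
    answer_b = query_model(tuned_model, question)
    verdict = query_model(judge_model, JUDGE_PROMPT.format(
        question=question, answer_a=answer_a, answer_b=answer_b))
    return verdict.strip()

if __name__ == "__main__":
    questions = ["What information does a NOTAM communicate to flight crews?"]
    wins = sum(
        judge_pair(q, "base-llm", "aviation-tuned-llm", "larger-judge-llm") == "B"
        for q in questions
    )
    print(f"Fine-tuned model preferred on {wins}/{len(questions)} questions")

The same pairwise-comparison pattern could also be run across candidate checkpoints to select among training parameter combinations, as the abstract describes.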
Document ID
20250005989
Acquisition Source
Ames Research Center
Document Type
Conference Paper
Authors
Kathleen Ge
(Ames Research Center, Mountain View, United States)
William J Coupe
(Ames Research Center, Mountain View, United States)
Date Acquired
June 6, 2025
Subject Category
Air Transportation and Safety
Meeting Information
Meeting: AIAA AVIATION Forum
Location: Las Vegas, NV
Country: US
Start Date: July 21, 2025
End Date: July 25, 2025
Sponsors: American Institute of Aeronautics and Astronautics
Funding Number(s)
PROJECT: 629660
Distribution Limits
Public
Copyright
Work of the US Gov. Public Use Permitted.
Technical Review
NASA Technical Management
Keywords
Self-Supervised Learning
LLM-as-a-Judge
Large Language Model