Enabling the Discovery of Recurring Anomalies in Aerospace System Problem Reports using High-Dimensional Clustering Techniques
This paper describes the results of a significant research and development effort conducted at NASA Ames Research Center to develop new text mining techniques to discover anomalies in free-text reports regarding system health and safety of two aerospace systems. We discuss two problems of significant importance in the aviation industry. The first problem is that of automatic anomaly discovery about an aerospace system through the analysis of tens of thousands of free-text problem reports that are written about the system. The second problem that we address is that of automatic discovery of recurring anomalies, i.e., anomalies that may be described in different ways by different authors, at varying times and under varying conditions, but that are truly about the same part of the system. The intent of recurring anomaly identification is to determine project or system weakness or high-risk issues. The discovery of recurring anomalies is a key goal in building safe, reliable, and cost-effective aerospace systems. We address the anomaly discovery problem on thousands of free-text reports using two strategies: (1) as an unsupervised learning problem where an algorithm takes free-text reports as input and automatically groups them into different bins, where each bin corresponds to a different unknown anomaly category; and (2) as a supervised learning problem where the algorithm classifies the free-text reports into one of a number of known anomaly categories. We then discuss the application of these methods to the problem of discovering recurring anomalies. In fact the special nature of recurring anomalies (very small cluster sizes) requires incorporating new methods and measures to enhance the original approach for anomaly detection.
Document ID
20060016353
Acquisition Source
Ames Research Center
Document Type
Conference Paper
Authors
Srivastava, Ashok N. (NASA Ames Research Center Moffett Field, CA, United States)
Akella, Ram (California Univ. Santa Cruz, CA, United States)
Diev, Vesselin (California Univ. Santa Cruz, CA, United States)
Kumaresan, Sakthi Preethi (California Univ. Santa Cruz, CA, United States)
McIntosh, Dawn M. (NASA Ames Research Center Moffett Field, CA, United States)
Pontikakis, Emmanuel D. (California Univ. Santa Cruz, CA, United States)
Xu, Zuobing (California Univ. Santa Cruz, CA, United States)
Zhang, Yi (California Univ. Santa Cruz, CA, United States)
Date Acquired
August 23, 2013
Publication Date
January 1, 2006
Subject Category
Documentation And Information Science
Meeting Information
Meeting: IEEE Aerospace Conference
Location: Big Sky, MT
Country: United States
Start Date: March 4, 2006
End Date: March 11, 2006
Sponsors: Institute of Electrical and Electronics Engineers