NTRS - NASA Technical Reports Server

Empirical Analysis and Automated Classification of Security Bug Reports

With the ever-expanding amount of sensitive data being placed into computer systems, the need for effective cybersecurity is of utmost importance. However, there is a shortage of detailed empirical studies of security vulnerabilities from which cybersecurity metrics and best practices could be determined. This thesis has two main research goals: (1) to explore the distribution and characteristics of security vulnerabilities based on the information provided in bug tracking systems and (2) to develop data analytics approaches for automatic classification of bug reports as security or non-security related. This work uses three NASA datasets as case studies. The empirical analysis showed that the majority of software vulnerabilities belong to only a small number of types. Addressing these types of vulnerabilities will consequently lead to cost-efficient improvement of software security. Since this analysis requires labeling each bug report in the bug tracking system, we explored using machine learning to automate the classification of each bug report as security related or non-security related (two-class classification), as well as of each security related bug report as a specific security type (multiclass classification). In addition to using supervised machine learning algorithms, a novel unsupervised machine learning approach is proposed. An accuracy of 92%, recall of 96%, precision of 92%, probability of false alarm of 4%, F-score of 81%, and G-score of 90% were the best results achieved during two-class classification. Furthermore, an accuracy of 80%, recall of 80%, precision of 94%, and F-score of 85% were the best results achieved during multiclass classification.
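The abstract reports several two-class metrics without defining them. As a rough illustration only, the sketch below computes them from confusion-matrix counts, treating "security related" as the positive class. The function name `two_class_metrics` and the example counts are hypothetical and not taken from the thesis. The G-score formula (harmonic mean of recall and 1 minus the probability of false alarm) is an assumed definition that the thesis may or may not use; the other formulas are the standard ones.

```python
def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def two_class_metrics(tp, fp, tn, fn):
    """Compute two-class metrics from confusion-matrix counts.

    tp, fn: security related reports classified correctly / missed
    tn, fp: non-security reports classified correctly / falsely flagged
    """
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    recall = safe_div(tp, tp + fn)        # share of security reports found
    precision = safe_div(tp, tp + fp)     # share of flagged reports that are security
    pfa = safe_div(fp, fp + tn)           # probability of false alarm
    f_score = safe_div(2 * precision * recall, precision + recall)
    # Assumed definition: harmonic mean of recall and (1 - pfa).
    g_score = safe_div(2 * recall * (1 - pfa), recall + (1 - pfa))
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "pfa": pfa,
        "f_score": f_score,
        "g_score": g_score,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the formulas.
    for name, value in two_class_metrics(tp=40, fp=10, tn=140, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Under these definitions the F-score ignores true negatives, while the G-score rewards both a high recall and a low false alarm rate. Each figure in the abstract is described as the best result achieved, so the figures may come from different experiments and need not satisfy these formulas jointly.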
Document ID
20160014477
Acquisition Source
Goddard Space Flight Center
Document Type
Thesis/Dissertation
Authors
Tyo, Jacob P.
(West Virginia Univ. Morgantown, WV, United States)
Date Acquired
December 6, 2016
Publication Date
January 1, 2016
Subject Category
Mathematical And Computer Sciences (General)
Report/Patent Number
GSFC-E-DAA-TN37712
Funding Number(s)
CONTRACT_GRANT: NNG12SA03C
Distribution Limits
Public
Copyright
Public Use Permitted.
Keywords
Cybersecurity