NTRS - NASA Technical Reports Server

A Robust Machine Learning Schema for Developing, Maintaining, and Disseminating Machine Learning Models

Recent advances in the development of machine learning (ML) algorithms have enabled the creation
of predictive models that can improve decision making, decrease computational cost, and improve
efficiency in a variety of fields. As an organization begins to develop and implement such models, the
data used in the training, validation, and testing of ML models, the model parameters, and the use cases or limitations of the models must be properly stored to ensure models are both fully traceable and used correctly. In the context of predicting material behavior, advances in computationally intense, physics-based modeling of material behavior at various length scales and the emergence of Integrated
Computational Materials Engineering (ICME) have driven the need for developing data-driven surrogate
models of the physics-based simulation tools using ML techniques. A surrogate model predicts material behavior accurately at a fraction of the cost of its physics-based counterpart, which makes multiscale simulations of real-world applications feasible and, in turn, allows fit-for-purpose materials to be designed for a reasonable computational investment. However, training such models requires extensive data, so effective data management is necessary for ML to reach its full potential in material design and ICME.

This paper proposes a generalized, robust schema that allows organizations to store, within the Granta MI Platform, both the real
(experimental) and virtual (simulation) data used to train ML models and the parameters
and architectures that define those models. The developed schema allows for various types of data
inputs and outputs, including single point values, time-series data, and images that can be used in the
prediction of material behavior, while following outlined best practices for effective data management.
An effective schema for ML data and models can help prevent the recreation of virtual/real training data
and surrogate models, help reduce the time to create new models similar to existing ones by offering a
starting point in the hyperparameter determination stages, minimize the resources devoted to verification and validation (V&V) and certification of models, and, because both the data and the ML model are fully traceable, ensure that neither is misused. It also gives organizations access to models that have already been developed so that they can be used in the design of new materials, supporting the overall goals of ICME.
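
As a rough illustration of the traceability idea described above, the following Python sketch links a model record to the datasets it was trained on. It is not the schema from the memorandum and does not use any Granta MI API. Every class, field, and identifier (`FieldKind`, `TrainingDataset`, `ModelRecord`, `Registry`, "DS-001", "SM-001", and the example hyperparameter values) is a hypothetical name chosen for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class FieldKind(Enum):
    """Kinds of input/output data named in the abstract."""
    POINT = "single point value"
    SERIES = "time series"
    IMAGE = "image"


class DataOrigin(Enum):
    """Whether a dataset is real (experimental) or virtual (simulation)."""
    REAL = "experimental"
    VIRTUAL = "simulation"


@dataclass(frozen=True)
class DataField:
    name: str
    kind: FieldKind


@dataclass
class TrainingDataset:
    dataset_id: str
    origin: DataOrigin
    inputs: list[DataField]
    outputs: list[DataField]


@dataclass
class ModelRecord:
    model_id: str
    architecture: str
    hyperparameters: dict[str, float]
    dataset_ids: list[str]          # links back to the training data
    intended_use: str
    limitations: str = ""


@dataclass
class Registry:
    """Hypothetical in-memory stand-in for a database of records."""
    datasets: dict[str, TrainingDataset] = field(default_factory=dict)
    models: dict[str, ModelRecord] = field(default_factory=dict)

    def add_dataset(self, dataset: TrainingDataset) -> None:
        self.datasets[dataset.dataset_id] = dataset

    def add_model(self, model: ModelRecord) -> None:
        # Refuse a model whose training data is not on record, so every
        # stored model stays traceable to its data.
        missing = [d for d in model.dataset_ids if d not in self.datasets]
        if missing:
            raise KeyError(f"unknown dataset(s): {missing}")
        self.models[model.model_id] = model

    def trace(self, model_id: str) -> list[TrainingDataset]:
        """Return the datasets a given model was trained on."""
        return [self.datasets[d] for d in self.models[model_id].dataset_ids]


# Illustrative records only; the names and values are made up.
registry = Registry()
registry.add_dataset(TrainingDataset(
    dataset_id="DS-001",
    origin=DataOrigin.VIRTUAL,
    inputs=[DataField("strain_history", FieldKind.SERIES)],
    outputs=[DataField("stress_history", FieldKind.SERIES)],
))
registry.add_model(ModelRecord(
    model_id="SM-001",
    architecture="feed-forward network",
    hyperparameters={"learning_rate": 1e-3, "hidden_layers": 2},
    dataset_ids=["DS-001"],
    intended_use="surrogate for a physics-based simulation",
))
traced = registry.trace("SM-001")
```

In this sketch, rejecting a model whose datasets are not registered is one simple way to keep stored models traceable. A new model could likewise read the `hyperparameters` of an existing record as a starting point.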

Document ID: 20220017137
Acquisition Source: Glenn Research Center
Document Type: Technical Memorandum (TM)
Authors:
Brandon L. Hearley (Glenn Research Center, Cleveland, Ohio, United States)
Steven M. Arnold (Glenn Research Center, Cleveland, Ohio, United States)
Joshua Stuckner (Glenn Research Center, Cleveland, Ohio, United States)
Date Acquired: November 14, 2022
Publication Date: December 1, 2022
Subject Category: Computer Systems; Systems Analysis and Operations Research
Report/Patent Number: E-20086
Funding Number(s): WBS: 109492
Distribution Limits: Public
Copyright: Work of the US Gov. Public Use Permitted.
Technical Review: Single Expert