NTRS - NASA Technical Reports Server

Explainable Machine Learning for Ocean Worlds Biosignatures and Seawater Chemistry

Explainable machine learning (ML) methods are needed to support future missions to ocean worlds (OWs) such as Europa and Enceladus for biosignature detection and chemical characterization. Explainable ML methods allow black-box model predictions to be interpreted or explained. Such methods are essential for resource-constrained geochemical and astrobiological missions to OWs, whose environments are extremely remote. Additionally, the stakes of false predictions are high in the detection of an extraterrestrial biosignature, so ML predictions must include an explanation and an assessment of confidence and trustworthiness to reduce risk and protect mission resources.

For planetary exploration, mass spectrometry (MS) is a ubiquitous tool due to its accuracy and rich spectral data products that provide valuable information about planetary surfaces and subsurfaces, including quantification of elemental and isotopic composition. Elemental and isotopic information allows scientists to deduce crucial geochemical facts about planetary bodies, such as the history, origins, fate, and potential biological content of extraterrestrial rocks, volatiles, and liquid water. However, MS data products can be high-dimensional, large in terms of the computer memory required for storage, and difficult to process in large sample batches. Because fractionation of light carbon and oxygen isotopes is known to be indicative of microbial and photosynthetic life on Earth, this research focuses on developing data processing tools and ML algorithms for isotope ratio mass spectrometry (IRMS) measurements of volatile CO2 for future astrobiological investigations of OWs.
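
Isotope fractionation of this kind is conventionally reported in delta notation, the per mil deviation of a sample's isotope ratio from a reference standard: delta = (R_sample / R_standard - 1) x 1000. The following is a minimal sketch of that conversion only; it is not taken from the dissertation's data processing tools. The function names are hypothetical, and the 13C/12C reference ratio for the VPDB standard is a commonly cited approximate value assumed here for illustration.

```python
# Assumed 13C/12C ratio of the VPDB reference standard (commonly cited,
# approximate value; an assumption for this illustration).
R_VPDB_13C = 0.011180


def delta_permil(r_sample: float, r_standard: float) -> float:
    """Per mil deviation of a sample isotope ratio from a standard."""
    return (r_sample / r_standard - 1.0) * 1000.0


def ratio_from_delta(delta: float, r_standard: float) -> float:
    """Inverse of delta_permil: recover the isotope ratio from a delta value."""
    return r_standard * (1.0 + delta / 1000.0)


if __name__ == "__main__":
    # A sample depleted in 13C by 2.5 percent relative to the standard.
    depleted = 0.975 * R_VPDB_13C
    print(delta_permil(depleted, R_VPDB_13C))
```

A negative delta value means the sample is depleted in the heavy isotope relative to the standard; a positive value means it is enriched.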

This dissertation introduces novel ML methods for the prediction and explanation of biosignatures and seawater chemistry from laboratory-derived IRMS data with biotic and abiotic signatures, as well as from simulated data. The ML methods introduced in this dissertation include novel time-series feature construction and a novel distance metric approach with feature selection that results in a human-interpretable variable (feature) space. In addition, a novel local variable importance method is developed that provides a level of explanation for the prediction of a single sample. This local Nearest-neighbors Projected Distance Regression method (local-NPDR) can detect statistical interactions and can be used to diagnose potential false predictions. We use these methods, together with an interpretable biosignature network visualization, to explain predictions by biosignature and seawater chemistry models for OW analogue brine salt components, volatile CO2 concentration, pH, and ionic strength. A quality analysis/quality control (QA/QC) data processing tool is demonstrated in a simulated Enceladus mission concept to illustrate real-time experimental data processing for use in biosignature classification and seawater chemistry models. The local feature importance method is demonstrated on simulated and real geochemical isotopic data and will be demonstrated in the field. While primarily focused on ML applications for icy OWs, the explainable ML methods presented here may be applied to other scientific datasets for any number of planetary environments or analogues. The explainable ML methods presented here are expected to be practical tools that will increase trust for future autonomous planetary exploration.
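
The abstract does not specify how local-NPDR is computed, so the following is only a rough sketch of the general nearest-neighbor, projected-distance idea, not the dissertation's algorithm. It assumes a numeric outcome, a Manhattan metric for choosing neighbors, and a separate ordinary least-squares fit per feature; the function name, the choice of k, and the synthetic data are all hypothetical.

```python
import numpy as np


def local_projected_importance(X, y, i, k=10):
    """Per-feature scores for a single sample i (illustrative sketch only).

    For the k nearest neighbors of sample i, regress the absolute outcome
    difference |y_i - y_j| on the one-dimensional (projected) feature
    difference |x_ia - x_ja|, one feature a at a time, and return the slopes.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Manhattan distance from sample i to every sample; exclude i itself.
    dist = np.abs(X - X[i]).sum(axis=1)
    dist[i] = np.inf
    nbrs = np.argsort(dist)[:k]

    dX = np.abs(X[nbrs] - X[i])  # shape (k, p): projected feature differences
    dy = np.abs(y[nbrs] - y[i])  # shape (k,): outcome differences

    scores = np.empty(X.shape[1])
    for a in range(X.shape[1]):
        # Design matrix with an intercept and the single projected difference.
        A = np.column_stack([np.ones(len(nbrs)), dX[:, a]])
        coef, *_ = np.linalg.lstsq(A, dy, rcond=None)
        scores[a] = coef[1]
    return scores


if __name__ == "__main__":
    # Synthetic data: the outcome depends only on feature 0.
    rng = np.random.default_rng(0)
    X_demo = rng.normal(size=(200, 2))
    y_demo = 3.0 * X_demo[:, 0]
    print(local_projected_importance(X_demo, y_demo, i=0, k=20))
```

In this toy setup the outcome difference is exactly three times the feature-0 difference, so the slope for feature 0 recovers that coefficient. The score for the irrelevant feature is not guaranteed to be near zero in such a simple sketch, which is one reason a practical method would pair the slopes with a statistical test.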
Document ID
20250007012
Acquisition Source
Goddard Space Flight Center
Document Type
Thesis/Dissertation
Authors
Lily A Clough
(University of Tulsa, Tulsa, Oklahoma, United States)
Date Acquired
July 14, 2025
Publication Date
August 29, 2025
Publication Information
Publisher: ProQuest
Issue Publication Date: August 29, 2025
Subject Category
Exobiology
Geosciences (General)
Funding Number(s)
CONTRACT_GRANT: 80NSSC24M0109
Distribution Limits
Public
Copyright
Public Use Permitted.
Keywords
machine learning
feature importance
biosignatures