NTRS - NASA Technical Reports Server
Batch Effect Correction Methods for NASA GeneLab Transcriptomic Datasets
RNA sequencing (RNA-seq) data from space biology experiments promise to yield invaluable insights into the effects of spaceflight on terrestrial biology. However, sample numbers from each study are low due to limited crew availability, hardware, and space. To increase statistical power, spaceflight RNA-seq datasets from different missions are often aggregated together. However, this can introduce technical variation or "batch effects", often due to differences in sample handling, sample processing, and sequencing platforms. Several computational methods have been developed to correct for technical batch effects, thereby reducing their impact on true biological signals.
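To make the idea of a batch effect concrete, the following is a minimal sketch in Python, not one of the methods evaluated in the study. It assumes log-scale expression values in a samples-by-genes array and applies the simplest possible correction: per-batch, per-gene mean centering. The function name `center_by_batch`, the batch labels, and the simulated data are all hypothetical and used only for illustration.

```python
import numpy as np


def center_by_batch(log_expr, batches):
    """Remove per-batch, per-gene mean offsets from log-scale expression.

    log_expr: (samples x genes) array of log-scale expression values.
    batches:  length-samples array of batch labels.
    Returns a corrected copy in which every batch shares the overall
    per-gene mean of the input.
    """
    log_expr = np.asarray(log_expr, dtype=float)
    batches = np.asarray(batches)
    corrected = log_expr.copy()
    grand_mean = log_expr.mean(axis=0)
    for b in np.unique(batches):
        mask = batches == b
        # Subtract this batch's own per-gene mean.
        corrected[mask] -= log_expr[mask].mean(axis=0)
    # Restore the overall per-gene level so values stay on the original scale.
    return corrected + grand_mean


# Simulated example: two hypothetical missions, 4 samples each, 3 genes.
rng = np.random.default_rng(0)
batches = np.repeat(["mission_A", "mission_B"], 4)
expr = rng.normal(5.0, 0.5, size=(8, 3))
expr[batches == "mission_B"] += 2.0  # simulated technical shift in one batch

corrected = center_by_batch(expr, batches)
```

Mean centering also removes any real biological difference that happens to coincide with batch, which is one reason more careful methods such as those compared in this study exist.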
In this study, we combined 7 mouse liver RNA-seq datasets from NASA GeneLab (part of the NASA Open Science Data Repository) to evaluate several common batch effect correction methods (ComBat and ComBat-seq from the sva R package, and Median Polish, Empirical Bayes, and ANOVA from the MBatch R package). We quantitatively evaluated the ability of these methods to correct for technical batch variables in space biology RNA-seq data using the following criteria: BatchQC, principal component analysis, dispersion separability criterion, log fold change correlation, and differential gene expression analysis. Each batch variable / correction method combination was then assessed using a custom scoring approach to identify the optimal correction method for the combined dataset, by geometrically probing the space of all allowable scoring functions to yield an aggregate volume-based scoring measure.
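As a rough illustration of the principal component analysis criterion, the sketch below measures how much of the variance along each principal component is explained by batch membership, before and after a simple correction. This is a simplified stand-in under stated assumptions: it is not BatchQC, not the dispersion separability criterion, and not the study's scoring approach. The function names, the simulated data, and the use of per-batch mean centering as the "correction" are all hypothetical.

```python
import numpy as np


def pca_scores(x, n_components=2):
    """Project samples onto the top principal components via SVD."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def batch_variance_fraction(scores, batches):
    """Per-component ratio of between-batch to total sum of squares."""
    scores = np.asarray(scores, dtype=float)
    batches = np.asarray(batches)
    overall = scores.mean(axis=0)
    total = ((scores - overall) ** 2).sum(axis=0)
    between = np.zeros(scores.shape[1])
    for b in np.unique(batches):
        grp = scores[batches == b]
        between += len(grp) * (grp.mean(axis=0) - overall) ** 2
    return between / total


# Simulated data: 12 samples, 50 genes, two hypothetical batches.
rng = np.random.default_rng(1)
batches = np.repeat(["batch_1", "batch_2"], 6)
expr = rng.normal(0.0, 0.5, size=(12, 50))
expr[batches == "batch_2"] += 3.0  # simulated technical offset

# Stand-in correction: subtract each batch's per-gene mean.
centered = expr.copy()
for b in np.unique(batches):
    mask = batches == b
    centered[mask] -= centered[mask].mean(axis=0)

before = batch_variance_fraction(pca_scores(expr), batches)
after = batch_variance_fraction(pca_scores(centered), batches)
```

A value near 1 on the first component suggests that batch dominates the main axis of variation, while values near 0 after correction suggest the batch signal has been removed along those components. In practice such a metric would be read alongside the other criteria listed above rather than on its own.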
Finally, we describe how the GeneLab multi-study analysis and visualization portal will allow users to examine the presence or absence of batch effects using multiple metrics. Users who choose to perform batch effect correction can apply the scoring approach described here to identify the optimal correction method for their specific combined dataset prior to analysis.
Document ID
20230009504
Acquisition Source
Ames Research Center
Document Type
Presentation
Authors
Lauren M. Sanders
(Blue Marble Space Seattle, Washington, United States)
Hamed Chok
(Science Collaborator)
Finsam Samson
(Stanford University Stanford, California, United States)
Ana Uriarte Acuna
(Wyle (United States) El Segundo, California, United States)
San-huei Lai Polo
(Wyle (United States) El Segundo, California, United States)
Valery Boyko
(Wyle (United States) El Segundo, California, United States)
Yi-Chun Chen
(Wyle (United States) El Segundo, California, United States)
Marie Dinh
(Wyle (United States) El Segundo, California, United States)
Samrawit Gebre
(Wyle (United States) El Segundo, California, United States)
Jonathan M. Galazka
(Ames Research Center Mountain View, California, United States)
Sylvain V. Costes
(Ames Research Center Mountain View, California, United States)
Amanda M. Saravia-Butler
(Wyle (United States) El Segundo, California, United States)
Date Acquired
June 26, 2023
Subject Category
Space Sciences (General)
Meeting Information
Meeting: Annual Meeting of the American Society for Gravitational and Space Research
Location: Washington, DC
Country: US
Start Date: November 14, 2023
End Date: November 18, 2023
Sponsors: American Society for Gravitational and Space Research
Funding Number(s)
WBS: 719125.06.01.02.01.02
Distribution Limits
Public
Copyright
Public Use Permitted.
Technical Review
Single Expert