NTRS - NASA Technical Reports Server

Transcriptomics Processing Pipelines for Space Biology: An Open Source and Consensus-Driven Approach
Transcriptomics holds significant value in elucidating the relationship between gene expression, experimental factors, biological factors, and various types of omics data. Enhancing our understanding of these connections is paramount for foundational biology, which plays a pivotal role in devising solutions for challenges pertinent to both space travel and terrestrial life.
The NASA GeneLab project, part of the Open Science Data Repository (OSDR.nasa.gov), seeks to accelerate space biology research by cataloging and democratizing ’omics data, including transcriptomics. Since raw omics data are largely inaccessible to non-bioinformaticians, GeneLab works with the scientific community via the Open Science Analysis Working Groups (AWGs) to develop standard processing pipelines that generate and publish processed data. Unlike raw data, processed data have greater immediate value to diverse users with varying technical backgrounds and computational capabilities. Standardizing processing workflows is essential to match the pace of raw data generation, ensure reproducibility, and produce standardized processed data that can be compared across datasets.
As of June 2023, transcriptomics studies comprise over half of GeneLab datasets hosted on the OSDR, including data from bulk RNA-seq and Affymetrix or Agilent 1-Channel DNA microarray assays. In collaboration with the AWGs, GeneLab developed consensus processing pipelines for these transcriptomics data types that include quality control, background correction (microarray only), and data normalization and quantification, culminating in the detection and annotation of differentially expressed genes. The work presented here describes Nextflow implementations of GeneLab’s consensus transcriptomics pipelines that automate and accelerate processing of these datasets. In addition to the core data processing, these workflows also include raw data staging and a robust verification and validation program that identifies errors in real time, stops additional downstream computation, and preserves computational resources. These workflows are used to generate GeneLab processed data hosted on the OSDR, and are publicly available as open source software for others to use at: https://github.com/nasa/GeneLab_Data_Processing.
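The verification-and-validation idea in the abstract, checking each step's output and halting before any downstream work runs, can be illustrated with a minimal Python sketch. This is not GeneLab's code: the actual workflows are written in Nextflow, and every name below (`run_pipeline`, `ValidationError`, `STAGES`, the toy `quality_control` and `normalize` steps, and the counts-per-million scaling used as a stand-in for normalization) is a hypothetical chosen only for illustration.

```python
from typing import Callable, List, Tuple


class ValidationError(Exception):
    """Raised when a stage's output fails its check (hypothetical name)."""

    def __init__(self, stage: str):
        super().__init__(f"validation failed after stage '{stage}'")
        self.stage = stage


# A stage is (name, step function, check on the step's output).
Stage = Tuple[str, Callable[[dict], dict], Callable[[dict], bool]]


def run_pipeline(data: dict, stages: List[Stage]) -> Tuple[dict, List[str]]:
    """Run stages in order; stop at the first failed check."""
    completed: List[str] = []
    for name, step, check in stages:
        data = step(data)
        if not check(data):
            # Fail fast: later stages never run, so no compute is wasted on them.
            raise ValidationError(name)
        completed.append(name)
    return data, completed


def quality_control(d: dict) -> dict:
    # Toy QC: drop samples with zero total reads.
    kept = {s: c for s, c in d["counts"].items() if sum(c) > 0}
    return {**d, "counts": kept}


def normalize(d: dict) -> dict:
    # Toy normalization: scale each sample to counts per million.
    cpm = {s: [x * 1e6 / sum(c) for x in c] for s, c in d["counts"].items()}
    return {**d, "cpm": cpm}


STAGES: List[Stage] = [
    # Assumed rule for this sketch: at least two samples must survive QC.
    ("quality_control", quality_control, lambda d: len(d["counts"]) >= 2),
    # Each normalized sample should sum to one million.
    ("normalize", normalize,
     lambda d: all(abs(sum(v) - 1e6) < 1e-3 for v in d["cpm"].values())),
]
```

Calling `run_pipeline({"counts": {"s1": [10, 30, 60], "s2": [0, 0, 0]}}, STAGES)` raises `ValidationError` after the quality-control stage, because only one sample survives, so the normalization step is never executed.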
Document ID
20230016091
Acquisition Source
Ames Research Center
Document Type
Poster
Authors
Jonathan Oribello
(Blue Marble Space Seattle, Washington, United States)
Amanda M Saravia-Butler
(KBR (United States) Houston, Texas, United States)
Lauren M Sanders
(Blue Marble Space Seattle, Washington, United States)
Michael D Lee
(KBR (United States) Houston, Texas, United States)
Samrawit Gebre
(KBR (United States) Houston, Texas, United States)
Sylvain V Costes
(Ames Research Center Mountain View, California, United States)
Date Acquired
November 6, 2023
Subject Category
Documentation and Information Science
Meeting Information
Meeting: Annual Meeting of the American Society for Gravitational and Space Research Conference
Location: Washington, DC
Country: US
Start Date: November 14, 2023
End Date: November 18, 2023
Sponsors: American Society for Gravitational and Space Research
Funding Number(s)
WBS: 719125.06.01.02.01.02
Distribution Limits
Public
Copyright
Public Use Permitted.
Technical Review
Single Expert