NTRS - NASA Technical Reports Server

Open-Source Data Engineering at NASA: CCMC's Approach to Managing Petabyte-Scale Heliophysics Data

The Community Coordinated Modeling Center (CCMC) at NASA Goddard Space Flight Center (GSFC) leads heliophysics research by providing open access to numerous models and their outputs. Our resources are available on demand and continuously updated with real-time data, covering Sun-Earth interactions across multiple domains. These domains include coronal, heliosphere, inner and global magnetosphere, ionosphere, thermosphere, and lower atmosphere interactions.

Operating in a hybrid environment, CCMC utilizes both self-owned hardware and Amazon Web Services (AWS) cloud infrastructure. Managing petabytes of data across multiple locations necessitates robust data engineering solutions.

To address this challenge, CCMC has adopted industry-standard and open-source tools. We use Apache Airflow as our primary data engineering platform, Python for scripting and data processing, and GitLab for version control and CI/CD. Additionally, we employ Kubernetes for containerized services, Grafana and Prometheus for metrics and monitoring, and Terraform and Puppet for reproducible infrastructure as code.

This presentation will discuss lessons learned from our data engineering experiences, platforms evaluated but found unsuitable for our scientific data requirements, and specific techniques developed to enhance data transfer speed and reliability.
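
The abstract does not describe the transfer techniques themselves. Purely as a generic illustration of what "reliability" can mean for file transfers, the sketch below copies a file, verifies it with a SHA-256 checksum, and retries with exponential backoff. It is not CCMC's method; the function names (`sha256_of`, `transfer_with_verification`), the `.part` suffix, and the retry parameters are all hypothetical and use only the Python standard library.

```python
import hashlib
import shutil
import time
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files are never fully loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_with_verification(
    src: Path, dst: Path, max_attempts: int = 3, base_delay: float = 1.0
) -> str:
    """Copy src to dst, verify the copy by checksum, and retry on failure.

    Hypothetical helper for illustration only. Returns the SHA-256 hex digest
    of the verified file.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    expected = sha256_of(src)
    tmp = dst.with_name(dst.name + ".part")  # assumed temporary-file naming
    for attempt in range(1, max_attempts + 1):
        try:
            shutil.copyfile(src, tmp)
            if sha256_of(tmp) != expected:
                raise OSError("checksum mismatch after copy")
            # Rename only after verification so readers never see a partial file.
            tmp.replace(dst)
            return expected
        except OSError:
            if attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
```

Writing to a temporary name and renaming after verification means a failed or interrupted attempt never leaves a truncated file at the destination path.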

By using these technologies effectively, CCMC continues to advance heliophysics research through efficient data management and open-access modeling.
Document ID: 20240015272
Acquisition Source: Goddard Space Flight Center
Document Type: Poster
Authors
Matthew Lesko (Community Coordinated Modeling Center, Greenbelt, United States)
Damian Barrous-Dume (Community Coordinated Modeling Center, Greenbelt, United States)
Masha Kuznetsova (Goddard Space Flight Center, Greenbelt, United States)
Polymnia Manessis (Adnet Systems (United States), Bethesda, Maryland, United States)
M Leila Mays (Goddard Space Flight Center, Greenbelt, United States)
Phil Poole (Adnet Systems (United States), Bethesda, Maryland, United States)
Karen Scheiber (Adnet Systems (United States), Bethesda, Maryland, United States)
Edgar Russell (Community Coordinated Modeling Center, Greenbelt, United States)
Tina Tsui (Community Coordinated Modeling Center, Greenbelt, United States)
Chinwe Didigu (Adnet Systems (United States), Bethesda, Maryland, United States)
Date Acquired: November 27, 2024
Subject Category: Documentation and Information Science; Space Sciences (General)
Meeting Information
Meeting: American Geophysical Union (AGU 2024)
Location: Washington, D.C.
Country: US
Start Date: December 9, 2024
End Date: December 13, 2024
Sponsors: American Geophysical Union
Funding Number(s)
CONTRACT_GRANT: 80GSFC20D0001
CONTRACT_GRANT: 80NSSC21M0180
CONTRACT_GRANT: 80GSFC23CA040
WBS: 382230.02.01.01.01.01
CONTRACT_GRANT: 80ARC018D0010
Distribution Limits: Public
Copyright: Use by or on behalf of the US Gov. Permitted.
Technical Review: NASA Peer Committee
Keywords: space weather