NTRS - NASA Technical Reports Server

Hadoop for High-Performance Climate Analytics: Use Cases and Lessons Learned

Scientific data services are a critical aspect of the mission of the NASA Center for Climate Simulation (NCCS). Hadoop, via MapReduce, provides an approach to high-performance analytics that is proving useful for data-intensive problems in climate research. It offers an analysis paradigm that uses clusters of computers and combines distributed storage of large data sets with parallel computation. The NCCS is particularly interested in the potential of Hadoop to speed up basic operations common to a wide range of analyses. To evaluate this potential, we prototyped a series of canonical MapReduce operations over a test suite of observational and climate simulation datasets. The initial focus was on averaging operations over arbitrary spatial and temporal extents within Modern-Era Retrospective Analysis for Research and Applications (MERRA) data. After preliminary results suggested that this approach improves efficiency within data-intensive analytic workflows, we invested in building a cyberinfrastructure resource for developing a new generation of climate data analysis capabilities using Hadoop. This resource is focused on reducing the time spent preparing reanalysis data for use in data-model intercomparison, a long-sought goal of the climate community. This paper summarizes the related use cases and lessons learned.
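The averaging operation mentioned in the abstract can be illustrated with a minimal, single-process sketch of the map and reduce steps. This is not the authors' implementation and does not use Hadoop itself. The record layout `(time_index, lat, lon, value)`, the `bounds` tuple, and the function names `map_record`, `reduce_pairs`, and `run_average` are all hypothetical, chosen only to show how an average over a spatial and temporal extent might be split into a map phase that emits partial sums and a reduce phase that combines them.

```python
from collections import defaultdict

# Assumed (hypothetical) record layout: (time_index, lat, lon, value).
# Assumed bounds layout: (t_min, t_max, lat_min, lat_max, lon_min, lon_max).

def map_record(record, bounds):
    """Map step: emit a (sum, count) pair for records inside the extent."""
    t, lat, lon, value = record
    t_min, t_max, lat_min, lat_max, lon_min, lon_max = bounds
    inside = (t_min <= t <= t_max
              and lat_min <= lat <= lat_max
              and lon_min <= lon <= lon_max)
    if inside:
        # A single key yields one overall mean; a per-month or per-cell key
        # would yield one mean per group instead.
        yield "mean", (value, 1)

def reduce_pairs(pairs):
    """Reduce step: combine partial (sum, count) pairs into one mean."""
    total, count = 0.0, 0
    for partial_sum, partial_count in pairs:
        total += partial_sum
        count += partial_count
    return total / count if count else None

def run_average(records, bounds):
    """Local stand-in for the shuffle phase: group mapped values by key."""
    groups = defaultdict(list)
    for record in records:
        for key, pair in map_record(record, bounds):
            groups[key].append(pair)
    return {key: reduce_pairs(pairs) for key, pairs in groups.items()}
```

Emitting `(sum, count)` pairs rather than raw values keeps the reduce step associative, so partial results from different nodes could be merged in any order.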
Document ID
20170000324
Acquisition Source
Goddard Space Flight Center
Document Type
Presentation
Authors
Tamkin, Glenn
(Computer Sciences Corp. Greenbelt, MD, United States)
Date Acquired
January 11, 2017
Publication Date
June 26, 2013
Subject Category
Computer Systems
Report/Patent Number
GSFC-E-DAA-TN10379
Meeting Information
Meeting: Hadoop Summit North America 2013
Location: San Jose, CA
Country: United States
Start Date: June 26, 2013
End Date: June 27, 2013
Sponsors: Hortonworks
Funding Number(s)
CONTRACT_GRANT: NNG13HQ01C
Distribution Limits
Public
Copyright
Public Use Permitted.
Keywords
analytics
HADOOP
climate