NTRS - NASA Technical Reports Server

Re-Organizing Earth Observation Data Storage to Support Temporal Analysis of Big Data

The Earth Observing System Data and Information System (EOSDIS) archives many datasets that are critical to understanding long-term variations in Earth science properties. Consequently, some of these are large, multi-decadal datasets. Yet the challenge in long time series analysis comes less from the sheer volume than from the data organization, which is typically one (or a small number of) time steps per file. The overhead of opening and inventorying complex, API-driven data formats such as Hierarchical Data Format (HDF) introduces a small latency at each time step, which nonetheless adds up for datasets with O(10^6) single-timestep files. Several approaches to reorganizing the data can reduce this overhead by an order of magnitude: pre-aggregating data along the time axis (time-chunking); storing the data in a highly distributed file system; or storing the data in distributed columnar databases. Storing a second copy of the data incurs extra cost, so selection criteria must be applied, driven by expected or actual usage by the end user community and balanced against that cost.
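The time-chunking idea in the abstract can be illustrated with a minimal sketch. This is not the presentation's implementation: the flat binary layout, the file names, the sizes (N_STEPS, N_CELLS), and all function names are assumptions made for illustration, and raw doubles stand in for a real format such as HDF. The sketch writes the same synthetic data twice, once as one file per time step and once pre-aggregated along the time axis, then counts how many files must be opened to read one grid cell's full time series.

```python
import os
import struct
import tempfile

N_STEPS = 200  # hypothetical number of time steps
N_CELLS = 16   # hypothetical number of grid cells per time step


def value(t, c):
    """Synthetic data value for time step t and grid cell c."""
    return float(t * 100 + c)


def write_per_timestep(root):
    """Typical archive layout: one file per time step, all cells inside."""
    for t in range(N_STEPS):
        path = os.path.join(root, f"step_{t:05d}.bin")
        with open(path, "wb") as f:
            f.write(struct.pack(f"<{N_CELLS}d",
                                *(value(t, c) for c in range(N_CELLS))))


def write_time_chunked(root):
    """Second copy, aggregated along time: one file per cell, all steps inside."""
    for c in range(N_CELLS):
        path = os.path.join(root, f"cell_{c:03d}.bin")
        with open(path, "wb") as f:
            f.write(struct.pack(f"<{N_STEPS}d",
                                *(value(t, c) for t in range(N_STEPS))))


def series_from_timestep_files(root, cell):
    """Read one cell's time series; every time step costs one file open."""
    series, opens = [], 0
    for t in range(N_STEPS):
        with open(os.path.join(root, f"step_{t:05d}.bin"), "rb") as f:
            opens += 1
            f.seek(cell * 8)  # each little-endian double is 8 bytes
            series.append(struct.unpack("<d", f.read(8))[0])
    return series, opens


def series_from_chunked_file(root, cell):
    """Read the same time series from the time-chunked copy: one file open."""
    with open(os.path.join(root, f"cell_{cell:03d}.bin"), "rb") as f:
        data = f.read()
    return list(struct.unpack(f"<{N_STEPS}d", data)), 1


with tempfile.TemporaryDirectory() as root:
    write_per_timestep(root)
    write_time_chunked(root)
    slow, slow_opens = series_from_timestep_files(root, cell=3)
    fast, fast_opens = series_from_chunked_file(root, cell=3)

print(f"per-timestep layout: {slow_opens} opens; time-chunked: {fast_opens} open")
```

Both layouts return identical values; only the number of file opens differs, and that per-open overhead is the cost the abstract identifies as dominant. The time-chunked files are a second copy of the data, which is the storage trade-off the abstract raises.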
Document ID
20170012171
Acquisition Source
Goddard Space Flight Center
Document Type
Presentation
Authors
Lynnes, Christopher
(NASA Goddard Space Flight Center Greenbelt, MD, United States)
Date Acquired
December 15, 2017
Publication Date
December 11, 2017
Subject Category
Computer Systems
Report/Patent Number
GSFC-E-DAA-TN49796
Meeting Information
Meeting: AGU Fall Meeting
Location: New Orleans, LA
Country: United States
Start Date: December 11, 2017
End Date: December 15, 2017
Sponsors: American Geophysical Union
Distribution Limits
Public
Copyright
Work of the US Gov. Public Use Permitted.
Keywords
data storage
big data
temporal analysis