NTRS - NASA Technical Reports Server

Provenance in Data Interoperability for Multi-Sensor Intercomparison
As our inventory of Earth science data sets grows, the ability to compare, merge and fuse multiple data sets grows in importance. This requires a deeper data interoperability than we have now. Efforts such as the Open Geospatial Consortium and OPeNDAP (Open-source Project for a Network Data Access Protocol) have broken down format barriers to interoperability; the next challenge is the semantic aspects of the data. Consider the issues when satellite data are merged, cross-calibrated, validated, inter-compared and fused. We must match up data sets that are related, yet different in significant ways: the phenomenon being measured, the measurement technique, the location in space-time or the quality of the measurements. If subtle distinctions between similar measurements are not clear to the user, results can be meaningless or lead to an incorrect interpretation of the data. Most of these distinctions trace to how the data came to be: sensors, processing and quality assessment. For example, monthly averages of satellite-based aerosol measurements often show significant discrepancies, which might be due to differences in spatio-temporal aggregation, sampling issues, sensor biases, algorithm differences or calibration issues. Provenance information must be captured in a semantic framework that allows data inter-use tools to incorporate it and aid in the interpretation of comparison or merged products. Semantic web technology allows us to encode our knowledge of measurement characteristics, phenomena measured, space-time representation, and data quality attributes in a well-structured, machine-readable ontology and rulesets. An analysis tool can use this knowledge to show users the provenance-related distinctions between two variables, advising on options for further data processing and analysis. An additional problem for workflows distributed across heterogeneous systems is retrieval and transport of provenance.
Provenance may be either embedded within the data payload or transmitted from server to client through an out-of-band mechanism. The out-of-band mechanism is more flexible in the richness of provenance information that can be accommodated, but it relies on a persistent framework and can be difficult for legacy clients to use. We are prototyping the embedded model, incorporating provenance within metadata objects in the data payload, so that it always remains with the data. The downside is a limit on the size of provenance metadata that we can include, an issue that will eventually need resolution to encompass the richness of provenance information required for data intercomparison and merging.
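
The embedded model and the provenance comparison described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' prototype: the function names, the `provenance` attribute, the 512-byte cap, the use of JSON in place of a semantic-web ontology, and the placeholder sensor and variable names are all hypothetical.

```python
import json

# Hypothetical cap on embedded provenance size in bytes (illustrative value only).
MAX_PROVENANCE_BYTES = 512


def embed_provenance(metadata, provenance, limit=MAX_PROVENANCE_BYTES):
    """Return a copy of the metadata with provenance serialized into one attribute."""
    encoded = json.dumps(provenance, sort_keys=True)
    # The embedded model travels with the data, so its size is constrained.
    if len(encoded.encode("utf-8")) > limit:
        raise ValueError("provenance exceeds the embedded size limit")
    enriched = dict(metadata)
    enriched["provenance"] = encoded
    return enriched


def provenance_differences(meta_a, meta_b):
    """Map each provenance key on which two variables differ to its pair of values."""
    prov_a = json.loads(meta_a["provenance"])
    prov_b = json.loads(meta_b["provenance"])
    keys = sorted(set(prov_a) | set(prov_b))
    return {
        key: (prov_a.get(key), prov_b.get(key))
        for key in keys
        if prov_a.get(key) != prov_b.get(key)
    }


# Two placeholder aerosol variables; names and values are invented for the sketch.
var_a = embed_provenance(
    {"name": "aerosol_optical_depth_a"},
    {
        "sensor": "sensor_A",
        "temporal_aggregation": "monthly mean of daily means",
        "quality_filter": "best quality only",
    },
)
var_b = embed_provenance(
    {"name": "aerosol_optical_depth_b"},
    {
        "sensor": "sensor_B",
        "temporal_aggregation": "monthly mean of daily means",
        "quality_filter": "all retrievals",
    },
)

differences = provenance_differences(var_a, var_b)
for key, (value_a, value_b) in differences.items():
    print(f"{key}: {value_a!r} vs {value_b!r}")
```

In this sketch a client needs only the data payload to discover that the two variables differ in sensor and quality filtering, but any provenance record larger than the cap is rejected, which is the trade-off the embedded model accepts.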
Document ID
20090005968
Acquisition Source
Goddard Space Flight Center
Document Type
Presentation
Authors
Lynnes, Chris
(NASA Goddard Space Flight Center Greenbelt, MD, United States)
Leptoukh, Greg
(NASA Goddard Space Flight Center Greenbelt, MD, United States)
Berrick, Steve
(NASA Goddard Space Flight Center Greenbelt, MD, United States)
Shen, Suhung
(NASA Goddard Space Flight Center Greenbelt, MD, United States)
Prados, Ana
(NASA Goddard Space Flight Center Greenbelt, MD, United States)
Fox, Peter
(George Mason Univ. Greenbelt, MD, United States)
Yang, Wenli
(George Mason Univ. Greenbelt, MD, United States)
Min, Min
(NASA Goddard Space Flight Center Greenbelt, MD, United States)
Holloway, Dan
(OPeNDAP, Inc. Narragansett, RI, United States)
Enloe, Yonsook
(SGT, Inc. Greenbelt, MD, United States)
Date Acquired
August 24, 2013
Publication Date
December 15, 2008
Subject Category
Computer Programming And Software
Meeting Information
Meeting: American Geophysical Union Meeting
Location: San Francisco, CA
Country: United States
Start Date: December 15, 2008
End Date: December 19, 2008
Distribution Limits
Public
Copyright
Public Use Permitted.