NTRS - NASA Technical Reports Server

Automated classification of scientific publications linked to GES DISC datasets

The data collections archived and distributed by the GES DISC NASA data center are widely used in Earth Science studies. As these collections are created, many research works are published about the collections and their algorithms, validations, and applications. Since GES DISC collects these publications and provides their citations to users, it is helpful to categorize them by how they relate to the datasets they are associated with. Specifically, a publication linked to a GES DISC dataset may use the dataset for applied research, describe the algorithm used to create it, validate it, or provide a general overview of the data collection. Currently, this categorization relies on simple manual labeling and, as such, may be amenable to automation. To approach this problem, we developed machine learning classifiers that predict the category a publication belongs to. We used manually labeled publications as training data for two supervised machine learning algorithms: Random Forest and Naive Bayes. We achieved classification accuracy substantially better than the baseline accuracy, thus greatly improving the efficiency of the internal analysis of publications.
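The record does not include the poster's implementation, features, or data, so the following is only a minimal sketch of the kind of supervised text classification the abstract describes. It assumes a bag-of-words multinomial Naive Bayes with add-one smoothing, written in plain Python, and compares it against a majority-class baseline. The training sentences are invented toy examples, the label names (`application`, `algorithm`, `validation`, `overview`) are hypothetical shorthand for the four roles named in the abstract, and the Random Forest classifier is omitted for brevity.

```python
import math
from collections import Counter, defaultdict

# Invented toy examples (NOT data from the poster); label names are hypothetical.
TRAIN = [
    ("we apply the precipitation product to study regional drought trends", "application"),
    ("the dataset is used to analyze aerosol impacts on monsoon rainfall", "application"),
    ("using the data we investigate ozone variability over the tropics", "application"),
    ("this paper describes the retrieval algorithm used to produce the product", "algorithm"),
    ("the algorithm theoretical basis for the retrieval is presented", "algorithm"),
    ("we validate the product against ground station measurements", "validation"),
    ("comparison with radiosonde observations is used for validation of the retrieval", "validation"),
    ("an overview of the data collection and its available products", "overview"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumed, deliberately simple choice)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # per-class word frequencies
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        best_label, best_score = None, -math.inf
        vocab_size = len(self.vocab)
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            denom = sum(self.word_counts[label].values()) + vocab_size
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def majority_baseline(labels):
    """Baseline that always predicts the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


texts = [t for t, _ in TRAIN]
labels = [y for _, y in TRAIN]
model = NaiveBayes().fit(texts, labels)
baseline_label = majority_baseline(labels)
baseline_acc = accuracy([baseline_label] * len(labels), labels)
```

In practice the comparison against the baseline would be made on held-out labeled publications rather than on the training set; the toy set here is too small to split meaningfully.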
Document ID
20210019014
Acquisition Source
Goddard Space Flight Center
Document Type
Poster
Authors
Rohan Dayal
(GSFC INTERNS)
Irina Gerasimov
(Adnet Systems (United States) Bethesda, Maryland, United States)
Armin Mehrabian
(ADNET SYSTEMS INC)
Jennifer Wei
(Goddard Space Flight Center Greenbelt, Maryland, United States)
Mohammad Khayat
(Adnet Systems (United States) Bethesda, Maryland, United States)
Andrey Savtchenko
(Adnet Systems (United States) Bethesda, Maryland, United States)
Date Acquired
July 22, 2021
Subject Category
Computer Programming And Software
Meeting Information
Meeting: Earth Science Information Partners (ESIP) Summer 2021 Conference
Location: Online
Country: US
Start Date: July 19, 2021
End Date: July 23, 2021
Sponsors: Federation of Earth Science Information Partners
Funding Number(s)
WBS: 656052.04.01.08.04
Distribution Limits
Public
Copyright
Public Use Permitted.
Technical Review
NASA Peer Committee