NTRS - NASA Technical Reports Server

Harnessing Large Language Models for Scientific Endeavors

The rapid proliferation of Large Language Models (LLMs) such as GPT, Bard, and Llama has revolutionized various sectors, including the scientific community. Because these models can automate and augment tasks, they are increasingly recognized as both a valuable asset and a potential challenge for scientific research and data management. However, current LLMs are trained primarily on general corpora that contain little scientific text, so their understanding of scientific concepts and terminology is limited.

Recognizing this gap, several groups are now advocating for the development of LLMs specifically tailored for scientific applications. A notable initiative in this direction is the Large Language Model effort initiated by NASA's CSDO. This endeavor aims to align LLM efforts across NASA’s Science Mission Directorate, develop a science-specific corpus and validation test set for model training, and create an encoder-only model for various downstream tasks.

The initiative also plans to develop a decoder-only model to explore the potential benefits and risks of a generative LLM for science. Finally, the project aims to create a science evaluation suite, covering several categories of downstream scientific tasks, to serve as a benchmark for assessing the value of any LLM for future use.

This presentation will provide an overview and current status of this ongoing initiative, highlighting its potential to reshape the use of LLMs in the scientific domain.
Document ID
20230016512
Acquisition Source
Marshall Space Flight Center
Document Type
Presentation
Authors
Rahul Ramachandran
(National Aeronautics and Space Administration Washington D.C., District of Columbia, United States)
Manil Maskey
(Marshall Space Flight Center Redstone Arsenal, Alabama, United States)
Kaylin Bugbee
(Marshall Space Flight Center Redstone Arsenal, Alabama, United States)
Mike Little
(LS Technologies)
Elizabeth Fancher
(Barrios Technology Houston, Texas, United States)
Muthukumaran Ramasubramanian
(University of Alabama in Huntsville Huntsville, Alabama, United States)
Bishwaranjan Bhattacharjee
(IBM Research – Thomas J. Watson Research Center Yorktown Heights, New York, United States)
Raghu Ganti
(IBM Research – Thomas J. Watson Research Center Yorktown Heights, New York, United States)
Avi Sil
(IBM Research – Thomas J. Watson Research Center Yorktown Heights, New York, United States)
Lauren Sanders
(Blue Marble Space Seattle, Washington, United States)
Sylvain Costes
(Ames Research Center Mountain View, California, United States)
Sergio Blanco-Cuaresma
(Harvard-Smithsonian Center for Astrophysics Cambridge, Massachusetts, United States)
Kelly Lockhart
(Harvard-Smithsonian Center for Astrophysics Cambridge, Massachusetts, United States)
Thomas Allen
(Center for Astrophysics Harvard & Smithsonian Cambridge, Massachusetts, United States)
Felix Grezes
(Center for Astrophysics Harvard & Smithsonian Cambridge, Massachusetts, United States)
Megan Ansdell
(National Aeronautics and Space Administration Washington D.C., District of Columbia, United States)
Alberto Accomazzi
(Harvard-Smithsonian Center for Astrophysics Cambridge, Massachusetts, United States)
Tsengdar Lee
(National Aeronautics and Space Administration Washington D.C., District of Columbia, United States)
Sanaz Vahidinia
(National Aeronautics and Space Administration Washington, United States)
Ryan McGranaghan
(Jet Propulsion Laboratory La Cañada Flintridge, United States)
Armin Mehrabian
(Adnet Systems (United States) Bethesda, Maryland, United States)
Date Acquired
November 13, 2023
Subject Category
Documentation and Information Science
Meeting Information
Meeting: 23rd Meeting of the American Geophysical Union (AGU)
Location: San Francisco, CA
Country: US
Start Date: December 11, 2023
End Date: December 15, 2023
Sponsors: American Geophysical Union
Funding Number(s)
CONTRACT_GRANT: 80MSFC22M0004
Distribution Limits
Public
Copyright
Portions of document may include copyright protected material.
Technical Review
Single Expert