
NTRS - NASA Technical Reports Server

Exploring Transfers Between Earth-Moon Halo Orbits via Multi-Objective Reinforcement Learning
Multi-Reward Proximal Policy Optimization, a multi-objective deep reinforcement learning algorithm, is used to examine the design space of low-thrust trajectories for a SmallSat transferring between two libration point orbits in the Earth-Moon system. Using Multi-Reward Proximal Policy Optimization, multiple policies are simultaneously and efficiently trained on three distinct trajectory design scenarios. Each policy is trained to create a unique control scheme based on the trajectory design scenario and assigned reward function: a unique combination of weights scaling competing objectives that guide the spacecraft to the target mission orbit, incentivize faster flight times, and penalize propellant mass usage. Then, the policies are evaluated on the same set of perturbed initial conditions in each scenario to generate the propellant mass usages, flight times, and state discontinuities from a reference trajectory for each control scheme. This solution space of low-thrust trajectories for a SmallSat is used to examine the multi-objective trade space for the trajectory design scenario. By autonomously constructing the solution space, insights into the required propellant mass, flight time, and transfer geometry are rapidly achieved.
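The abstract describes each policy's reward function as a weighted combination of three competing objectives. The sketch below illustrates that idea only. The linear penalty form, the names `RewardWeights` and `step_reward`, the policy labels, and all numeric values are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights for one policy; field names are illustrative."""
    target: float      # scales the term guiding the spacecraft to the target orbit
    time: float        # scales the incentive for shorter flight times
    propellant: float  # scales the penalty on propellant mass usage


def step_reward(weights, state_error, time_step, propellant_used):
    """Scalar reward for one step (assumed linear form).

    Every term is treated as a penalty, so a larger (less negative)
    reward is better. Inputs are assumed to be nondimensional.
    """
    return -(weights.target * state_error
             + weights.time * time_step
             + weights.propellant * propellant_used)


# Three hypothetical policies, each weighting the objectives differently.
policies = {
    "balanced": RewardWeights(target=1.0, time=1.0, propellant=1.0),
    "fast": RewardWeights(target=1.0, time=4.0, propellant=0.5),
    "frugal": RewardWeights(target=1.0, time=0.5, propellant=4.0),
}

# Score the same (made-up) step outcome under each policy's reward function.
outcome = dict(state_error=0.5, time_step=0.25, propellant_used=0.125)
rewards = {name: step_reward(w, **outcome) for name, w in policies.items()}
```

Because each policy scores the same outcome differently, training one policy per weight combination would be expected to yield control schemes that trade propellant mass against flight time in different ways, which is the trade space the abstract refers to.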
Document ID
20220003773
Acquisition Source
Jet Propulsion Laboratory
Document Type
Preprint (Draft being sent to journal)
External Source(s)
Authors
Mashiku, Alinda K.
Stuart, Jeffrey R.
Bosanac, Natasha
Sullivan, Christopher J.
Anderson, Rodney L.
Date Acquired
March 6, 2021
Publication Date
March 6, 2021
Publication Information
Publisher: Pasadena, CA: Jet Propulsion Laboratory, National Aeronautics and Space Administration, 2021
Distribution Limits
Public
Copyright
Other
Technical Review

Available Downloads

There are no available downloads for this record.