Exploring Transfers Between Earth-Moon Halo Orbits via Multi-Objective Reinforcement Learning

Mashiku, Alinda K.; Stuart, Jeffrey R.; Bosanac, Natasha; Sullivan, Christopher J.; Anderson, Rodney L.

Multi-Reward Proximal Policy Optimization, a multi-objective deep reinforcement learning algorithm, is used to examine the design space of low-thrust trajectories for a SmallSat transferring between two libration point orbits in the Earth-Moon system. Using Multi-Reward Proximal Policy Optimization, multiple policies are simultaneously and efficiently trained on three distinct trajectory design scenarios. Each policy is trained to create a unique control scheme based on the trajectory design scenario and assigned reward function: a unique combination of weights scaling competing objectives that guide the spacecraft to the target mission orbit, incentivize faster flight times, and penalize propellant mass usage. Then, the policies are evaluated on the same set of perturbed initial conditions in each scenario to generate the propellant mass usages, flight times, and state discontinuities from a reference trajectory for each control scheme. This solution space of low-thrust trajectories for a SmallSat is used to examine the multi-objective trade space for the trajectory design scenario. By autonomously constructing the solution space, insights into the required propellant mass, flight time, and transfer geometry are rapidly achieved.
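
The abstract describes each policy's reward function as a weighted combination of three competing objectives. The following is a minimal illustrative sketch of that idea only, not the authors' formulation: the linear form, the names `RewardWeights` and `scalarized_reward`, the per-step inputs, and all numeric values are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weight set; one such set would be assigned to each policy."""
    tracking: float    # scales the penalty on deviation from the target orbit
    time: float        # scales the penalty on elapsed flight time
    propellant: float  # scales the penalty on propellant mass used


def scalarized_reward(w: RewardWeights, state_error: float,
                      time_step: float, propellant_used: float) -> float:
    """Assumed linear scalarization: a negative weighted sum of three costs."""
    return -(w.tracking * state_error
             + w.time * time_step
             + w.propellant * propellant_used)


# Two hypothetical policies that score the same step differently.
w_fast = RewardWeights(tracking=1.0, time=2.0, propellant=0.1)
w_frugal = RewardWeights(tracking=1.0, time=0.1, propellant=2.0)

# Made-up, nondimensional values for a single step.
step = dict(state_error=0.5, time_step=1.0, propellant_used=0.2)

r_fast = scalarized_reward(w_fast, **step)
r_frugal = scalarized_reward(w_frugal, **step)
```

Under these assumed weights, the same step earns a lower reward for the time-averse policy than for the propellant-averse one, which is one way differently weighted policies could be driven toward different points in the trade space.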

Document ID

20220003773

Acquisition Source

Jet Propulsion Laboratory

Document Type

Preprint (Draft being sent to journal)

External Source(s)

hdl:2014/54244

Date Acquired

March 6, 2021

Publication Date

March 6, 2021

Publication Information

Publisher: Pasadena, CA: Jet Propulsion Laboratory, National Aeronautics and Space Administration, 2021

Distribution Limits

Public

Other

Technical Review
