NTRS - NASA Technical Reports Server

Methodology and Application of HPC I/O Characterization with MPIProf and IOT

Combining the strengths of MPIProf and IOT, an efficient and systematic method is devised for I/O characterization at the per-job, per-rank, per-file and per-call levels of HPC programs running on the NASA Advanced Supercomputing Center. This method is applied to answer four I/O questions in this paper. A total of 13 MPI programs and 15 cases, ranging from 24 to 5968 ranks, are analyzed to establish the I/O landscape from answers to the four questions. Four of the 13 programs use MPI I/O, and the behavior of their collective writes depends on the specific implementation of the MPI library used. The SGI MPT library, the prevailing MPI library for our systems, was found to gather small writes from a large number of ranks and perform larger writes through a small subset of collective buffering ranks. The number of collective buffering ranks invoked by MPT depends on the Lustre stripe count and the number of nodes used for the run. A demonstration of varying the stripe count to achieve double-digit speedup of one program's I/O is presented. Another program, which concurrently opens private files on all ranks and could potentially create a heavy load on the Lustre servers, is identified. The ability to systematically characterize I/O for a large number of programs running on a supercomputer, seek I/O optimization opportunities, and identify programs that could cause high load and instability on the filesystems is important for pursuing exascale in a real production environment.
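
The collective buffering behavior described in the abstract (many small per-rank writes gathered into fewer, larger writes by a subset of ranks) can be illustrated with a small simulation. This is a minimal sketch, not MPT's actual algorithm: the function name `collective_write`, the contiguous per-rank layout, and the equal-sized file domains are all assumptions made for illustration. The abstract states only that the number of collective buffering ranks depends on the Lustre stripe count and the node count; it does not give the rule, so `num_aggregators` is left as a free parameter here.

```python
def collective_write(rank_data, num_aggregators):
    """Simulate two-phase collective buffering (illustrative only).

    rank_data: list of bytes objects; rank i's data is placed directly
    after rank i-1's data in the file (contiguous layout, an assumption).
    Returns (file_bytes, writes), where writes lists one
    (aggregator_index, offset, length) tuple per aggregator that had data.
    """
    # File offsets each rank would have used for its own small write.
    offsets = []
    pos = 0
    for chunk in rank_data:
        offsets.append(pos)
        pos += len(chunk)
    total = pos

    # Split the file range into equal contiguous domains, one per aggregator.
    domain = -(-total // num_aggregators)  # ceiling division
    buffers = [bytearray() for _ in range(num_aggregators)]

    # Phase 1: each rank ships its bytes to the aggregator(s) owning them.
    for off, chunk in zip(offsets, rank_data):
        start = 0
        while start < len(chunk):
            agg = (off + start) // domain
            end = min(len(chunk), (agg + 1) * domain - off)
            buffers[agg] += chunk[start:end]
            start = end

    # Phase 2: each aggregator issues a single large write for its domain.
    out = bytearray(total)
    writes = []
    for agg, buf in enumerate(buffers):
        if buf:
            file_offset = agg * domain
            out[file_offset:file_offset + len(buf)] = buf
            writes.append((agg, file_offset, len(buf)))
    return bytes(out), writes


# Example: 8 ranks with 4 bytes each, gathered by 2 aggregators.
data = [bytes([rank]) * 4 for rank in range(8)]
contents, writes = collective_write(data, num_aggregators=2)
print(writes)
```

In this toy setting, eight 4-byte writes collapse into two 16-byte writes while the file contents stay identical to what independent per-rank writes would have produced.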
Document ID
20190000271
Acquisition Source
Ames Research Center
Document Type
Conference Paper
Authors
Chang, Yan-Tyng Sherry
(Computer Sciences Corp. Moffett Field, CA, United States)
Jin, Henry
(NASA Ames Research Center Moffett Field, CA, United States)
Bauer, John
Date Acquired
January 31, 2019
Publication Date
November 13, 2018
Subject Category
Computer Systems
Report/Patent Number
ARC-E-DAA-TN35654
Meeting Information
Meeting: Supercomputing 2016 (SC16)
Location: Salt Lake City, UT
Country: United States
Start Date: November 13, 2016
End Date: November 18, 2016
Sponsors: Institute of Electrical and Electronics Engineers
Funding Number(s)
CONTRACT_GRANT: NNA07CA29C
Distribution Limits
Public
Copyright
Public Use Permitted.
Keywords
Characterization
Application
Methodology