NTRS - NASA Technical Reports Server

Performance and Application of Parallel OVERFLOW Codes on Distributed and Shared Memory Platforms

The presentation discusses recent studies on the performance of the two parallel versions of the aerodynamics CFD code, OVERFLOW_MPI and _MLP. Developed at NASA Ames, the serial version, OVERFLOW, is a multidimensional Navier-Stokes flow solver based on overset (Chimera) grid technology. The code has recently been parallelized in two ways. One is based on explicit message passing across processors and uses the message-passing interface (MPI) communication package. This approach is primarily suited to distributed memory systems and workstation clusters. The second, termed the multi-level parallel (MLP) method, is simple and uses shared memory for all communications. The _MLP code is suited to distributed-shared memory systems. For both methods, the message passing takes place across the processors or processes at the advancement of each time step. This procedure is, in effect, the Chimera boundary condition update, which is done in an explicit "Jacobi" style. In contrast, the update in the serial code is done in more of a "Gauss-Seidel" fashion. Programming the _MPI code is more complicated than programming the _MLP code; the former requires modification of the outer and some inner shells of the serial code, whereas the latter focuses only on the outer shell of the code. The _MPI version offers a great deal of flexibility in distributing grid zones across a specified number of processors in order to achieve load balancing. The approach is capable of partitioning zones across multiple processors or assigning each zone, or a cluster of several zones, to a single processor. The message passing across the processors consists of Chimera boundary points and/or overlapping "halo" boundary points for each partitioned zone. The MLP version is a new coarse-grain parallel concept at the zonal and intra-zonal levels.
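The difference between the "Jacobi" style and "Gauss-Seidel" style boundary updates mentioned above can be illustrated with a toy sketch. This is not OVERFLOW code: the ring of zones, the single value per zone, the neighbor-averaging rule, and the function names `jacobi_update` and `gauss_seidel_update` are all assumptions made for illustration only.

```python
def jacobi_update(values):
    """Jacobi style: every zone reads donor data from the previous step.

    Hypothetical toy model: each zone holds one value and its boundary is
    set to the average of its two neighbors on a ring of zones.
    """
    old = list(values)  # snapshot taken before any zone is updated
    n = len(old)
    return [0.5 * (old[(i - 1) % n] + old[(i + 1) % n]) for i in range(n)]


def gauss_seidel_update(values):
    """Gauss-Seidel style: zones are visited in order and later zones
    already see the updated values of the zones visited before them."""
    new = list(values)
    n = len(new)
    for i in range(n):
        new[i] = 0.5 * (new[(i - 1) % n] + new[(i + 1) % n])
    return new


if __name__ == "__main__":
    start = [1.0, 2.0, 4.0, 8.0]
    print(jacobi_update(start))        # [5.0, 2.5, 5.0, 2.5]
    print(gauss_seidel_update(start))  # [5.0, 4.5, 6.25, 5.625]
```

In the Jacobi form no zone depends on another zone's result from the same step, so all zones can be updated concurrently after one exchange per time step. The sequential form uses fresher data but imposes an ordering on the zones.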
A grouping strategy is used to distribute zones into several groups, forming sub-processes that run in parallel. The total number of grid points in each group is approximately balanced. An appropriate number of threads is initially allocated to each group, and in subsequent iterations at run time the number of threads is adjusted to achieve load balancing across the processes. Each process exploits the multitasking directives already established in OVERFLOW.
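The grouping and thread-allocation steps described above might be sketched as follows. The abstract does not say which algorithm is used, so this example substitutes a common greedy largest-first heuristic for the grouping and a simple work-per-thread rule for the allocation. The function names, zone labels, and point counts are hypothetical.

```python
import heapq


def group_zones(zone_points, n_groups):
    """Greedy largest-first assignment of zones to groups by grid-point count.

    zone_points maps a zone name to its number of grid points (made-up data).
    Each zone goes to whichever group currently holds the fewest points.
    """
    heap = [(0, g) for g in range(n_groups)]  # (points so far, group index)
    heapq.heapify(heap)
    groups = [[] for _ in range(n_groups)]
    for zone in sorted(zone_points, key=zone_points.get, reverse=True):
        load, g = heapq.heappop(heap)
        groups[g].append(zone)
        heapq.heappush(heap, (load + zone_points[zone], g))
    return groups


def allocate_threads(group_loads, total_threads):
    """Give each group one thread, then hand out the rest one at a time
    to the group with the most work per thread."""
    n = len(group_loads)
    if total_threads < n:
        raise ValueError("need at least one thread per group")
    threads = [1] * n
    for _ in range(total_threads - n):
        g = max(range(n), key=lambda i: group_loads[i] / threads[i])
        threads[g] += 1
    return threads


if __name__ == "__main__":
    zones = {"a": 90, "b": 60, "c": 50, "d": 40, "e": 30, "f": 10}
    groups = group_zones(zones, 2)
    loads = [sum(zones[z] for z in grp) for grp in groups]
    print(groups, loads)
    print(allocate_threads(loads, 4))
```

A run-time adjustment of the kind the abstract mentions could, under the same assumptions, call `allocate_threads` again with measured per-group times in place of grid-point totals.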
Document ID: 19990019852
Document Type: Conference Paper
Authors:
Djomehri, M. Jahed (Calspan Corp., Moffett Field, CA, United States)
Rizk, Yehia M. (NASA Ames Research Center, Moffett Field, CA, United States)
Date Acquired: August 19, 2013
Publication Date: January 1, 1999
Publication: HPCCP/CAS Workshop Proceedings 1998
Subject Category: Computer Systems
Distribution Limits: Public
Copyright: Work of the US Gov. Public Use Permitted.

Related Records

ID            Relation          Title
19990019831   Analytic Primary  HPCCP/CAS Workshop Proceedings 1998