NASA Logo

NTRS

NTRS - NASA Technical Reports Server

Back to Results
Multiphase complete exchange on a circuit switched hypercubeOn a distributed memory parallel computer, the complete exchange (all-to-all personalized) communication pattern requires each of n processors to send a different block of data to each of the remaining n - 1 processors. This pattern is at the heart of many important algorithms, most notably the matrix transpose. For a circuit switched hypercube of dimension d(n = 2(sup d)), two algorithms for achieving complete exchange are known. These are (1) the Standard Exchange approach that employs d transmissions of size 2(sup d-1) blocks each and is useful for small block sizes, and (2) the Optimal Circuit Switched algorithm that employs 2(sup d) - 1 transmissions of 1 block each and is best for large block sizes. A unified multiphase algorithm is described that includes these two algorithms as special cases. The complete exchange on a hypercube of dimension d and block size m is achieved by carrying out k partial exchange on subcubes of dimension d(sub i) Sigma(sup k)(sub i=1) d(sub i) = d and effective block size m(sub i) = m2(sup d-di). When k = d and all d(sub i) = 1, this corresponds to algorithm (1) above. For the case of k = 1 and d(sub i) = d, this becomes the circuit switched algorithm (2). Changing the subcube dimensions d, varies the effective block size and permits a compromise between the data permutation and block transmission overhead of (1) and the startup overhead of (2). For a hypercube of dimension d, the number of possible combinations of subcubes is p(d), the number of partitions of the integer d. This is an exponential but very slowly growing function and it is feasible over these partitions to discover the best combination for a given message size. The approach was analyzed for, and implemented on, the Intel iPSC-860 circuit switched hypercube. Measurements show good agreement with predictions and demonstrate that the multiphase approach can substantially improve performance for block sizes in the 0 to 160 byte range. This range, which corresponds to 0 to 40 floating point numbers per processor, is commonly encountered in practical numeric applications. The multiphase technique is applicable to all circuit-switched hypercubes that use the common e-cube routing strategy.
Document ID
19910010403
Acquisition Source
Legacy CDMS
Document Type
Contractor Report (CR)
Authors
Bokhari, Shahid H.
(Institute for Computer Applications in Science and Engineering Hampton, VA, United States)
Date Acquired
September 6, 2013
Publication Date
January 1, 1991
Subject Category
Mathematical And Computer Sciences (General)
Report/Patent Number
ICASE-91-5
NASA-CR-187499
AD-A232830
NAS 1.26:187499
Report Number: ICASE-91-5
Report Number: NASA-CR-187499
Report Number: AD-A232830
Report Number: NAS 1.26:187499
Accession Number
91N19716
Funding Number(s)
CONTRACT_GRANT: NAS1-18605
PROJECT: RTOP 505-90-52-01
Distribution Limits
Public
Copyright
Work of the US Gov. Public Use Permitted.
No Preview Available