PARALLEL PROCESSING METHODS FOR SPACE BASED POWER SYSTEMS

Final Report
NASA/ASEE Summer Faculty Fellowship Program--1993
Johnson Space Center

Prepared By: F.C. Berry
Academic Rank: Associate Professor
University & Department: Department of Electrical Engineering
Louisiana Tech University
Ruston, Louisiana 71272

NASA/JSC
Directorate: Engineering
Division: Propulsion And Power
Branch: Power
JSC Colleagues: Bob Hendrix
Tom Jeffcoat
Data Submitted: 7-30-93
Contract Number: NGT-44-001-800
ABSTRACT

This report presents a method for doing load-flow analysis of a power system by using a decomposition approach. The power system for the Space Shuttle is used as a basis to build a model for the load-flow analysis. To test the decomposition method for doing load-flow analysis, simulations were performed on power systems of 16, 25, 34, 43, 52, 61, 70, and 79 nodes. Each of the power systems was divided into subsystems and simulated under steady-state conditions. The results from these tests have been found to be as accurate as tests performed using a standard serial simulator. The division of the power systems into different subsystems was done by assigning a processor to each area. There were 13 transputers available, therefore, up to 13 different subsystems could be simulated at the same time.

This report has preliminary results for a load-flow analysis using a decomposition principal. The report shows that the decomposition algorithm for load-flow analysis is well suited for parallel processing and provides increases in the speed of execution.

INTRODUCTION

A research project, Advanced Electrical Power Management Techniques for Space Systems (ADEPTS), was started at the Johnson Space Center in 1986. The basic goal of ADEPTS was to automate the operations of a space based power system (SBPS) by using the technology of parallel and distributed processing.

From the ADEPTS project, three basic functions were identified which would form the basis of a management system for a SBPS. First was the monitoring of the power system that could be accomplished by the use of state estimation. Second was the scheduling of the generation to meet the required load of the SBPS that could be accomplished by unit dispatch. Third was the solution of the SBPS that could be accomplished by the use of load-flow analysis.

Methods like state estimation, unit dispatch, load-flow analysis, etc., are well-established tools that have been used heavily by the electric power industry since the introduction of the digital computer. However, the use of parallel processing is still relatively new when applied to the area of power systems [1]. In the last few years, a significant amount of research has been done in the area of parallel processing of power system problems. Most of this work has been in the development of algorithms. Actual testing on multiprocessor architectures is still near the beginning stages. The largest uncertainty today is the evolution of the hardware. For example:

1. Pipelined computers in which temporal parallelism is used to calculate overlapped computations. Early Cray-1 and Cyber 205 are good examples of these systems.

2. Array processors in which spatial parallelism is utilized through synchronized arithmetic logical units (ALU). The Illiac-IV is a good example of this system.

3. Parallel processing systems in which asynchronous parallelism is employed at the software level with the Cray X-MP and Cray 2 as typical examples.

Shared-memory and distributed processors are two basic systems of parallel computers that can offer increases in computational power but their acceptability and long term viability is unpredictable [1].

Because parallel algorithm development was much further along, ADEPTS chose to make use of the existing algorithms and focused on tailoring these algorithms to a
specific hardware design. To accomplish this, a hardware platform and software environment had to be decided upon. The INMOS T800 transputer was used as the basic hardware platform. The T800 transputer is a 32 bit reduced instruction set computer (RISC) running at 25 MHz [2]. The T800 transputer can communicate with other transputers or a host computer by a set of four bi-directional channels. By using these channels, a variety of hardware architecture can be investigated. Also, additional transputer modules can be easily added in the future to further enhance the system without making obsolete the existing transputers. Because of the difficulties encountered in developing parallel software, the Express® “operating system” was chosen. The Express package is best classified as a transparent parallel operating system with a set of tools and utilities for developing parallel programs [3, 4, 11].

A preliminary parallel architecture for a management system of a SBPS (Figure 1) was developed and tested using the power system of the Space Shuttle (Orbiter) as a model. The electric power distribution system that is present on the Orbiter is made up of three strings each having a fuel cell that provides DC voltage to those systems that require DC, and the inverters that convert the DC to AC for those elements that require AC. Figure 2 is a block diagram representation of the power system of the Orbiter.

![Block Diagram Of A Power Management System](image)

Figure 1: Block Diagram Of A Power Management System.

4-3
Preliminary tests of the architecture of Figure 1, using the Orbiter power system, showed that the load-flow program took from 20% to 50% of total processing time depending on the size of the electrical circuit that the load-flow program was solving. To improve the overall performance of the architecture of Figure 1, a different architecture for the load-flow was investigated.

![Diagram of Power Distribution System for the Space Shuttle (1 of 3 Strings).](image)

**Figure 2: Power Distribution System for the Space Shuttle (1 of 3 Strings).**

**RELATED RESEARCH**

A general definition of load-flow is given: Load-flow is the steady-state solution of equations that describe a power system. These equations can be nonlinear and algebraic. They can be nonlinear because they express powers as a function of voltages. They are algebraic because they describe the steady-state instead of the transient behavior of the power system. [5]

Rafian, Sterling, and Irving [6], presented a method for load-flow analysis for power systems by tearing the network into a number of independent subsystems. These subsystems could then be solved in parallel resulting in a saving of time for on-line control. This method of decomposed load-flow analysis was based on the fast decoupled load-flow algorithm. The decomposed load-flow technique that was presented would
normally give the greatest reduction in time, if the system could be divided such that the
subsystems were interconnected only by a few tie lines (separating nodes).

Rafian, Sterling, and Irving [7], present a mathematical technique for the
decomposition of a set of nonlinear algebraic and differential equations which result from
a corresponding dynamic model of an electric power system. This method of load-flow
analysis for power systems again tears the network into a number of independent
subsystems. These subsystems could then be executed in parallel. This method of
decomposed load-flow analysis was based on the Newton-Raphson load-flow algorithm.
It was also shown that this decomposed load-flow algorithm was suitable for
implementation on parallel processors with a 32-bit word capability. This decomposition
technique resulted in significant savings in the simulation elapsed time.

Wang, Xiang, Wang, and Huang [8], presented two algorithms for a parallel
solution of the reduced gradient optimal power flow. The parallel solution method for
optimal power flow was accomplished by tearing the network into a number of
independent subsystems. These subsystems were then solved in parallel by using what
the authors called a two-level computer network.

Berry and Cox [9], presented a coordination decomposition method for load-flow
analysis using the Gauss-Siedel algorithm. This coordination decomposition method was
accomplished by the tearing of the electrical network into a number of independent
subsystems which could then be solved in parallel. However, this load-flow algorithm
was only simulated on a serial computer and no true parallel processing took place.

Taoka, Iyoda, Noguchi, Sato, and Nakazawa [10], presented a Gauss-Siedel
method for doing load-flow analysis on a hyper cube computer. The algorithm that was
presented consisted of solving the power flow equations in an iterative manner in order to
minimize the communication between nodes.

LOAD-FLOW

Load-flow is the name given to a network solution that shows currents, voltages,
and power flow at every bus in the system. In the load-flow problem, a relationship
between voltage and current at each bus exists and the load-flow calculation must solve
for all voltages and currents such that these relationships are satisfied. As such, the load-
flow gives the electrical response of the transmission system to a particular set of loads
and generator unit outputs. Therefore, load-flow is the solution of an electrical network
that gives the values of currents, voltages, and power flow at every bus (node) in the
electrical power system.

To start the load-flow solution process, a set of linear equations is formulated for
an electrical network. This linear set of equations is generally of the node form as
represented by equations (1).

\[
\begin{bmatrix}
I_1 \\
I_2 \\
\vdots \\
I_i \\
\vdots \\
I_n \\
\end{bmatrix}
= 
\begin{bmatrix}
Y_{11} & Y_{12} & \ldots & Y_{1j} \\
Y_{21} & Y_{22} & \ldots & \ldots \\
\vdots & \vdots & \ddots & \vdots \\
Y_{i1} & \vdots & \ldots & Y_{ij} \\
\end{bmatrix}
\begin{bmatrix}
E_1 \\
E_2 \\
\vdots \\
E_i \\
\vdots \\
E_n \\
\end{bmatrix}
\]

Although currents entering the nodes from generators and load branches are not known,
they can be written in terms of P and E.
\[ I_i = \frac{P_i}{E_i} \quad (2) \]

Since currents entering the nodes are considered positive, then power flow into the node is also considered positive, and power dissipated in the resistors would be entered as a negative number.

Substituting equation 2 into equation 1 gives the following result.

\[
\begin{bmatrix}
I_1 \\
I_2 \\
\vdots \\
I_i
\end{bmatrix}
= 
\begin{bmatrix}
Y_{11} & Y_{12} & \cdots & Y_{1j} \\
Y_{21} & Y_{22} & \cdots & \cdots \\
\vdots & \vdots & \ddots & \vdots \\
Y_{ii} & \cdots & \cdots & Y_{ii}
\end{bmatrix}
\begin{bmatrix}
E_1 \\
E_2 \\
\vdots \\
E_i
\end{bmatrix}
\quad (3)
\]

\[
P_1 = Y_{11}E_1 + Y_{12}E_2 + \cdots + Y_{1i}E_i
\]

\[
P_2 = Y_{21}E_1 + Y_{22}E_2 + \cdots + Y_{2i}E_i
\]

\[
P_i = Y_{ii}E_1 + Y_{12}E_2 + \cdots + Y_{i-1}(j-1)E_{i-1}
\]

The set of equations (4) may be solved for \( E_1 \) through \( E_i \) by the use of an iteration technique.

\[
E_2 = \frac{1}{Y_{22}} \left[ P_2 \right] \quad (5)
\]

Note that \( E_i \) has been written in terms of itself and the other voltages. Also, one equation has been deleted, this equation or bus is designated as the swing or slack bus. The swing or slack bus is generally a generator that is the first to react to changes in the system. Generally the voltage at the slack bus is specified.

**DECOMPOSITION METHOD**

The decomposition method for load-flow analysis was started by dividing the electrical power system into a number of smaller systems or subsystems. These subsystems can then be solved independently in parallel [13]. To demonstrate the decomposition method for load-flow analysis, an example will be presented.

The electrical power system of Figure 3 was decomposed by using the admittance matrix (Table 1). The admittance matrix was transformed so that each of the subsystems would appear sequentially; that is, the nodes were renumbered for the electrical circuit. This renumbering caused the network equations for each subsystem to follow the
dynamic equations for that same subsystem. Hence, for the power system of Figure 3
there were 4 different subsystems. In the admittance matrix, the diagonal elements $Y_{ii}$
are the summation of all admittances that surround the $i$th node, where $Y_{ii} = Y_{i1}$ and is
the negative of the branch. The admittance for Figure 3 is given in Table 1.

![Space Shuttle Power Distribution System](image)

**Figure 3:** Space Shuttle Power Distribution System.

### Table 1: The Admittance Matrix.

<table>
<thead>
<tr>
<th></th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
<th>8</th>
<th>9</th>
<th>10</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>$Y_{1,1}$</td>
<td> </td>
<td> </td>
<td>$-Y_{1,4}$</td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
</tr>
<tr>
<td>2</td>
<td>$Y_{2,2}$</td>
<td>$Y_{3,3}$</td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td>$-Y_{4,5}$</td>
<td>$-Y_{4,6}$</td>
<td>$-Y_{4,7}$</td>
<td> </td>
</tr>
<tr>
<td>3</td>
<td> </td>
<td>$Y_{5,5}$</td>
<td>$Y_{6,6}$</td>
<td> </td>
<td> </td>
<td>$Y_{7,7}$</td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
</tr>
<tr>
<td>4</td>
<td>$-Y_{4,1}$</td>
<td>$-Y_{4,2}$</td>
<td>$-Y_{4,3}$</td>
<td>$Y_{4,4}$</td>
<td> </td>
<td> </td>
<td>$-Y_{4,5}$</td>
<td>$-Y_{4,6}$</td>
<td>$-Y_{4,7}$</td>
<td> </td>
</tr>
<tr>
<td>5</td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td>$Y_{5,5}$</td>
<td>$Y_{6,6}$</td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
</tr>
<tr>
<td>6</td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td>$-Y_{5,4}$</td>
<td>$-Y_{6,4}$</td>
<td>$-Y_{7,4}$</td>
<td> </td>
<td> </td>
<td> </td>
</tr>
<tr>
<td>7</td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td>$Y_{8,8}$</td>
<td>$Y_{9,9}$</td>
<td>$Y_{10,10}$</td>
<td> </td>
<td> </td>
</tr>
<tr>
<td>8</td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
</tr>
<tr>
<td>9</td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
</tr>
<tr>
<td>10</td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
</tr>
</tbody>
</table>

After the renumbering of Table 1 is completed the admittance matrix of Table 2 is
created. The admittance matrix from Table 2 can be expressed as the sum of two
matrices.

$$ [Y] = [Y_I] + [Y_c] $$

Where $[Y_I]$ are a number of block diagonal submatrices that represents a matrix for a
specific subsystem. From the $[Y_I]$ matrix, it can be seen that four different sets of
equations can be produced, one set for each subsystem as shown in Figure 4 and Table 3.
### Table 2: The Admittance Matrix Renumbered.

<table>
<thead>
<tr>
<th></th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
<th>8</th>
<th>9</th>
<th>10</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>Y_{1,1}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>2</td>
<td>Y_{2,2}</td>
<td>-Y_{1,4}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td>-Y_{4,2}</td>
<td>Y_{3,3}</td>
<td>Y_{4,4}</td>
<td>-Y_{4,5}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>4</td>
<td>Y_{4,1}</td>
<td>-Y_{4,2}</td>
<td>-Y_{4,3}</td>
<td>Y_{4,4}</td>
<td>-Y_{4,5}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>5</td>
<td>-Y_{5,4}</td>
<td>Y_{5,5}</td>
<td>Y_{5,6}</td>
<td>Y_{5,7}</td>
<td>Y_{5,8}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>6</td>
<td>Y_{5,5}</td>
<td>-Y_{6,5}</td>
<td>Y_{6,6}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>7</td>
<td>-Y_{7,5}</td>
<td>Y_{7,7}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>8</td>
<td>Y_{8,5}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>9</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y_{9,9}</td>
<td></td>
</tr>
<tr>
<td>10</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y_{10,10}</td>
<td></td>
</tr>
</tbody>
</table>

### Table 3: The Admittance Matrix [Y_1].

<table>
<thead>
<tr>
<th></th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
<th>8</th>
<th>9</th>
<th>10</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>Y_{1,1}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>2</td>
<td>Y_{2,2}</td>
<td>-Y_{1,4}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td>-Y_{4,2}</td>
<td>Y_{3,3}</td>
<td>Y_{4,4}</td>
<td>-Y_{4,5}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>4</td>
<td>Y_{4,1}</td>
<td>-Y_{4,2}</td>
<td>-Y_{4,3}</td>
<td>Y_{4,4}</td>
<td>-Y_{4,5}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>5</td>
<td>Y_{5,5}</td>
<td>-Y_{5,6}</td>
<td>Y_{5,7}</td>
<td>Y_{5,8}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>6</td>
<td>-Y_{6,5}</td>
<td>Y_{6,6}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>7</td>
<td>-Y_{7,5}</td>
<td>Y_{7,7}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>8</td>
<td>Y_{8,5}</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>9</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y_{9,9}</td>
<td></td>
</tr>
<tr>
<td>10</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Y_{10,10}</td>
<td></td>
</tr>
</tbody>
</table>

Figure 4: Decomposed Space Shuttle Power Distribution System.
Each of the individual subsystems shown in Figure 4 and Table 3 can be solved independently. These subsystems require their own slack bus and one of these slack buses will serve as the slack bus for the entire system.

The remaining elements, \([Y_c]\) of the admittance matrix are the nonzero off diagonal elements that are the separating nodes of one subsystem with respect to the node voltages in another subsystem. Table 4 shows the \([Y_c]\) matrix.

<table>
<thead>
<tr>
<th></th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
<th>6</th>
<th>7</th>
<th>8</th>
<th>9</th>
<th>10</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>2</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>3</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>4</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>5</td>
<td></td>
<td></td>
<td></td>
<td>-Y_{5,4}</td>
<td></td>
<td></td>
<td></td>
<td>Y_{4,5}</td>
<td></td>
<td></td>
</tr>
<tr>
<td>6</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>7</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>8</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>9</td>
<td></td>
<td></td>
<td></td>
<td>-Y_{9,4}</td>
<td></td>
<td></td>
<td></td>
<td>-Y_{10,4}</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

These nonzero off diagonal elements act as sending or receiving nodes for power flow from one subsystem to another. Therefore, these nonzero off diagonal elements can be referred to as voltage and power (EP) separating nodes. Finally, these nonzero off diagonal elements are directly related to the number of common nodes in different subsystems.

From the \([Y_c]\) matrix, it can be seen that at node 4 the system was divided. The \([Y_c]\) matrix is the coordination matrix and generally this matrix will contain the separating nodes. The function of the coordination matrix is to determine the amount of \(\Delta P\) and \(\Delta E\) that the separating nodes contribute to each subsystem.

**PERFORMANCE ANALYSIS**

To test the decomposition method for doing load-flow analysis, simulations were performed on power systems of 16, 25, 34, 43, 52, 61, 70, and 79 nodes. Each of the power systems was divided into subsystems and simulated under steady-state conditions. The results from these tests have been found to be as accurate as tests performed using a standard serial simulator. The division of the power systems into different subsystems was done by assigning a processor to each area. Table 5 provides the results of timing tests for each of the power systems. The times reported in Table 5 are the times associated with the solution of the load-flow problem (Node Time) and part of the host I/O (Host I/O). The Node Time includes internode communication time, calculation time, and part of the host I/O time. The Host I/O time reported in Table 5 is that part of the total host I/O time required to print the result of the load-flow calculation to the screen and the time to create two files, which contains the data of the timing tests that are reported in the following tables.
Table 5: Timing Data

<table>
<thead>
<tr>
<th>16 buses</th>
<th>25 buses</th>
<th>34 buses</th>
<th>43 buses</th>
</tr>
</thead>
<tbody>
<tr>
<td>Node Time ms</td>
<td>56.77</td>
<td>94.59</td>
<td>146.88</td>
</tr>
<tr>
<td>Transputer Host I/O ms</td>
<td>115.25</td>
<td>116.35</td>
<td>115.46</td>
</tr>
<tr>
<td>2 Node Time ms</td>
<td>33.41</td>
<td>44.29</td>
<td>51.97</td>
</tr>
<tr>
<td>Transputer Host I/O ms</td>
<td>112.45</td>
<td>123.20</td>
<td>122.88</td>
</tr>
<tr>
<td>4 Node Time ms</td>
<td>33.02</td>
<td>44.67</td>
<td>48.19</td>
</tr>
<tr>
<td>Transputer Host I/O ms</td>
<td>102.46</td>
<td>122.82</td>
<td>122.94</td>
</tr>
<tr>
<td>10 Node Time ms</td>
<td>*</td>
<td>*</td>
<td>*</td>
</tr>
<tr>
<td>Transputer Host I/O ms</td>
<td>*</td>
<td>*</td>
<td>*</td>
</tr>
<tr>
<td>13 Node Time ms</td>
<td>*</td>
<td>*</td>
<td>*</td>
</tr>
<tr>
<td>Transputer Host I/O ms</td>
<td>*</td>
<td>*</td>
<td>*</td>
</tr>
</tbody>
</table>

The results of the timing tests that are reported by running each program 10 times and examining the times recorded when the program finishes [3, 4, 11]. In order to like speedup and efficiency, the times reported for the single "best case" and the times reported for the multiprocessor. By reporting the "best case" times for the single processor the times for the multiprocessor programs would provide the and efficiency for the multiprocessor programs.

Again from Table 5 the Node Times did not change as these numbers are the actual times. The Host I/O numbers are next. Being consistent with the idea of the "worst case" the best "best case" time for the single processor program time for the multiprocessor programs. By using the best processor programs and the worst "worst case" time for the would provide the worst "worst case" for the speedup multiprocessor programs. This adjustment to Table 5 is given.

Table 6: Adjusted Time

<table>
<thead>
<tr>
<th>16 buses</th>
<th>25 buses</th>
<th>34 buses</th>
<th>43 buses</th>
</tr>
</thead>
<tbody>
<tr>
<td>Node Time ms</td>
<td>56.77</td>
<td>94.59</td>
<td>146.88</td>
</tr>
<tr>
<td>Transputer Host I/O ms</td>
<td>99.84</td>
<td>99.84</td>
<td>99.84</td>
</tr>
<tr>
<td>2 Node Time ms</td>
<td>33.41</td>
<td>44.29</td>
<td>51.97</td>
</tr>
<tr>
<td>Transputer Host I/O ms</td>
<td>128.26</td>
<td>128.26</td>
<td>128.26</td>
</tr>
<tr>
<td>4 Node Time ms</td>
<td>33.02</td>
<td>44.67</td>
<td>48.19</td>
</tr>
<tr>
<td>Transputer Host I/O ms</td>
<td>128.26</td>
<td>128.26</td>
<td>128.26</td>
</tr>
<tr>
<td>10 Node Time ms</td>
<td>*</td>
<td>*</td>
<td>*</td>
</tr>
<tr>
<td>Transputer Host I/O ms</td>
<td>*</td>
<td>*</td>
<td>*</td>
</tr>
<tr>
<td>13 Node Time ms</td>
<td>*</td>
<td>*</td>
<td>*</td>
</tr>
<tr>
<td>Transputer Host I/O ms</td>
<td>*</td>
<td>*</td>
<td>*</td>
</tr>
</tbody>
</table>

From Table 6, Table 7 was created. Table 7 is the total "best" processor program and the total "worst case" time for each multiprocessor.
Table 7: “Worst Case” Total Time

<table>
<thead>
<tr>
<th>Buses</th>
<th>16 buses</th>
<th>25 buses</th>
<th>34 buses</th>
<th>43 buses</th>
<th>52 buses</th>
<th>61 buses</th>
<th>70 buses</th>
<th>79 buses</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>Node Time ms</td>
<td>Host I/O ms</td>
<td>Total Time ms</td>
<td>Node Time ms</td>
<td>Host I/O ms</td>
<td>Total Time ms</td>
<td>Node Time ms</td>
<td>Host I/O ms</td>
</tr>
<tr>
<td>1</td>
<td>56.77</td>
<td>99.84</td>
<td>161.61</td>
<td>44.29</td>
<td>91.14</td>
<td>135.43</td>
<td>0.99</td>
<td>99.84</td>
</tr>
<tr>
<td>2</td>
<td>128.26</td>
<td>128.26</td>
<td>256.52</td>
<td>44.42</td>
<td>99.84</td>
<td>144.26</td>
<td>0.99</td>
<td>99.84</td>
</tr>
<tr>
<td>4</td>
<td>128.26</td>
<td>128.26</td>
<td>256.52</td>
<td>44.42</td>
<td>99.84</td>
<td>144.26</td>
<td>0.99</td>
<td>99.84</td>
</tr>
<tr>
<td>10</td>
<td>128.26</td>
<td>128.26</td>
<td>256.52</td>
<td>44.42</td>
<td>99.84</td>
<td>144.26</td>
<td>0.99</td>
<td>99.84</td>
</tr>
<tr>
<td>13</td>
<td>128.26</td>
<td>128.26</td>
<td>256.52</td>
<td>44.42</td>
<td>99.84</td>
<td>144.26</td>
<td>0.99</td>
<td>99.84</td>
</tr>
</tbody>
</table>

Table 8: Speedup Results

<table>
<thead>
<tr>
<th>Buses</th>
<th>16 buses</th>
<th>25 buses</th>
<th>34 buses</th>
<th>43 buses</th>
<th>52 buses</th>
<th>61 buses</th>
<th>70 buses</th>
<th>79 buses</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>Speedup</td>
<td>Speedup</td>
<td>Speedup</td>
<td>Speedup</td>
<td>Speedup</td>
<td>Speedup</td>
<td>Speedup</td>
<td>Speedup</td>
</tr>
<tr>
<td>1</td>
<td>NA</td>
<td>NA</td>
<td>NA</td>
<td>NA</td>
<td>NA</td>
<td>NA</td>
<td>NA</td>
<td>NA</td>
</tr>
<tr>
<td>2</td>
<td>0.96</td>
<td>1.12</td>
<td>1.36</td>
<td>1.55</td>
<td>1.57</td>
<td>1.60</td>
<td>1.65</td>
<td>1.70</td>
</tr>
<tr>
<td>4</td>
<td>0.97</td>
<td>1.12</td>
<td>1.39</td>
<td>1.70</td>
<td>1.78</td>
<td>1.86</td>
<td>1.95</td>
<td>2.04</td>
</tr>
<tr>
<td>10</td>
<td>1.80</td>
<td>1.92</td>
<td>2.02</td>
<td>2.11</td>
<td>2.23</td>
<td>2.31</td>
<td>2.41</td>
<td>2.56</td>
</tr>
<tr>
<td>13</td>
<td>154.18</td>
<td>154.82</td>
<td>155.14</td>
<td>155.20</td>
<td>155.08</td>
<td>155.20</td>
<td>155.08</td>
<td>155.08</td>
</tr>
</tbody>
</table>

Speedup is a measure of the application software utilization of a multiple processor system. However, the original purpose of the speedup was to compare the time of the fastest serial program, T1, with the time of the parallel equivalent of the same program, T(p). [15] This definition of speedup has a different meaning, e.g.,

\[ S(p) = \frac{T_1}{T(p)} \]  

(7)

The definition of the speedup from equation 7 is rarely used because of the difficulty in measuring T1. Instead, T(1) is used as an approximation of T1. Therefore, the speedup is estimated from the measurement for T(1) and T(N) [15] and is defined as follows:

\[ S(p) = \frac{T(1)}{T(p)} \]  

(8)

The speedup of the different multiprocessor programs is given in Table 8. Again these are the “worst case” values for the speedup.
Processor efficiency measures the contribution of each processor to the parallel solution when \( p \) processors are employed. That is, \( E(p) \) equals the average efficiency per processor when the problem is run with \( p \) parallel processors [15].

\[
E(p) = \frac{S(p)}{p} \tag{9}
\]

Therefore an efficiency rating of 100% means the program runs in linear speedup time, while an efficiency of 0% means the processor is of no use in solving the problem in parallel [15].

The efficiency of the different multiprocessor programs is given in Table 9. Again these are the “worst case” values for the efficiency.

### Table 9: Final Speedup And Efficiency Results

<table>
<thead>
<tr>
<th></th>
<th>16 buses</th>
<th>25 buses</th>
<th>34 buses</th>
<th>43 buses</th>
<th>52 buses</th>
<th>61 buses</th>
<th>70 buses</th>
<th>79 buses</th>
</tr>
</thead>
<tbody>
<tr>
<td>Transputer Speedup</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Efficiency</td>
<td>NA</td>
<td>NA</td>
<td>NA</td>
<td>NA</td>
<td>NA</td>
<td>NA</td>
<td>NA</td>
<td>NA</td>
</tr>
<tr>
<td>Transputer Speedup</td>
<td>0.97</td>
<td>1.12</td>
<td>1.36</td>
<td>1.55</td>
<td>1.57</td>
<td>1.60</td>
<td>1.65</td>
<td>1.70</td>
</tr>
<tr>
<td>Transputer Efficiency</td>
<td>48.0%</td>
<td>56.0%</td>
<td>68.0%</td>
<td>77.5%</td>
<td>78.5%</td>
<td>80.0%</td>
<td>82.5%</td>
<td>85.0%</td>
</tr>
<tr>
<td>Transputer Speedup</td>
<td>0.97</td>
<td>1.12</td>
<td>1.39</td>
<td>1.70</td>
<td>1.78</td>
<td>1.86</td>
<td>1.95</td>
<td>2.04</td>
</tr>
<tr>
<td>Transputer Efficiency</td>
<td>24.2%</td>
<td>28.0%</td>
<td>34.7%</td>
<td>42.5%</td>
<td>44.5%</td>
<td>46.5%</td>
<td>48.7%</td>
<td>51.0%</td>
</tr>
<tr>
<td>Transputer Speedup</td>
<td>*</td>
<td>*</td>
<td>*</td>
<td>1.80</td>
<td>1.92</td>
<td>2.02</td>
<td>2.11</td>
<td>2.23</td>
</tr>
<tr>
<td>Transputer Efficiency</td>
<td>*</td>
<td>*</td>
<td>*</td>
<td>18.0%</td>
<td>19.2%</td>
<td>20.2%</td>
<td>21.1%</td>
<td>22.3%</td>
</tr>
<tr>
<td>Transputer Speedup</td>
<td>*</td>
<td>*</td>
<td>*</td>
<td>2.02</td>
<td>2.14</td>
<td>2.27</td>
<td>2.41</td>
<td>2.56</td>
</tr>
<tr>
<td>Transputer Efficiency</td>
<td>*</td>
<td>*</td>
<td>*</td>
<td>15.5%</td>
<td>16.4%</td>
<td>17.4%</td>
<td>18.5%</td>
<td>19.6%</td>
</tr>
</tbody>
</table>

### FAULT-TOLERANCE

Fault tolerance is a class of methods to achieve dependable and reliable computing systems. Dependability is that property of a computer system that allows reliance to be justifiably placed on the service a system delivers [12, 14]. Fault tolerance is the ability to deliver a service in spite of faults. Fault tolerance is generally obtained from redundancy. This redundancy can be accomplished by masking (static) redundancy or dynamic redundancy [12, 14].

Masking (static) redundancy uses extra components to obscure, or mask, the effect of a faulty component [12, 14]. This technique is regarded as static because once the redundant copies are connected their interconnections remain fixed. The error resulting from faulty components are masked by the presence of other copies of those components. An example of masking redundancy is the computer system for the Space Shuttle. There are four different computers on the Space Shuttle. Three of these computers execute the same program using the same data while the fourth computer runs a different program which performs the same function as the other three. The results of the three computers running the same program are compared or voted on, if any one of the result from one of the computers is inconsistent with that of the others, it is masked out.

The reason that the fourth computer has a different set of software that performs the same function as the other three is to grade against an error in programming. This is a form of dynamic redundancy. Dynamic redundancy refers to systems which can be reconfigured in response to a fault. If the fourth computer results were different from the results of the other three computers on the Space Shuttle, the fourth computer would take
over and mask out the others, thereby reconfiguring the computer system from a multiprocessor to a single processor computer system.

Another fault tolerant technique is fault detection, but fault detection provides no tolerance to faults, rather it gives warning when they occur [12, 14].

The reason for this introduction to fault tolerant system was to show how they could be applied to this research. The goal of this research is not to design a fault tolerant computer system but to apply parallel and distributed processing to construct an electric power management system (EPMS) for space based power systems. In the development of this EPMS, two different architectures for doing the decomposition method for load-flow were tested. These results have been presented in Tables 5 through 9. They were the 10 and 13 transputer systems and are presented in Figures 5 and 6.

From Figures 5 and 6 it can be seen that processor 0 is the critical processor. If processor 0 were to fail, both systems in Figure 5 and 6 would fail. However if processor 1, 2, or 3 were to fail in Figure 6, the processors which feed into any of these processors would be removed from service. An alternative system which is given in Figure 7 would use 12 processors. The system in Figure 7 would operate the same as the system in Figure 5 with the exception that processors 10 and 11 would perform the same function as processor 0. This would provide a static redundancy for processor 0 and would eliminate the risk of a fault on the middle layer of processors in Figure 6. The system in
Figure 7 was not tested so no timing data are available at this time, but the point is made that fault tolerance can be addressed in this design.

**CONCLUSIONS**

This paper has presented preliminary results for a load-flow analysis using a decomposition principal. It has been shown that the decomposition algorithm for load-flow analysis is well suited for parallel processing and provides increases in the speed of execution. Also two different hardware structures for doing the decomposition method were presented (Figures 5 and 6) and they were analyzed for speedup and efficiency. A fault tolerant design was also proposed (Figure 7) which made use of static redundancy.

Further work will be conducted on the decomposition principal by the addition of more transputers to the present system. This will be done to see whether or not further decomposition by the addition of more processors will be of any benefit. Also with the addition of more transputers, a hyper cube architecture could also be investigated; this approach was presented by Taoka, Iyoda, Noguchi, Sato, and Nakazawa [10].

**REFERENCES**


