Final Report

CMOS ARRAY DESIGN AUTOMATION TECHNIQUES

by

P. Ramondetta, A. Feller, R. Noto and T. Lombardi

Contract NAS12-2233 Mods 6 and 11

May 1975

Prepared for

GEORGE C. MARSHALL SPACE FLIGHT CENTER
NATIONAL AERONAUTICS AND SPACE ADMINISTRATION
MARSHALL SPACE FLIGHT CENTER
ALABAMA 35812

Prepared by

ADVANCED TECHNOLOGY LABORATORIES
GOVERNMENT AND COMMERCIAL SYSTEMS
RCA
CAMDEN, NEW JERSEY 08102
FOREWORD

This report covers the work accomplished on NASA Contract NAS12-2233, Modifications 6 and 11, under the direction of John Gould, Technical Program Director, to develop and/or enhance a design automation capability for generating low cost, quick turnaround custom LSI arrays using the standard cell approach with the silicon-gate bulk CMOS and SOS technologies.

Section 1 briefly details the work leading up to that described in this report. Sections 2 and 3 describe the work relating to the CMOS/SOS technologies, which were the principal focus of this program. The silicon-gate, beam-lead bulk CMOS portion of this program is described in the Appendix.
ACKNOWLEDGEMENT

The authors wish to acknowledge the following other members of RCA Advanced Technology Laboratories in Camden, N.J., who teamed with them to conceive and successfully implement this program: Mr. A. Smith -- co-author of the basic array topology and principal cell designer -- who directed the design of all LSI arrays; Mr. R. Pryor, who performed cell design and analysis; Mr. R. Lisowski, who assisted in the testing and evaluation of the arrays; and especially Mr. F. Bertino, who together with Mr. J. DeLuca, was responsible for translating, through the use of the standard cell design automation computer program, the input logic design into a form for plotting the final mask artwork in addition to implementing all the tasks involved in developing a working cell library tape from the original design.

The authors also gratefully acknowledge the contributions of the following members of RCA Solid State Technology Center in Somerville, N.J.: Messrs. H. Borkan, D. Woo and S. Policastro, who developed the technology and processed the CMOS/SOS LSI arrays; Messrs. S. Cohen and G. Caswell, who provided diagnostic and general assistance; Messrs. T. Mayhew, A. Woodhull, and K. Long, who tested and packaged the LSI arrays; and R. Geshner and B. Greening, who provided the working masks.

The authors would also like to acknowledge the contribution of G. Lines and his activity, who were very responsive, cooperative and efficient in providing 80X precision artwork.
TABLE OF CONTENTS

<table>
<thead>
<tr>
<th>Section</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>1 INTRODUCTION</td>
<td>1</td>
</tr>
<tr>
<td>2 WORK REQUIREMENTS &amp; SUMMARY OF ACCOMPLISHMENTS</td>
<td>4</td>
</tr>
<tr>
<td>3 TECHNICAL DESCRIPTION</td>
<td>6</td>
</tr>
<tr>
<td>A. CMOS/SOS PR2D Automatic Placement and Routing Program</td>
<td>6</td>
</tr>
<tr>
<td>B. Silicon-Gate CMOS/SOS Technology</td>
<td>7</td>
</tr>
<tr>
<td>C. Analysis and Basic Cell Design</td>
<td>11</td>
</tr>
<tr>
<td>D. CMOS/SOS Standard Cell Design Layout</td>
<td>22</td>
</tr>
<tr>
<td>E. CMOS/SOS Standard Cell Array Topology</td>
<td>25</td>
</tr>
<tr>
<td>F. CMOS/SOS Standard Cell Library</td>
<td>31</td>
</tr>
<tr>
<td>G. CMOS/SOS LSI Chip Measurement and Analysis</td>
<td>34</td>
</tr>
<tr>
<td>H. Conclusions and Results</td>
<td>54</td>
</tr>
<tr>
<td>APPENDIX</td>
<td>A-1</td>
</tr>
</tbody>
</table>

LIST OF ILLUSTRATIONS

<table>
<thead>
<tr>
<th>Figure</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>10</td>
</tr>
<tr>
<td>2</td>
<td>12</td>
</tr>
<tr>
<td>3</td>
<td>13</td>
</tr>
<tr>
<td>4</td>
<td>15</td>
</tr>
<tr>
<td>5</td>
<td>15</td>
</tr>
<tr>
<td>6</td>
<td>16</td>
</tr>
<tr>
<td>7</td>
<td>18</td>
</tr>
<tr>
<td>8</td>
<td>19</td>
</tr>
<tr>
<td>9</td>
<td>21</td>
</tr>
<tr>
<td>10</td>
<td>23</td>
</tr>
<tr>
<td>11</td>
<td>26</td>
</tr>
<tr>
<td>12</td>
<td>27</td>
</tr>
</tbody>
</table>
LIST OF ILLUSTRATIONS (Cont'd.)

<table>
<thead>
<tr>
<th>Figure</th>
<th>Description</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>13</td>
<td>Composite and levels of two-input AND</td>
<td>28</td>
</tr>
<tr>
<td>14</td>
<td>Typical metalization level for SOS standard cell array</td>
<td>30</td>
</tr>
<tr>
<td>15</td>
<td>Structure of gate protection device</td>
<td>31</td>
</tr>
<tr>
<td>16</td>
<td>CMOS/SOS standard cell test chip</td>
<td>35</td>
</tr>
<tr>
<td>17</td>
<td>Typical drain characteristics</td>
<td>37</td>
</tr>
<tr>
<td>18</td>
<td>Two-input NOR stage delay test circuits</td>
<td>39</td>
</tr>
<tr>
<td>19</td>
<td>Measured two-input NOR chain delay</td>
<td>40</td>
</tr>
<tr>
<td>20</td>
<td>Measured two-input NOR stage delay</td>
<td>41</td>
</tr>
<tr>
<td>21</td>
<td>8-Bit counter delay measurements</td>
<td>42</td>
</tr>
<tr>
<td>22</td>
<td>Measured delay for floating point multiplexer</td>
<td>44</td>
</tr>
<tr>
<td>23</td>
<td>Calculated delays for up/down counter</td>
<td>45</td>
</tr>
<tr>
<td>24</td>
<td>Critical path of 8-bit multiplexed input adder</td>
<td>47</td>
</tr>
<tr>
<td>25</td>
<td>Typical input-output waveforms of 8-bit adder</td>
<td>48</td>
</tr>
<tr>
<td>26</td>
<td>Calculated delays for 9-bit 4 x 2 multiplexer</td>
<td>49</td>
</tr>
<tr>
<td>27</td>
<td>SUMC-CVT CMOS/SOS adder control chip measurement path</td>
<td>51</td>
</tr>
<tr>
<td>28</td>
<td>Typical input-output waveforms of adder control</td>
<td>52</td>
</tr>
</tbody>
</table>

LIST OF TABLES

<table>
<thead>
<tr>
<th>Table</th>
<th>Description</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>Comparison of Average Measured Delay and Computer-Predicted Delay</td>
<td>5</td>
</tr>
<tr>
<td>2</td>
<td>Technical Analysis Parameters</td>
<td>11</td>
</tr>
<tr>
<td>3</td>
<td>Standard Cell Height Components</td>
<td>17</td>
</tr>
<tr>
<td>4</td>
<td>SOS Test Chip (011) Device Parameters</td>
<td>20</td>
</tr>
<tr>
<td>5</td>
<td>Documented CMOS/SOS Standard Cell Family</td>
<td>33</td>
</tr>
<tr>
<td>6</td>
<td>Measured Test Transistor Characteristics</td>
<td>36</td>
</tr>
<tr>
<td>7</td>
<td>SUMC-CVT CMOS/SOS LSI Array Performance Measurements</td>
<td>53</td>
</tr>
</tbody>
</table>
Section 1

INTRODUCTION

In the 1966-1968 time frame, RCA, supported by contract, developed and successfully implemented a standard cell design automation (DA) approach for generating low cost, quick-turnaround custom LSI arrays (Ref. 1). The computer programs, as well as the circuit design and layout, were customized for the dynamic, two-phase, high threshold PMOS ratio logic circuitry. This DA approach has since proven itself in the production of hundreds of custom LSI arrays by different contractors on a number of Government programs.

In 1969, with contract support from NASA-ERC, RCA began a program to extend the standard cell DA capability to the CMOS technology. Because CMOS circuitry is a static technology, this program required a completely different approach from that taken with PMOS for the circuit design and layout as well as for the basic algorithms of the computer programs. This CMOS standard cell DA development was successfully demonstrated in measurements on a CMOS standard cell test chip specifically designed to evaluate the program results (Ref. 2).

To demonstrate the effectiveness of the CMOS DA capability in reducing the cost and design times of systems using these LSI arrays, as well as in providing enhanced performance, RCA designed and built a 16-bit computer -- the SUMC-DV -- with contract support from NASA-MSFC, Huntsville, Alabama. The SUMC-DV uses 10 different LSI array types in a total LSI usage of 55 parts. The entire system was successfully designed and built in one year, including the design, fabrication and testing of the 10 LSI array types using the CMOS standard cell DA capability (Ref. 3).

Following the successful implementation of the SUMC-DV, several application areas for design automation LSI were identified. One of these was a requirement for a high performance, low power, sophisticated fault-tolerant multiprocessor system using 32-bit CPUs. Because of the performance and throughput requirements in this computer development, silicon-gate, beam-lead bulk CMOS technology was selected for its high speed, high packing density, and low power dissipation potential. Support for the development of the DA capability for this technology was received in one of the earlier modifications (mod 6) to Contract NAS12-2233 (described in the Appendix to this report).

As a result, a family of silicon-gate, beam-lead CMOS bulk silicon standard cells was developed. The automatic placement and routing program was modified to be compatible with the polysilicon-gate, beam-lead bulk silicon process. A functional test chip was designed and tested, with only moderate success in achieving the high performance required and anticipated. Because of this limited success with bulk CMOS and because of substantial advances in establishing a mature and stable CMOS/SOS pilot line*, the technology for the multiprocessor was switched to self-aligned, silicon-gate CMOS/SOS to take advantage of its higher performance, higher density and lower power capabilities as well as to capitalize on the CMOS/SOS process development at RCA.

A further modification (mod 11) to Contract NAS12-2233 enabled RCA to develop the standard cell LSI design automation capability for the silicon-gate CMOS/SOS technology (described in Sections 2 and 3 of this report). Work was performed and results obtained in the following areas:

- Improvements and modifications to the automatic placement and routing program for the CMOS/SOS technologies
- Analysis and developments relating to basic cell design and layout
- Description of CMOS/SOS standard cell design and layout
- Description of the CMOS/SOS standard cell chip layout
- Functional description of the cells in the NASA CMOS/SOS standard cell family
- Description of cell data sheets
- Description of the standard cell test chip
- Documentation, evaluation and interpretation of the performance and characteristic data measurements made on the test chip
- Documentation, evaluation and interpretation of the static and dynamic performance measurements which were made on an Independent Research and Development program on five different SUMC-CVT CMOS/SOS chip types designed on Contract NASA-29072 with the standard cell approach.

* RCA Solid State Division had established separate facilities for special purpose, large volume CMOS/SOS applications.
Section 2

WORK REQUIREMENTS AND SUMMARY OF ACCOMPLISHMENTS

A low cost, quick turnaround technique for generating custom CMOS/SOS LSI arrays using the standard cell approach was developed, implemented, tested and validated. This was, in essence, the objective of this program. To achieve this result, a series of intermediate objectives and goals had to be, and were, accomplished. These accomplishments and results include the following:

(1) The Automatic Placement and Routing Computer program was modified and enhanced to ensure compatibility with, as well as to optimize the performance of, the self-aligned silicon-gate CMOS/SOS technology.

(2) A basic cell design topology and guidelines were defined based on an extensive analysis that included circuit, layout, process, array topology and required performance considerations -- particularly high circuit speed. A standard cell height of 7 mils and a minimum pad spacing of 1 mil were established. In addition to meeting the principal design consideration of speed, the cell area of CMOS/SOS was dramatically reduced compared to that for CMOS bulk standard cell design. For example, a two-input NOR requires 79.8 square mils in the metal-gate standard cell family and only 21 square mils in the CMOS/SOS standard cell family with virtually the same size devices -- a reduction of almost four to one.

(3) A family of 11 self-aligned, silicon-gate CMOS/SOS standard cell circuits was developed on the program. Additional cells were added to the family as a result of work done on NASA contract NAS-29072 and RCA Independent Research and Development programs. For each of the 11 logic cell types developed on this program, a user-oriented data sheet has been generated. The data sheet contains the circuit design, logic configuration, input and output capacitance, and dynamic performance design information. The performance of each of these cells was validated initially by computer simulation techniques and later by direct performance measurements on LSI arrays designed for the NASA CMOS/SOS SUMC computer. Similar data were generated for other cells designed on the other programs. These cells, as well as the 11 designed for this program, were incorporated into a separate, convenient, expandable standard cell notebook.

(4) The silicon-gate CMOS/SOS test chip was designed not only to provide experimental validation that the standard cells functioned properly but also, more critically, to determine that their dynamic performance correlated with the predicted delays based on computer simulation. This correlation was verified. For example, the average stage delay for a two-input NOR circuit in a serial logic chain containing eight levels of logic was less than 1.6 ns for the devices with 0.25-mil channel lengths and approximately 2.4 ns for devices with a 0.3-mil channel length. In addition, the test chip provided device and characterization data which were used to update the values of the device model parameters used in the computer simulation program. By such means the accuracy of the speed predictions based on computer simulation techniques is increased.
(5) Since some of the cells developed had not been designed when the test chip was laid out and fabricated, they do not appear on the test chip. These cells were experimentally validated by the measured data taken on five CMOS/SOS custom LSI arrays designed for the SUMC-CVT computer system. These custom LSI arrays varied in complexity from a 150-gate $4 \times 2$ multiplexer array to a 450-gate, multiplexed, 8-bit adder array.

(6) The correlation between the average measured delay and the delay predicted by the computer simulation program was excellent -- well within design tolerances. For example, the difference between the average measured delay and computer-predicted delay for specially selected logic paths on each of the five chips can be seen from Table 1.

**TABLE 1. COMPARISON OF AVERAGE MEASURED DELAY AND COMPUTER-PREDICTED DELAY**

<table>
<thead>
<tr>
<th>Custom Standard Cell CMOS/SOS Array Types</th>
<th>Computer Predicted Delay (ns)</th>
<th>Average Measured Delay (ns)</th>
<th>Measured Average Stage Delay (ns)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Floating Point Multiplexer</td>
<td>27</td>
<td>23</td>
<td>4.3</td>
</tr>
<tr>
<td>Up/Down Counter</td>
<td>46</td>
<td>52</td>
<td>5.8</td>
</tr>
<tr>
<td>8-Bit Adder with Carry Prediction</td>
<td>75</td>
<td>73</td>
<td>6.0</td>
</tr>
<tr>
<td>9-Bit $4 \times 2$ Multiplexer</td>
<td>103</td>
<td>105</td>
<td></td>
</tr>
<tr>
<td>Adder-Multiplexer Control</td>
<td>50</td>
<td>60</td>
<td>8.0</td>
</tr>
<tr>
<td></td>
<td>66</td>
<td>67</td>
<td></td>
</tr>
<tr>
<td></td>
<td>71</td>
<td>76</td>
<td>5.2</td>
</tr>
</tbody>
</table>

As seen in the table, there is generally good correlation between the predicted and measured results, and the differences fall within design tolerance. The major significance of this correlation is that the circuit speeds achieve the dynamic performance objectives of the NASA SUMC-CVT computer system program, for which the circuits were designed.
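
The comparison in Table 1 can also be expressed as a percentage deviation of each measured delay from its predicted value. The following is a minimal sketch of that arithmetic. The predicted and measured pairs are copied from Table 1 in row order; the function name and the percentage metric are assumptions made for illustration and are not taken from the report.

```python
# (predicted_ns, measured_ns) pairs copied from Table 1, in row order.
TABLE_1_DELAYS_NS = [
    (27, 23),
    (46, 52),
    (75, 73),
    (103, 105),
    (50, 60),
    (66, 67),
    (71, 76),
]


def percent_deviation(predicted_ns, measured_ns):
    """Deviation of the measured delay from the predicted delay, in percent."""
    return 100.0 * (measured_ns - predicted_ns) / predicted_ns


# Illustrative metric only; the report does not define a numeric tolerance.
deviations = [percent_deviation(p, m) for p, m in TABLE_1_DELAYS_NS]
mean_abs_deviation = sum(abs(d) for d in deviations) / len(deviations)

for (p, m), d in zip(TABLE_1_DELAYS_NS, deviations):
    print(f"predicted {p:3d} ns, measured {m:3d} ns, deviation {d:+.1f}%")
print(f"mean absolute deviation: {mean_abs_deviation:.1f}%")
```
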
Section 3

TECHNICAL DESCRIPTION

A. CMOS/SOS PR2D AUTOMATIC PLACEMENT AND ROUTING PROGRAM

The changes, modifications and improvements introduced into the PR2D automatic placement and routing program to obtain compatibility with and optimization of the CMOS/SOS technology are as follows:

(1) Power Distribution, Chip Interior. The power distribution routine is similar to the double-sided metal ground in the bulk CMOS version of the PR2D program. Each cell row has VDD and ground routed through the cell row and connected on the side to the peripheral power. The VDD is routed at the top of the cell row (midway between back-to-back cell rows) and routed on side row number 1 to the peripheral power. The ground is routed at the bottom of the cell row (both top and bottom of back-to-back cell rows), joined at the right side of the cell rows, and routed to the peripheral power on side row number 2. For odd cell rows, the ground is routed at the top of the cell row and via side row number 1 to the peripheral power. The VDD is routed at the bottom of the cell row and via side row number 2 to the peripheral power.

(2) Power Distribution, Peripheral. The SOS option for the peripheral power is an interrelated set of two nonintersecting open double "0" patterns. The left side of the chip (row number 1) has the power closer to the chip interior. The right side of the chip (row number 2) has the VDD closer to the chip interior. This configuration allows VDD and ground to be all-metal lines from the power bonding pads to the cell within the chip. This peripheral power distribution is required in order to properly distribute the ground and VDD to the cell rows.

(3) Power Bonding Pad Locations. In this modification the PR2D program places the ground power bonding pad in the lowest position on row number 1 in the lower left-hand corner of the chip, and the VDD power bonding pad in the highest position on row number 2 in the upper right-hand corner of the chip. These locations are required because of the peripheral power distribution.

(4) Zero Channel Routing. Because of the SOS standard cell design, zero channel routing is permitted. The SOS version of the PR2D program has been modified to allow zero channel routing. This feature will decrease the number of horizontal channels required by one per cell row.
(5) Zero Channel Tunnel Ends. To provide compatibility with the standard cell design, zero level tunnel ends are required when the routing to the pin of a cell is metal. This tunnel end will connect the metal routing to the tunnel level for connectivity to the cell interior.

(6) Chip Borders. This SOS option of the PR2D program will generate the proper border requirements on all mask levels for the SOS technology.

(7) Low Profile Metal. The low profile metal option has been modified for the SOS technology. The modification will not allow low profile metal in a vertical channel which contains another node. This feature, while still allowing low profile metal, will not increase the node crossover. The low profile metal option has also been further debugged. In addition, a new feature -- removal of unneeded low profile metal -- has been added. This will reduce the number of artwork instructions generated.

(8) New Features. One of the new features added in the SOS modification of the PR2D program is the modified class 2 routing from the odd cell row to the top bonding pads. Normally this would be a class 3 route. When this option is exercised, the program will generate class 2 routing instead of class 3. This feature improves routing by not routing these nodes on the side surfaces. However, it is required that all such routes be to the center bonding pads of the top pads; that is, no top bonding pad which does not go to the odd cell row can be embedded within those bonding pads which do go to the odd cell row.

(9) General. Additional debugging of the basic program has been done. This should improve even further the quality of the chip design.

B. SILICON-GATE CMOS/SOS TECHNOLOGY

Although the CMOS/SOS standard cell library was designed and laid out for the double-epitaxial, silicon-gate SOS process, SOS standard cell arrays may be fabricated with either the double-epitaxial or the single-epitaxial technology. The mask set for any of the processes is extracted from a common set of masks.

The silicon-gate, single-epitaxial processes actually require that the first two photomasks, used in the double-epitaxial process, be superimposed to form a new photomask. The new mask is the logical "or" of the first two. This is routinely done in the photomask shop whenever the single-epitaxial process is elected. All other photomasks are identical for the two processes. The similar nature of the single- and double-epitaxial fabrication techniques along with their pilot line status virtually guarantees the continuance of identical design/layout rules for the two technologies. This, in turn, guarantees that the photomasks for the double-epitaxial silicon-gate standard cell family will continue to be compatible with the single-epitaxial processes. Characteristics common to both processes are described in the following paragraphs.
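
The mask superposition described above for the single-epitaxial process can be illustrated with a small sketch. It assumes, purely for illustration, that each photomask is represented as a grid of 0/1 cells; real mask artwork is polygon geometry, and the grid contents, variable names and `combine_masks` helper below are hypothetical and not part of the PR2D program or the photomask shop procedure.

```python
def combine_masks(mask_a, mask_b):
    """Return the cell-by-cell logical OR of two equally sized 0/1 grids."""
    same_rows = len(mask_a) == len(mask_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(mask_a, mask_b)):
        raise ValueError("masks must have identical dimensions")
    return [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(mask_a, mask_b)]


# Hypothetical 3 x 6 grids; 1 marks an area where an island is defined.
p_island_mask = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
n_island_mask = [
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

# Single-epitaxial island mask: every area present on either original mask.
single_epi_island_mask = combine_masks(p_island_mask, n_island_mask)
for row in single_epi_island_mask:
    print(row)
```
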
1. Insulating Substrate

The combination of isolated thin film islands and an insulating sapphire substrate is the principal reason that this technology delivers a maximum level of performance, and yet provides a high gate density on an array at an extremely low power level.

2. Guard Band Elimination

Because this technology provides complete isolation between devices when required (the sapphire substrate eliminates the possibility of field inversion), anti-field inversion and anti-parasitic device techniques, such as P+ and N+ diffused guard bands, field shield and counter diffusion, are not required. The elimination of diffused guard bands allows a significant reduction in the area of circuit layouts with no performance penalties. Layouts show a 3-to-1 to 4-to-1 area reduction for standard cells using this technology compared to the cell area using the standard CMOS metal-gate, bulk silicon technology.

3. Minimum Capacitance Technology

The self-aligned, silicon-gate CMOS/SOS technology eliminates virtually all junction and interconnection capacitances. The only junction capacitance which exists in this process is associated with the sidewalls of the diffused junction in the thin film islands. Except for the input capacitance, which depends on transistor width and is near minimum because of the polysilicon-gate self-alignment technique, the only other significant capacitance is the coupling or crossover capacitance between the metal and polysilicon interconnect routing.

The principal result of this low inherent capacitance is to enhance on-chip circuit performance.

Alternatively, speed can be traded for density when minimum size devices are used. These devices, constrained principally by the topological design rules, can be used for circuit design with little loss in performance.

4. Maximum Speed/Area Ratio

The propagation delay of this technology in an LSI environment does not vary extensively with device size. Therefore, as is common with other MOS technologies, speed and performance are not seriously sacrificed by reducing the device size and the corresponding area. The virtual elimination of the junction and substrate capacitance is the principal reason that the CMOS/SOS technology has such a high speed/area ratio.
5. Elimination of Substrate Effect

Because the sapphire substrate is essentially an insulator, the increase in threshold voltage as the source-to-substrate reverse bias is increased, which most MOS technologies experience, is eliminated in this technology. Therefore, even when devices are stacked, the size of the devices need not be increased because of the source-to-substrate effect.

6. Simplicity of Process and Design Procedure

Although the silicon-gate CMOS/SOS process has many unusual features, it requires only six* process photomask steps plus a passivation mask. This is identical in number to the standard CMOS metal-gate technology. Thus, among the newer high performance, high density technologies, the silicon-gate CMOS/SOS has an excellent potential for maturing into a low cost, high volume technology. Over the past year, RCA has delivered a large volume of a CMOS/SOS LSI array to an industrial customer. RCA has also announced several CMOS/SOS LSI products and will be offering at least one each month in 1975.

In terms of design procedures and layout rules, elimination of the guard bands makes this technology easier to use.

A cross-section of the silicon-gate, double-epitaxial topology appears in Fig. 1. The numbers in parentheses indicate photomask numbers. The levels are:

- Level 1: P-Epitaxial Island Definition
- Level 2: N-Epitaxial Island Definition
- Level 3: Polysilicon Gate and Interconnect Definition
- Level 4: N-Type Diffusion Definition
- Level 5: Contact Hole Definition
- Level 6: Metal Interconnect Definition
- Level 7: Passivation Mask Opening Definition

* The silicon-gate, double-epitaxial SOS process requires six process photomasks; the silicon-gate, deep-depletion SOS process requires five process photomasks.
Fig. 1. Silicon-gate CMOS on sapphire process, double epitaxy.
C. ANALYSIS AND BASIC CELL DESIGN

1. Technical Analysis

This section discusses the analysis involved in defining the basic device geometry and the standard cell height for the selected double epitaxial, P+ poly, CMOS/SOS process.

Initial considerations center on capturing a nominal set of processing parameters and design rules for the process. The parameters used in the analysis are listed in Table 2. The saturation currents, for 10.0-V operation, resulting from these parameters are 2.0 mA/mil and 1.2 mA/mil for the N- and P-type transistors, respectively. Although the N- to P-transistor current ratio is 1.67 (2.0/1.2), the actual transistor geometry design ratio was fixed at 1.8:1, which, based on past experience, is considered a conservative estimate.

TABLE 2. TECHNICAL ANALYSIS PARAMETERS

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>$V_{TN}$</td>
<td>2.3 V</td>
</tr>
<tr>
<td>$N_A$</td>
<td>$1 \times 10^{15} / \text{cm}^3$</td>
</tr>
<tr>
<td>$\mu_n$</td>
<td>42 $\mu \, \text{cm}^2 / \text{volt-seconds}$</td>
</tr>
<tr>
<td>$L_{\text{eff}}$</td>
<td>0.20 mil (with 0.30-mil mask)</td>
</tr>
<tr>
<td>$E_{OX}$</td>
<td>3.9</td>
</tr>
<tr>
<td>$E_S$</td>
<td>11.7</td>
</tr>
<tr>
<td>$T_{OX}$</td>
<td>1100 Å (gate)</td>
</tr>
<tr>
<td>$T_{OX}'$</td>
<td>6000 Å (between poly and metal)</td>
</tr>
<tr>
<td>$V_{TP}$</td>
<td>-2.1 V</td>
</tr>
<tr>
<td>$N_D$</td>
<td>$3 \times 10^{15} / \text{cm}^3$</td>
</tr>
<tr>
<td>$\mu_p$</td>
<td>251 $\mu \, \text{cm}^2 / \text{volt-seconds}$</td>
</tr>
</tbody>
</table>
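
As a rough cross-check of the P-type saturation current quoted above, the Table 2 values can be put through the simple long-channel square-law expression $I = \tfrac{1}{2}\,\mu\,C_{OX}\,(W/L_{\text{eff}})\,(|V_G| - |V_T|)^2$ with $C_{OX} = \varepsilon_{OX}\varepsilon_0/T_{OX}$. This is a minimal sketch under stated assumptions: the report does not say which device model it used, the 3.9 entry is taken to be the relative permittivity of the gate oxide, the 251 entry is taken to be the hole mobility in cm²/V·s, and the function and constant names are hypothetical.

```python
EPS0_F_PER_CM = 8.854e-14   # permittivity of free space, F/cm
CM_PER_ANGSTROM = 1.0e-8


def saturation_current_per_mil(mobility_cm2_per_vs, eps_ox, t_ox_angstrom,
                               l_eff_mil, v_gate, v_threshold):
    """Square-law saturation current, in amperes, for a device 1 mil wide.

    Assumed model (not stated in the report): long-channel square law.
    """
    c_ox = eps_ox * EPS0_F_PER_CM / (t_ox_angstrom * CM_PER_ANGSTROM)  # F/cm^2
    width_to_length = 1.0 / l_eff_mil   # W = 1 mil, so the mil units cancel
    overdrive = abs(v_gate) - abs(v_threshold)
    return 0.5 * mobility_cm2_per_vs * c_ox * width_to_length * overdrive ** 2


# P-type device with Table 2 values at 10.0-V operation.
i_p = saturation_current_per_mil(251.0, 3.9, 1100.0, 0.20, 10.0, -2.1)
print(f"P-type saturation current: {i_p * 1e3:.2f} mA/mil")

# Ratio of the two saturation currents quoted in the text.
n_to_p_current_ratio = 2.0 / 1.2
print(f"N-to-P current ratio: {n_to_p_current_ratio:.2f}")
```

Under these assumptions the P-type result comes out close to the 1.2 mA/mil figure quoted in the text.
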
To determine the on-chip circuit speeds of a generalized combinatorial LSI array using this process, a chain of two-input NOR circuits was analyzed. Each two-input NOR cell was loaded with a NOR circuit and a large inverter. The N- and P-transistors comprising the inverter were chosen to be large enough that the total load on each NOR circuit would be the approximate equivalent of 3.6 two-input NOR circuits. A fanout of 3.6 for the NOR circuits is a reasonably conservative estimate of the average fanout of a generalized logic array. One result of selecting this value for the fanout is that the circuit speeds will be lower than would be expected had some special purpose functional logic array with lower fanout been selected. However, a fanout in excess of three is considered typical in general purpose digital systems where extensive combinatorial and sequential logic are used in a nonregular way.

Care was taken to consider those features of the current DA system that would influence effective on-chip circuit speeds. Accordingly, the intercell resistance associated with the polysilicon interconnects was included in the evaluation circuit. Similarly, the crossover capacitance introduced by the metal/polysilicon crossovers in the intercell connection area was included. Figure 2 shows the generalized evaluation circuit with the intercell resistance and crossover capacitance represented as lumped parameters.

Fig. 2. Evaluation circuit for CMOS/SOS standard cell.
Although the length of the polysilicon interconnect between two cells can range from 0 mil to perhaps 100 mils, past experience with more than 50 custom LSI arrays indicates that 20 mils is a fairly representative length. (Present plans call for polysilicon runs to be 0.80 mil wide.) With a sheet resistance of 70 ohms/square, the 20 mils of polysilicon interconnect converts to a 1.8-kilohm resistance. For completeness, the effects of 46-mil (4 kΩ) and 6-mil (0.5 kΩ) polysilicon interconnects were also investigated.
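
The resistance figures above follow from counting squares along the run (length divided by width) and multiplying by the 70 ohms/square value. A minimal sketch of that conversion is given below; the function and constant names are hypothetical and chosen only for illustration.

```python
SHEET_RESISTANCE_OHMS_PER_SQUARE = 70.0   # value quoted in the text
RUN_WIDTH_MIL = 0.80                      # planned polysilicon run width


def polysilicon_resistance_ohms(length_mil,
                                width_mil=RUN_WIDTH_MIL,
                                sheet_ohms=SHEET_RESISTANCE_OHMS_PER_SQUARE):
    """Resistance of a straight polysilicon run: number of squares times ohms/square."""
    squares = length_mil / width_mil
    return squares * sheet_ohms


# The three run lengths considered in the analysis.
for length_mil in (6, 20, 46):
    kilohms = polysilicon_resistance_ohms(length_mil) / 1000.0
    print(f"{length_mil:2d} mil run: {kilohms:.2f} kilohm")
```

The three results (about 0.53, 1.75 and 4.03 kilohms) round to the 0.5, 1.8 and 4 kΩ values used in the analysis.
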

The parasitic capacitance introduced by each metal/polysilicon crossover in the intercell connection area is 0.01 pF of either coupling or additional loading capacitance. Examination of past LSI array designs indicates that a given connection between two cells may have no metal/polysilicon crossovers or hundreds of them. Although the distribution is quite wide, an average of 30 crossovers per cell output was assumed. Further statistical analyses of current custom LSI arrays showed that 40 is a more accurate estimate of this average. The design will be modified to include these new statistics. The effects of the crossovers were considered in two ways: first, as an additional lumped capacitance load (as shown in Fig. 2), and second, as a means of injecting crosstalk or noise into a lightly loaded circuit (as shown in Fig. 3).

Fig. 3. Evaluation circuit for crosstalk due to metal/polysilicon overlap.

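
The lumped crossover load follows directly from the 0.01 pF per crossover figure. The sketch below evaluates it for the assumed average of 30 crossovers and for the revised estimate of 40. It also forms the simple product of the nominal 1.8-kilohm interconnect resistance and the lumped load; that product is an illustrative figure only, not a delay reported in this document, and all names in the code are hypothetical.

```python
CROSSOVER_CAPACITANCE_PF = 0.01   # per metal/polysilicon crossover, from the text


def lumped_crossover_load_pf(crossovers):
    """Total crossover capacitance treated as a single lumped load, in pF."""
    return crossovers * CROSSOVER_CAPACITANCE_PF


def rc_product_ns(resistance_kilohm, capacitance_pf):
    """R times C; kilohms multiplied by picofarads gives nanoseconds."""
    return resistance_kilohm * capacitance_pf


for crossovers in (30, 40):
    load_pf = lumped_crossover_load_pf(crossovers)
    print(f"{crossovers} crossovers: {load_pf:.2f} pF, "
          f"R*C with 1.8 kilohm = {rc_product_ns(1.8, load_pf):.2f} ns")
```
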
With the parasitic resistive and capacitive models defined, several computer simulation analyses were made. Considering interconnection resistance alone, a separate run was made for four versions of the circuits shown in Fig. 2. For each case, the transistor geometries (specifically channel widths) were adjusted to be $1/3 \times$, $1/2 \times$, $1 \times$, and $2 \times$ those shown in Fig. 2. As the transistor sizes were modified, the parasitic interconnect resistance was kept constant. This was repeated for resistance values of $0.5 \, \text{k}\Omega$, $1.8 \, \text{k}\Omega$, and $4.0 \, \text{k}\Omega$. The results, shown in Fig. 4, clearly indicate that for the case of no interconnect capacitance, increased transistor geometries deteriorate circuit speed. This is shown analytically in Fig. 5 which shows the load charging time constants for various device widths. $R_T$ and $C_T$ are the intrinsic output resistance and input capacitance of the silicon-gate CMOS/SOS devices. If the effect of the coupling capacitance between crossing signal lines on the array is assumed to be negligible, then $C_T$ is essentially the only effective on-chip capacitance. The resistor $R_C$ represents the interconnection resistance associated with polysilicon interconnections between the cells. Because the polysilicon crosses over the sapphire substrate, the capacitance to substrate is essentially zero. Therefore, a net on the array can be represented as shown in the equivalent circuit in Fig. 5. If we assume a reference device size and normalize it, the load time constant as shown in the first row of the table is $(R_T + R_C) \, C_T$. Doubling the device width, as shown in row 2, yields a time constant $(R_T + 2R_C) \, C_T$ where $R_T$ is halved and $C_T$ doubled with respect to case 1. As shown in the third column, the time constant has increased with respect to case 1 by the quantity $R_C \, C_T$. 
Similarly, case 3 shows that halving the device width changes the time constant, with respect to case 1, by $-R_C \, C_T/2$. The delay associated with the load time constant therefore decreases by $R_C \, C_T/2$, corresponding to an increase in speed compared to case 1. Thus, the presence of the interconnection resistance $R_C$ produces the result that the smaller devices provide the higher speed and performance, if a zero signal coupling capacitance is assumed. For the case where $R_C = 0$, the time constant, and therefore the switching speed, is constant and independent of the device width. On the basis of this result alone, transistor sizes should be reduced to an absolute minimum to enhance circuit speeds.
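The scaling argument of Fig. 5 can be checked numerically. The following is a minimal sketch, assuming the single time-constant model described above, in which a width scale factor `k` divides $R_T$, multiplies $C_T$, and leaves $R_C$ unchanged. The function name and the normalized values are illustrative and are not taken from the report.

```python
def load_time_constant(k, r_t=1.0, r_c=1.0, c_t=1.0):
    """Load time constant for a device whose width is k times the reference.

    Assumed model (from the Fig. 5 discussion): output resistance scales
    as R_T / k, input capacitance as k * C_T, and the interconnect
    resistance R_C does not change with device width.
    """
    return (r_t / k + r_c) * (k * c_t)


# Normalized, illustrative values (not measured data).
R_T, R_C, C_T = 1.0, 0.5, 1.0

reference = load_time_constant(1.0, R_T, R_C, C_T)  # (R_T + R_C) C_T
doubled = load_time_constant(2.0, R_T, R_C, C_T)    # (R_T + 2 R_C) C_T
halved = load_time_constant(0.5, R_T, R_C, C_T)     # (R_T + R_C / 2) C_T

print(doubled - reference)  # equals +R_C * C_T
print(halved - reference)   # equals -R_C * C_T / 2
```

Setting `r_c=0.0` in this sketch gives the same time constant for every `k`, which matches the statement that the switching speed is independent of device width when $R_C = 0$.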

With an assumed 30 crossovers/interconnect ($30 \times 0.01 = 0.3 \, \text{pF}$), the evaluation circuit of Fig. 2 was analyzed for the four transistor geometries ($1/3 \times$, $1/2 \times$, $1 \times$, and $2 \times$) and two assumed parasitic resistance values ($1.8 \, \text{k}\Omega$ and $0.5 \, \text{k}\Omega$).

When both the parasitic resistance and the crossover capacitance are included in the analysis, the curves of Fig. 4 continue to show maximum performance for reduced device size. The results show that optimum performance will be obtained for standard transistor geometries between $1/2 \times$ and $1 \times$ those in Fig. 2. Figure 6 illustrates this relationship. Past measurements on LSI arrays indicate nominal polysilicon interconnect resistances of $1.8 \, \text{k}\Omega$. Therefore, device sizes somewhat in excess of $1/2 \times$ would be appropriate to the initial design phase.

* The length of the polysilicon runs and the number of crossovers per interconnect are essentially functions of LSI array interconnection complexity and may therefore be considered essentially independent of transistor size.
Fig. 4. Pair delay vs transistor channel width.

Fig. 5. Speed vs device width.
Figure 6 also suggests that devices designed within the 1/2 X to 1X region will differ in nominal stage delay by 1/2 ns, at most. In other words, designs falling within the 1/2 X to 1X region will be optimized to within the accuracy limits of this analysis.

However, the impact of signal crosstalk must still be considered. Small devices with their inherently low input capacitance may, conceivably, be sensitive to the fixed coupling capacitors associated with the metal/polysilicon crossovers in the interconnect area. These crossovers may number from 0 to more than 100 in a typical array. As a means of estimating the effects of the crossover coupling, the circuit shown in Fig. 3 was analyzed. Two minimum size inverters, isolated by a 2-kΩ resistor (representing the resistive effects of more than 20 mils of polysilicon interconnection), were subjected to the simulated effects of 30 crossover lines switching simultaneously in the same direction. The switching waveforms were simulated by the oversize inverter I2. Noise spikes were observed at node A. For 30 and 60 crossover lines switching simultaneously, the noise spikes observed at node A were 3.0 V and 5.0 V, respectively. The short duration of the spikes at A (less than 6 ns) and the response time of inverter A-B combined to produce a maximum voltage swing of 0.5 V at node B. These favorable results, in addition to the conservative nature of the evaluation circuit (specifically, minimum size inverters experiencing the crosstalk of 60 simultaneously switching crossovers, etc.), led to the design of a standard cell family at the 1/2 X scale.

![Diagram](image)

Fig. 6. Pair delay vs transistor channel width. (Fanout equivalent to 3.6 two-input NORs.)
The analysis of the required standard cell heights started with a survey of the existing bulk CMOS standard cell family.* Figure 7 illustrates one such cell. Table 3 summarizes the typical cell height usage for the bulk family. As can be seen, out of 14.0 mils of cell height, the bulk family uses 3.8 mils with the required guard bands and spacings. An additional 1.0 mil is consumed by the cell I/O pads. The remaining 9.0 mils of cell height is used for device construction and intracell connections and can be apportioned nominally into 6.5 mils and 2.5 mils, respectively.

To determine the SOS cell height: If the area allotted to intracell connection is kept at 2.5 mils (for the new cell family), and if all SOS transistors are designed within the 1/2 X to 1X ratio (as determined earlier), an additional 4.0 mils of cell height will be needed for device design. Then allowing only 0.1 mil for the new SOS I/O pad height, we conclude that an SOS cell family, comparable in effective device and interconnection area to the bulk family, may be designed around the 7.0-mil height.

<table>
<thead>
<tr>
<th>Bulk CMOS Cell Height Statistics (mils)</th>
<th>Proposed SOS Cell Height Statistics (mils)</th>
<th>Cell Height Consumed By</th>
</tr>
</thead>
<tbody>
<tr>
<td>3.8</td>
<td>0.4</td>
<td>Guard bands</td>
</tr>
<tr>
<td>1.0</td>
<td>0.1</td>
<td>I/O Pads</td>
</tr>
<tr>
<td>6.5</td>
<td>4.0</td>
<td>Devices</td>
</tr>
<tr>
<td>2.5</td>
<td>2.5</td>
<td>Device Interconnection</td>
</tr>
<tr>
<td>14.0</td>
<td>7.0</td>
<td>Total</td>
</tr>
</tbody>
</table>
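The proposed SOS cell height is simply the sum of its four contributions. A minimal sketch of that bookkeeping follows, using the SOS column of the table above; the dictionary and function names are illustrative only.

```python
# Cell-height budget in mils, transcribed from the SOS column of the table.
sos_budget_mils = {
    "guard bands": 0.4,
    "I/O pads": 0.1,
    "devices": 4.0,
    "device interconnection": 2.5,
}


def total_height(budget):
    """Sum the height contributions, rounded to 0.1 mil."""
    return round(sum(budget.values()), 1)


def share(budget, *items):
    """Fraction of the total height consumed by the named items."""
    return sum(budget[name] for name in items) / sum(budget.values())


print(total_height(sos_budget_mils))  # 7.0
print(share(sos_budget_mils, "devices", "device interconnection"))
```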

2. Technical Analysis Techniques Verification

The validation and updating of both simulation techniques and device model parameters are of extreme importance in generating performance optimization analyses and delay-transition time data. In either case, it serves as the vital link between theoretical analysis and measured performance — in short, a calibration mechanism.

Figure 8(a) shows one of the circuits implemented on the ATL-011 test chip. It is one of the principal circuits used to validate the accuracy of the device model and simulation technique. The circuit model for the logic chain is shown in Fig. 8(b). The 1310 and 1520 buffer cells are designed with the 0.30-mil polygate width. Notice in Fig. 8(b) that an additional 1310 cell has been added to the front of the test circuit.

* CMOS array design techniques, Quarterly Report No. 2, Contract No. NAS12-2233.
Fig. 7. Two-input NOR layout.

Fig. 8. Two-input NOR test chain and simulated circuit: (a) two-input NOR test chain; (b) circuit model for two-input NOR test chain.

The program generated pulse is applied to node 3. The output of that circuit, node 5, whose waveform more closely approximates actual on-chip waveforms, becomes the input waveform for purposes of computing propagation delay. The model for the transistor is a sophisticated series of nonlinear equations that is incorporated into the circuit simulation program. The values of the SOS processing parameters used in the simulation were taken from representative samples of ATL-011 test chips and are shown in Table 4. These parameters were used, together with the model, to generate the propagation delay data for comparison with measured values.

<table>
<thead>
<tr>
<th>Symbol</th>
<th>Description</th>
<th>Value Used in Simulation</th>
</tr>
</thead>
<tbody>
<tr>
<td>$V_{TP}$</td>
<td>Threshold Voltage P-Transistor</td>
<td>-1.5 V</td>
</tr>
<tr>
<td>$V_{TN}$</td>
<td>Threshold Voltage N-Transistor</td>
<td>+1.5 V</td>
</tr>
<tr>
<td>$L_{eff}$</td>
<td>Effective Channel Length</td>
<td>0.2 mils</td>
</tr>
<tr>
<td>$\mu_p$</td>
<td>Effective Surface Hole Mobility ($V_{GS} = 0$)</td>
<td>400 cm$^2$/V·s</td>
</tr>
<tr>
<td>$\mu_n$</td>
<td>Effective Surface Electron Mobility ($V_{GS} = 0$)</td>
<td>500 cm$^2$/V·s</td>
</tr>
<tr>
<td>$N_D$</td>
<td>Donor Density</td>
<td>$1.5 \times 10^{15}$/cm$^3$</td>
</tr>
<tr>
<td>$N_A$</td>
<td>Acceptor Density</td>
<td>$1.5 \times 10^{15}$/cm$^3$</td>
</tr>
<tr>
<td>$T_{OX}$</td>
<td>Gate Oxide Thickness</td>
<td>1200 Å</td>
</tr>
<tr>
<td>$Slope \ N$</td>
<td>(In Saturation) Slope of Drain Characteristic $N$</td>
<td>0.02</td>
</tr>
<tr>
<td>$Slope \ P$</td>
<td>(In Saturation) Slope of Drain Characteristic $P$</td>
<td>0.01</td>
</tr>
<tr>
<td>$C_{GS}$</td>
<td>Capacitance (Gate-to-Source)</td>
<td>0.2 pF/mil$^2$</td>
</tr>
<tr>
<td>$C_{GD}$</td>
<td>Capacitance (Gate-to-Drain)</td>
<td>0.2 pF/mil$^2$</td>
</tr>
<tr>
<td>$R_P$</td>
<td>Resistance of Polysilicon Strip</td>
<td>60 ohms/□</td>
</tr>
</tbody>
</table>

In the first and third columns of Fig. 9(b), the measured propagation delays at 10 V of the two-input NOR test chain are listed for test chips (TC) 101 and 105, respectively. The 15-ns delay from $V_{in}$ to $V_{out}$ for TC 105 is a repeat of the data presented by the actual waveforms in Fig. 9(a). Similarly, the 17.5 ns is the corresponding measured propagation delay for TC 101. Column 2, the simulated time delay, shows the results of the computer simulation for the delay through the two-input NOR chain using the model and parameters described. As shown, the predicted time delay through the complete test chain is 18.9 ns vs the measured delay of 17.5 ns for TC 101 and 15 ns for TC 105. Thus, the simulation in this case is somewhat conservative, or pessimistic, in the results it generates. Because of these high speeds, even a 1-ns deviation is significant and measurable. Since it is good design practice to operate with conservative estimates, the parameters that produced the 18.9-ns delay were the ones used to produce the cell delay data and characterization curves.

(a) Waveforms of measured delay.

(b) Measured and simulated results.

<table>
<thead>
<tr>
<th>Path</th>
<th>TC 101 Measured Delay (ns)</th>
<th>Simulated Delay (ns)</th>
<th>TC 105 Measured Delay (ns)</th>
</tr>
</thead>
<tbody>
<tr>
<td>$V_{IN} \rightarrow V_{OUT}$</td>
<td>17.5</td>
<td>18.9</td>
<td>15</td>
</tr>
<tr>
<td>$V_{2} \rightarrow V_{OUT}$</td>
<td>7.7</td>
<td>8.4</td>
<td>6.7</td>
</tr>
<tr>
<td>$V_{7} \rightarrow V_{3}$</td>
<td>6.4</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

**Average 6-stage delay (TC 101)** = $(17.5 - 7.7)/6 \approx 1.6$ ns

**Average 4-stage delay** = $6.4/4 = 1.6$ ns

**Average 6-stage delay (TC 105)** = $(15 - 6.7)/6 \approx 1.4$ ns

Fig. 9. Validation of simulation techniques.
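The degree to which the simulation is pessimistic can be expressed as a percentage of each measured chain delay. The following is a small sketch using only the three full-chain delays quoted in the text; the function name is illustrative.

```python
def percent_deviation(simulated_ns, measured_ns):
    """Signed deviation of the simulated delay from a measured delay, in percent."""
    return 100.0 * (simulated_ns - measured_ns) / measured_ns


SIMULATED_NS = 18.9                               # full-chain simulated delay
MEASURED_NS = {"TC 101": 17.5, "TC 105": 15.0}    # full-chain measured delays

for chip, measured in MEASURED_NS.items():
    dev = percent_deviation(SIMULATED_NS, measured)
    print(f"{chip}: simulated is {dev:+.1f}% relative to measured")
```

A positive result means the simulation predicts a longer delay than was measured, which is the conservative direction described above.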

D. CMOS/SOS STANDARD CELL DESIGN LAYOUT

Elements of the basic standard cell and device design have been discussed in Section I. From performance considerations, we have defined basic device (n-type) sizes to be within the 1/2 X to 1X (0.5 to 1.0 mil) range. By analyzing typical cell height usage in earlier standard cell systems, and making the appropriate modifications for the silicon-gate SOS technology, we have defined a new standard cell height for the SOS family. However, not all cell geometry considerations have been discussed. In this section design considerations relating to the other cell geometries and topologies are discussed.

1. Cell Power and Ground Distribution

Contrasting with the single ground bus of the bulk technology standard cell systems, both power (+V) and ground potentials are now distributed via 0.4-mil metal bus lines. The absence of a conducting substrate in SOS necessitates this scheme. The power line is shared by two rows of cells. The cell rows are connected in a back-to-back fashion with the power line between them. Ground connections for the cells are accomplished by passing a ground bus across the lower edge (through the origin) of each cell. Figure 10 is a Calcomp composite plot of a two-input AND cell. The power and ground bus lines are broken for clarity prior to passing through the cell.

2. Cell I/O Pads

Also unique to the SOS standard cell system is a nonmetalized cell I/O pad. In essence, all logic connections to the cells are accomplished via the p-doped polysilicon tunnels. The tunnels are 0.70 mil wide and spaced at multiples of 1.0 mil. They pass under the ground bus and extend to 0.70 mil below the cell origin. Polysilicon was chosen for this connection because:

1. The majority of a cell's pads are input pads, and therefore inherently polysilicon anyway.

2. It provides a low resistance cell connection while still permitting the 0.40-mil ground bus to pass over all of the I/O leads without interference. Figure 10 illustrates the two-input AND cell with its three I/O pads. They appear under the extended ground bus. The two rightmost pads are input connections.
An additional feature of this all-polysilicon-pad approach is that all metal connections to a cell must be supplied with a metal-polysilicon contact arrangement. Such an arrangement is automatically provided by the software, in the form of a "tunnel-end" cell, whenever the metal-polysilicon interconnect is required. This feature permits the routing software to make efficient use of the first (closest to the cell pads) wiring channel by running a metal interconnect over all intermediate pads. (In the bulk standard cell system, the use of this first channel was blocked because the metal-tunnel interconnect was an inherent part of all cell I/O pads.)

3. Special Layout-Design Considerations for Standard Cells

After design and before use, each standard cell must be reviewed not only for layout rule violations internal to the cell boundaries, but also for all possible rule violations that may occur when any of the cells are placed next to each other. A simple way to meet this exhaustive, and otherwise impractical, requirement is to define a standard interface for the top, bottom, and both sides of each cell that "spells out" the proximity relationships between all edges of the cell and all six photomask levels. With such an interfacing scheme, all possible intercell rule violations are automatically avoided. A set of rules to govern cell layout design at the cell edges is suggested as follows:

Rule 1. Standard cell height is 7.0 mils and is measured from the center of the ground bus to the center of the $V_{DD}$ bus.

Rule 2. The ground bus and $V_{DD}$ bus are each 0.4 mil wide and are centered on the bottom and top lines of the cell, respectively (the bottom of the cell being defined as that portion closest to the cell I/O polysilicon pads). The ground and $V_{DD}$ bus lines are not part of the standard cell; both bus lines will be automatically placed when the intracell wiring is done.

Rule 3. Cell width is an integer multiple of 1.0 mil measured between the origin and antiorigin of the cell.

Rule 4. All polysilicon or metal must be at least 0.2 mil from the left edge (defined as a vertical line drawn from the origin) of the cell and 0.1 mil from the right edge (defined as a vertical line drawn from the antiorigin) of the cell.

Rule 5. All N and P epitaxial silicon must be at least 0.25 mil from the left edge of the cell and 0.15 mil from the right edge of the cell.

Rule 6. The N+ diffusion mask must be at least 0.05 mil from the left edge of the cell and can overlap the right edge of the cell by no more than 0.05 mil.

Rule 7. The cell I/O pads are treated as tunnel (polysilicon) ends and are inserted into the cell design as stubs of polysilicon extending 0.7 mil below the origin. The I/O pads are placed on 1.0-mil centers and must be at least 0.7 mil wide from the $y = -0.7$ point to the $y = -0.4$ point. At and above the $y = -0.4$ point, the polysilicon stub can be made any width consistent with the SOS design rules.

Rule 8. The N- epitaxial silicon (level 2) may run up to the center of the $V_{DD}$ bus if it is to be electrically connected to $V_{DD}$; otherwise, it must remain 0.30 mil below the center line.
Rule 9. The $P^+$ epitaxial silicon (level 1) may run to within 0.40 mil of the $V_{DD}$ bus center line. (This is either $y = 4.60$ mils or $y = 6.60$ mils, depending on cell family.)

Rule 10. Polysilicon (level 3) may run to within 0.20 mil of the $V_{DD}$ center line. (This is either $y = 4.80$ mils or $y = 6.80$ mils, depending on cell family.)

Rule 11. The $N^+$ diffusion mask may run to within 0.20 mil of the $V_{DD}$ bus center line. (This is either $y = 4.80$ mils or $y = 6.80$ mils, depending on cell family.)

Figures 11 and 12 illustrate layout rules 1-7 and 8-11, respectively. Figure 13 is a topological plot of the two-input AND cell with levels 1 through 6 plotted separately. (Level 7 is used only to open the protective oxide at the chip I/O bonding pads.)
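The $y$ limits quoted in Rules 9 through 11 follow from subtracting each clearance from the cell height. The sketch below tabulates them and adds a check for Rule 3. It assumes that $y$ is measured from the ground-bus center line, and that the second cell family has a 5.0-mil height; the latter is inferred from the $y = 4.60$ and $y = 4.80$ values and is not stated explicitly in the rules. All names are illustrative.

```python
# Clearances below the V_DD bus center line, in mils (Rules 9-11).
CLEARANCE_MILS = {
    "P+ epitaxial silicon (level 1)": 0.40,
    "polysilicon (level 3)": 0.20,
    "N+ diffusion mask": 0.20,
}


def upper_limits(cell_height_mils):
    """Return the maximum y coordinate allowed for each level.

    Assumes y is measured from the ground-bus center line (the cell
    origin), so the V_DD center line sits at y = cell_height_mils.
    """
    return {
        level: round(cell_height_mils - clearance, 2)
        for level, clearance in CLEARANCE_MILS.items()
    }


def width_is_legal(cell_width_mils, pitch_mils=1.0, tol=1e-9):
    """Rule 3: cell width must be an integer multiple of 1.0 mil."""
    multiple = cell_width_mils / pitch_mils
    return multiple >= 1 and abs(multiple - round(multiple)) < tol


print(upper_limits(7.0))  # 7.0-mil cell family
print(upper_limits(5.0))  # assumed 5.0-mil family (inferred, see lead-in)
print(width_is_legal(3.0), width_is_legal(2.5))
```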

E. CMOS/SOS STANDARD CELL ARRAY TOPOLOGY

SOS standard cell array layouts follow the basic scheme illustrated by the metal mask of Fig. 14. The array topology falls into four distinct areas: I/O bonding pads, power/ground buses, cell interconnections, and logic cells.

1. I/O Bonding Pad Area

The I/O bonding pad area runs along the periphery of each chip. Although this area is primarily intended for pad placement, it is also used for gate oxide protection devices, alignment keys, test transistors, and special off-chip buffer circuitry, which keeps area utilization efficient. The protective devices are automatically placed whenever a chip I/O bonding pad is called for. When going off chip, the designer has the choice of using either a buffered or an unbuffered output pad. In short, logic designers and system partitioners need choose only the nature of the on/off-chip transition -- a buffered output, an unbuffered output or an input -- the rest is automatic.

Gate oxide protection against electrostatic discharge is provided with a stack (series connection) of eight diodes. One such stack is connected between each I/O pad and one of the buses. Alternate, heavily doped, $n^+$ and $p^+$ diffusions form four forward-biased and four back-biased diodes. With 6-volt breakdown for each of the four zeners and 0.7 volt for each of the four forward diodes, the stack has approximately a 27-volt breakdown in either direction. Figure 15 includes a schematic, cross-section and topological representation of this device. The 10 mils needed for its implementation fits well within the interpad space and, for this reason, is included as an implicit part of all I/O pad designs.
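The quoted stack breakdown is the sum of the individual junction voltages. A minimal sketch of that arithmetic, using the per-diode values given above (the function name is illustrative):

```python
def stack_breakdown(n_zener=4, v_zener=6.0, n_forward=4, v_forward=0.7):
    """Breakdown voltage of a series stack of alternating diodes.

    For either polarity, four of the eight junctions are reverse biased
    (zener breakdown) and four are forward biased, so the voltages add.
    """
    return n_zener * v_zener + n_forward * v_forward


print(f"{stack_breakdown():.1f} V")  # 26.8 V, i.e. approximately 27 V
```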
Fig. 11. Layout rules 1-7.

NOTE:
1. THREE IO PADS ARE SHOWN IN THREE POSSIBLE PAD CONFIGURATIONS. NO SIGNIFICANCE IS ATTACHED TO THE CONFIGURATION WITH RESPECT TO PAD LOCATION.
2. ALL DIMENSIONS ARE IN MILS.
Diode leakage averages less than 1 μA per input or output pad at 10-volt bias. Since there have not been any known gate failures on protected gates, special handling is not believed necessary. Nevertheless, it is still recommended that care be exercised in handling the arrays.

The optional off-chip buffer circuitry located along the chip's periphery offers the designer the capability of "going off-chip" with drivers appropriately scaled for larger capacitances. They can drive 30-pF external loads with rise or fall times of 18 ns (nominal). Any standard cell may be used to drive the off-chip buffers; however, using on-chip buffers as drivers generally improves dynamic performance. Peripheral placement of the off-chip buffers is used because:

(1) Large drive capability is rarely needed for driving the small on-chip loads.

(2) Such placement avoids the large RC delays associated with the use of resistive tunnels between drivers and the off-chip load.
Fig. 13. Composite and levels of two-input AND (sheet 1 of 2).
Fig. 13. Composite and levels of two-input AND (sheet 2 of 2).
Fig. 14. Typical metalization level for SOS standard cell array.
2. Power and Ground Distribution

The power and ground distribution buses are clearly identifiable as the two wide lines running along the periphery of each chip (Fig. 14). By passing these buses through each I/O pad cell, the ground and +V potentials are available for both output buffer circuitry and oxide protection devices. However, this scheme results in the necessity to pass all on-chip and off-chip signals under the two bus metallizations with either an epitaxial or polysilicon tunnel. (The added connection resistance associated with this is negligible -- about 200 ohms.) The modified interdigitated bus layout places both +V and ground buses in convenient locations for subsequent logic cell row connections.

F. CMOS/SOS STANDARD CELL LIBRARY

1. General

The CMOS/SOS standard cell library is an open-ended collection of logic circuits designed to be fabricated with either the double-epitaxial pilot line process or the single-epitaxial SOS process. All circuits have gate lengths of 0.25 mil for optimized performance.

All standard cells have been defined, designed, topologically configured in accordance with the standard set of SOS design and process constraints, analyzed, and then permanently stored for future use on magnetic tape. The present cell library was
designed to meet current and anticipated LSI implementation needs of the NASA SUMC-CVT computer system. However, because it is an open-ended system the user can define and design new cells to meet unique system requirements. A list of the present CMOS/SOS standard cell family is contained in Table 5.

To enable the addition of new cells and facilitate maximum use as a design tool, the data sheets listing the properties and performance of the cell family and the necessary supporting instructions are described in a separate document -- the CMOS/SOS Standard Cell Notebook. The notebook contents and its use are briefly described in the following paragraphs.

2. Standard Cell Notebook

The CMOS/SOS Standard Cell Notebook contains the following information:

(1) Data sheet for each of the 21 cells that constitute the present family.

(2) Functional description of each of the cells in the library.

Each data sheet contains the following information:

- Cell Name
- Cell Number
- Cell Width
- Schematic Diagram
- Logic Symbol
- Truth Table
- Pertinent Cell Node Capacitance
- Performance Data (Delay and Transition Times vs Load Capacitance)

The propagation delays and transition times, as given on the data sheets, were originally generated using the RCA CMOS/SOS circuit simulation program. The device, circuit, and process parameters used in the simulation were based heavily on the parameters determined from measurements on SOS standard cell test chips.

The dynamic data format for each cell depends upon the function of the cell. Generally performance information is presented as a straight-line graph with load capacitance and performance scales plotted on the X-axis and Y-axis, respectively. The points located on each graph define the nominal performance of each cell as a function of loading. Therefore, the propagation delay curves define estimated delays that are expected to occur for each cell.
<table>
<thead>
<tr>
<th>Cell Number</th>
<th>Cell Function</th>
</tr>
</thead>
<tbody>
<tr>
<td>1120</td>
<td>Two Input NOR</td>
</tr>
<tr>
<td>1130</td>
<td>Three Input NOR</td>
</tr>
<tr>
<td>1140</td>
<td>Four Input NOR</td>
</tr>
<tr>
<td>1220</td>
<td>Two Input NAND</td>
</tr>
<tr>
<td>1230</td>
<td>Three Input NAND</td>
</tr>
<tr>
<td>1240</td>
<td>Four Input NAND</td>
</tr>
<tr>
<td>1310</td>
<td>Inverter</td>
</tr>
<tr>
<td>1340</td>
<td>$2 \times 1$ Multiplexer (Single Clock)</td>
</tr>
<tr>
<td>1370</td>
<td>Transmission Gate</td>
</tr>
<tr>
<td>1510</td>
<td>Non-Inverting Buffer</td>
</tr>
<tr>
<td>1520</td>
<td>Buffer Inverter</td>
</tr>
<tr>
<td>1620</td>
<td>Two Input AND</td>
</tr>
<tr>
<td>1630</td>
<td>Three Input AND</td>
</tr>
<tr>
<td>1640</td>
<td>Four Input AND</td>
</tr>
<tr>
<td>1720</td>
<td>Two Input OR</td>
</tr>
<tr>
<td>1730</td>
<td>Three Input OR</td>
</tr>
<tr>
<td>1740</td>
<td>Four Input OR</td>
</tr>
<tr>
<td>2310</td>
<td>Exclusive 'OR'</td>
</tr>
<tr>
<td>2820</td>
<td>D Type M/S FF</td>
</tr>
<tr>
<td>9060</td>
<td>Off Chip Inverting Buffer Pad</td>
</tr>
<tr>
<td>9070</td>
<td>Off Chip Inverting Buffer Pad</td>
</tr>
</tbody>
</table>
Deviations between a given cell's performance and that anticipated by its data sheet may be attributable to the normal variations in the SOS process. (For example, normal variations occur in mask alignments, diffusion depths, gate oxide thickness and doping levels.) In addition, there are a host of second-order effects that are independent of processing. These include a dependence on the rise (or fall) time of the input signal and the particular input (on a multiple-input gate) to which a signal is applied.

Estimated delays based on these data sheets will generally be within 10% of the average measured delays, and therefore should not be considered worst case numbers.

All dynamic propagation information is based on an assumed +10.0-V supply voltage, an ambient temperature of 25°C, and a 10-ns transition time for the driving signal. Delays are measured between the 50% points of the input and output signals. The processing parameters assumed in the performance analysis are those of the SOS double-epitaxial process.
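The 50%-point convention can be illustrated with sampled waveforms. The following is a sketch only: it assumes piecewise-linear waveforms and a 10-V supply, and the sample data are synthetic rather than measured. The function names are hypothetical.

```python
def crossing_time(times, volts, level):
    """Return the first time the sampled waveform reaches `level`.

    Uses linear interpolation between adjacent samples; returns None
    if the waveform never reaches the level.
    """
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        v0, v1 = volts[i], volts[i + 1]
        if v0 == level:
            return t0
        if (v0 - level) * (v1 - level) < 0:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    if volts and volts[-1] == level:
        return times[-1]
    return None


def propagation_delay(times, v_in, v_out, v_supply=10.0):
    """Delay between the 50% points of the input and output waveforms."""
    half = 0.5 * v_supply
    t_in = crossing_time(times, v_in, half)
    t_out = crossing_time(times, v_out, half)
    if t_in is None or t_out is None:
        raise ValueError("waveform does not cross the 50% level")
    return t_out - t_in


# Synthetic example: the input ramps 0 -> 10 V between 0 and 10 ns,
# and an inverted output ramps 10 -> 0 V between 3 and 13 ns.
t = [0.0, 3.0, 10.0, 13.0]      # ns
vin = [0.0, 3.0, 10.0, 10.0]    # V
vout = [10.0, 10.0, 3.0, 0.0]   # V
print(propagation_delay(t, vin, vout))  # 3.0
```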

The primary purpose of the standard cell data sheets and the associated supplementary discussion in the notebook is to provide the logic and system designer with sufficient information about each of the standard cells so that he can optimize his selection of the available standard circuits. This should enable the designer to avoid race conditions, optimize critical path delays, avoid excessive loading conditions, avoid improper cell usage, and estimate circuit speed. This forms the basis for design comparison and evaluation before the arrays are processed.

G. CMOS/SOS LSI CHIP MEASUREMENT AND ANALYSIS

1. General

This section covers the chip measurements and corresponding analyses of device characteristics and dynamic performance tests made on several CMOS/SOS standard cell array types. The CMOS/SOS test chip (Fig. 16) served as the principal vehicle for examining and evaluating the individual 7.0-mil SOS standard cell circuits. The test chip provided I/O pad connections to several test transistors and a large number of logic chains as well as many special purpose evaluation circuits. From these test circuits, it was possible to obtain:

(1) Accurate information about the effect of transistor geometries on device drive capabilities

(2) Empirical data concerning the absolute and relative performance characteristics of each of the standard cell types

(3) Additional insight into optimizing standard cell array design for improved circuit and system performance.
Fig. 16. CMOS/SOS standard cell test chip.
The electrical performance characteristics of five other SOS double-epitaxial array types that were designed and fabricated for the SUMC-CVT project were also examined. These arrays are designated as follows:

- ATL-026A Floating Point Multiplexer
- ATL-027 Up/Down Counter
- ATL-030 8-Bit Adder with Carry Prediction
- ATL-031 9-Bit $4 \times 2$ Multiplexer
- ATL-032 Adder-Multiplexer Control

From these measurements, the empirical data needed to confirm earlier estimates of the SOS standard cell performance in an LSI environment were extracted. The logic paths chosen for the delay measurements were those identified as either critical paths to the SUMC-CVT operation or extended paths representative of on-chip performance.

2. Test Chip Measurements

The specific objective of the test chip was to provide a direct means for measuring and evaluating the 7.0-mil CMOS/SOS standard cell circuits. Updated device models and enhanced simulation techniques are two other goals of this approach. A brief summary of the tests included on the chip and the data collected from them is presented in the following paragraphs.

a. Test Transistors and Transistor Characteristics

Three pairs of n- and p-type transistors with channel lengths of 0.25 mil, 0.30 mil, and 0.35 mil serve as the means of collecting the test transistor characteristics. Figure 17 shows typical drain characteristics for a 0.25-mil-channel-length transistor pair. Average drain currents, taken from several wafer lots, for each of the 0.25-mil and 0.30-mil channel length pairs, are listed in Table 6. The average 0.25-mil n-device current, $I_{DN}$, of 4.2 mA represents a current of 1.6 mA/mil. Similarly, the average 0.25-mil p-device current, $I_{DP}$, of 3.2 mA represents a current of 1.3 mA/mil. Both the n- and p-device currents approach, very closely, the bulk silicon values. These currents, measured as indicated in Table 6, serve as a useful figure of merit for estimating circuit performance.

<table>
<thead>
<tr>
<th>Transistor Characteristic</th>
<th>Channel Length (mils)</th>
<th>n-Devices</th>
<th>p-Devices</th>
</tr>
</thead>
<tbody>
<tr>
<td>Drain Current (Average)</td>
<td>0.25</td>
<td>4.2 mA</td>
<td>3.2 mA</td>
</tr>
<tr>
<td>($V_g = V_D = 10 \, \text{V}$)</td>
<td>0.30</td>
<td>3.7 mA</td>
<td>2.2 mA</td>
</tr>
<tr>
<td>Threshold Voltage (Average)</td>
<td>0.25</td>
<td>1.5 V</td>
<td>-1.5 V</td>
</tr>
<tr>
<td>($V_{TH} = V_g$ @ $I_D = 10 \, \mu\text{A}$)</td>
<td>0.30</td>
<td>1.5 V</td>
<td>-1.5 V</td>
</tr>
</tbody>
</table>
Fig. 17. Typical drain characteristics.

Table 6 also shows measured threshold voltages for both the 0.25-mil and the 0.30-mil channel length test transistors. The threshold voltages for the test chip can be considered typical, perhaps a little higher than "average".

b. Polysilicon Interconnects

Two 100-mil-long polysilicon strips, one covered with phosphorus (N) doped glass and the other coated with boron (P) doped glass, are provided as a means of measuring the absolute and relative resistivities of the polysilicon interconnects. Each strip is 250 squares from end to end and therefore permits an accurate resistance determination. Test chip measurements indicate average values of 52 ohms/sq and 200 ohms/sq for the P-doped and N-doped polysilicon, respectively. A consequence of this result is that whenever polysilicon is used as an interconnect medium, only P-doping will be permitted.
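The relation between sheet resistance and end-to-end strip resistance can be sketched as follows, using the strip geometry and the measured sheet resistances quoted above. The function and variable names are illustrative.

```python
def strip_resistance_ohms(sheet_resistance_ohms_per_sq, squares):
    """End-to-end resistance of a uniform strip: R = R_sheet * (L / W)."""
    return sheet_resistance_ohms_per_sq * squares


STRIP_LENGTH_MILS = 100.0
SQUARES = 250
strip_width_mils = STRIP_LENGTH_MILS / SQUARES  # implied strip width, 0.4 mil

for doping, r_sheet in {"P-doped": 52.0, "N-doped": 200.0}.items():
    r_total = strip_resistance_ohms(r_sheet, SQUARES)
    print(f"{doping}: {r_total / 1000:.1f} kilohms end to end")
```

On these figures the N-doped strip is roughly four times as resistive as the P-doped strip, which is the basis for restricting interconnect polysilicon to P-doping.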

c. Device Fanout

Three separate chains of cascaded two-input NOR cells, with fanouts of 1.0, 2.0, and 3.6, provide the means of determining, empirically, the effects of performance vs loading. However, this experiment was carried out for devices with 0.30-mil channel lengths, not the 0.25-mil channel lengths used in all of the SUMC-CVT chips. Nevertheless, useful comparative data can still be obtained from this experiment. Figure 18(a), (b) and (c) illustrate the three strings of cascaded NOR gates with their associated dummy loads. Figure 19 is a photograph of the input and output waveforms associated with the fanout = 1 chain. The 24-ns delay is representative of the eight cascaded stages when 0.30-mil-channel-length devices are used. For this case, the on-chip stage delay is 24 ns ÷ 8 stages = 3 ns/stage. With the aid of simulation, the effects of the two output buffering stages may be eliminated. When this is done, the internal two-input NOR stage delays are 2.4 ns, 5.4 ns and 7.9 ns, for fanouts of 1.0, 2.0, and 3.6, respectively. Fig. 20 is a plot of this data.
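For quick estimates at intermediate fanouts, the three stage delays quoted above can be interpolated. The sketch below uses piecewise-linear interpolation, which is an illustrative convenience chosen here and not a method described in the report; the names are hypothetical.

```python
# (fanout, internal stage delay in ns) for 0.30-mil devices, from the text.
FANOUT_DELAY_NS = [(1.0, 2.4), (2.0, 5.4), (3.6, 7.9)]


def stage_delay_ns(fanout, points=FANOUT_DELAY_NS):
    """Piecewise-linear estimate of stage delay within the measured range."""
    if not points[0][0] <= fanout <= points[-1][0]:
        raise ValueError("fanout outside the measured range")
    for (f0, d0), (f1, d1) in zip(points, points[1:]):
        if f0 <= fanout <= f1:
            return d0 + (d1 - d0) * (fanout - f0) / (f1 - f0)


print(24.0 / 8)             # raw chain delay per stage, buffers included: 3.0
print(stage_delay_ns(3.0))  # interpolated estimate for a fanout of 3
```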

d. Channel Length

Two test chains of two-input NOR cells differing only in the designed channel lengths (0.25 mil and 0.30 mil) permit a simple side-by-side comparison of performance vs gate length. Figures 18(a) and 18(b) illustrate the two test chains. The photograph in Fig. 19 shows the superimposed input and output waveforms for the two test chains. An approximate 40% difference between the 0.25-mil and 0.30-mil channel length chain delays can be observed. For the shorter channel length circuits, the 15-ns overall chain delay works out to less than 2 ns/stage. Figure 20 illustrates the projected stage delay for a family of 0.25-mil circuits as a function of fanout.
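The per-stage figures and the relative improvement can be reproduced from the two chain delays. This sketch assumes both chains contain the same eight cascaded stages, since the text states that they differ only in channel length; the names are illustrative.

```python
STAGES = 8  # assumed equal for both chains
CHAIN_DELAY_NS = {0.30: 24.0, 0.25: 15.0}  # channel length (mil) -> chain delay

per_stage = {length: delay / STAGES for length, delay in CHAIN_DELAY_NS.items()}
reduction = 1.0 - CHAIN_DELAY_NS[0.25] / CHAIN_DELAY_NS[0.30]

print(per_stage[0.30])     # 3.0 ns/stage
print(per_stage[0.25])     # 1.875 ns/stage, i.e. less than 2 ns/stage
print(f"{reduction:.1%}")  # 37.5%, the "approximate 40%" difference
```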

e. 8-Bit Counter Operation

A three-stage or 8-bit counter circuit utilizing three 1820 M/S flip-flops is illustrated in Fig. 21(c). The output of the third 1820 stage is buffered through a 1310 and a 1520 cell before coming off chip. All circuitry for this test employs 0.30-mil gate lengths. Test results, presented in Fig. 21(a) and (b), indicate that at 10 volts the 1820 cell can be toggled at approximately 75 MHz. The input clock pulse width to the first stage used in this test was 4 ns. A further reduction in the pulse width was limited by the capabilities of the available test equipment.
Fig. 18. Two-input NOR stage delay test circuits.
Fig. 19. Measured two-input NOR chain delay.
Fig. 20. Measured two-input NOR stage delay vs fanout.

Measurements were also made between the negative edge of the input clock pulse and a change in state of the output. This measurement represents a delay through three 1820 slave stages, four 1310 stages, and one 1520 buffering circuit (with a 20-pF load). This delay measurement averaged about 42 ns or roughly 5 ns/stage. Figures 21(d) and 21(e) are the scope tracings for these measurements.

Based on the performance improvements recorded for the 0.25-mil two-input NOR circuits, it is estimated that 100-MHz operation would result if the counter had been implemented with the smaller 0.25-mil gate lengths.
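
The counter figures can be cross-checked with the following sketch. The names are hypothetical, and comparing the 100-MHz estimate against the measured NOR chain ratio is an illustration added here, not a calculation taken from the original work.

```python
# Stage counts quoted in the text for the clock-to-output path.
PATH_STAGES = {"1820 slave": 3, "1310": 4, "1520 buffer": 1}


def average_stage_delay(total_delay_ns: float, stage_counts: dict) -> float:
    """Total path delay divided by the number of cascaded stages."""
    return total_delay_ns / sum(stage_counts.values())


def toggle_period_ns(freq_mhz: float) -> float:
    """Clock period in ns for a toggle rate in MHz."""
    return 1000.0 / freq_mhz


avg_ns = average_stage_delay(42.0, PATH_STAGES)   # 42 ns over 8 stages
period_ns = toggle_period_ns(75.0)                # period at the 75-MHz toggle rate

# Speedup implied by the 100-MHz estimate vs the measured NOR chain ratio.
implied_speedup = 100.0 / 75.0
nor_chain_ratio = 24.0 / 15.0
```

The implied speedup (about 1.33) is smaller than the 1.6 ratio of the measured NOR chain delays, so the 100-MHz estimate appears conservative with respect to that data.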

3. SUMC-CVT Double-Epitaxial, SOS Standard Cell Array Measurements

All SUMC-CVT arrays are designed with 0.25-mil gate lengths for optimized performance. As part of the eventual screening and sorting procedure that will be used to separate the fabricated arrays into performance categories based on measured propagation delay characteristics, all arrays are dynamically tested. In many cases the measurements are taken at the wafer probe level since many chips are designed for hybrid mounting rather than standard dual-in-line packages (DIPs) or flat-packs. The data arrived at in this manner can serve as an excellent source of material for evaluating the SOS cell family in an actual LSI environment. In addition, this information can be used to validate and further enhance our techniques for standard cell array performance prediction using the standard cell notebook.
Fig. 21. 8-Bit counter delay measurements.
Consequently, for each of the five SUMC-CVT array types examined on this program, the logic path over which the data was taken, the total measured delay (averaged over several units), and the calculated delay (based on the standard cell data sheets) were compared. And finally, an "average delay per stage" was calculated for each chip type. Generally, the logic paths investigated were those identified as "critical paths". In other cases longer logic paths were chosen to obtain measurements less influenced by off-chip loading.

a. Floating Point Multiplexer, ATL-026A (155 x 134 mils, ~ 163 gates)

This chip is a 2 x 1 shifting multiplexer. In the SUMC-CVT system, it operates on every fourth bit, either shifting ±4 bit positions or passing the bits straight through. Provisions are included for mixing two extraneous inputs. The primary input path is 8 bits wide, while the shifted output is 9 bits wide. The chip is totally combinatorial.

The logic path chosen for measurement consists of six levels and is shown in Fig. 22(c). The recorded chain delay ranged from 17 ns to 34 ns. Figures 22(a) and 22(b) are photographs of the input and output waveforms for the 17-ns measurement. Averaging this time over the four standard cells yields 4.3 ns/cell. If, however, this delay were averaged over the actual number of logic levels used to implement the chain, the average delay per logic level would be closer to 2.9 ns.

Calculating the total chain delay with the standard cell data sheets, however, gives a predicted delay of 27 ns; and indeed when a larger number of ATL-026A arrays were examined, the average delay for the path did work out to be 23 ns. This is within 20% of the value obtained from the data sheets.
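
The two averaging conventions and the comparison with the data sheet prediction can be written out as follows. This is a sketch; the class and function names are hypothetical, and the deviation is taken relative to the data sheet value, as the text states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathMeasurement:
    """A measured path delay with its two possible stage counts."""
    delay_ns: float
    cells: int
    logic_levels: int

    def per_cell(self) -> float:
        return self.delay_ns / self.cells

    def per_level(self) -> float:
        return self.delay_ns / self.logic_levels


def deviation_from_prediction(measured_ns: float, predicted_ns: float) -> float:
    """Fractional deviation, relative to the predicted value."""
    return abs(measured_ns - predicted_ns) / predicted_ns


# Fastest ATL-026A unit: 17 ns over four standard cells or six logic levels.
fastest = PathMeasurement(delay_ns=17.0, cells=4, logic_levels=6)

# Average over a larger sample (23 ns) vs the data sheet prediction (27 ns).
deviation = deviation_from_prediction(23.0, 27.0)
```

With these inputs the per-cell and per-level averages are about 4.3 ns and 2.8 ns, and the deviation is about 15%, inside the 20% bound quoted above.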

b. Up/Down Counter Chip, ATL-027 (199 x 199 mils, ~ 300 gates)

This array consists essentially of a 12-bit up/down counter divided into separate 8-bit and 4-bit sections. Each has separate controls, but only the 8-bit section has carry and borrow outputs for expansion. Each stage has a two-input multiplexer for pre-setting the count value. Counting is done in a ripple carry/borrow full adder subtracter. The counter outputs are tri-state, buffered elements.

Figure 23 illustrates the logic path to be used in chip operation. It contains six cascaded standard cells or eight levels of logic. The "clock" is used to transfer data from the £820 storage elements to the output which, as shown, is loaded with 10 pF. For these measurements, the external chip controls are set to toggle the flip-flop with every clock pulse. Delays are measured between the 50% point of the negative-going clock to the 50% point of the output. Delay results for several packaged units ranged from 42 ns to 59 ns. The average delay was 52 ns. Calculations based on the standard cell data sheets predict 46 ns for the same path (which is correlation to within 11%). On a per stage basis, the measured average is (52 ns ÷ 9 stages) = 5.8 ns.
Fig. 22. Measured delay for floating point multiplexer.
Fig. 23. Calculated delays for up/down counter.
c. 8-Bit Adder, ATL-030

This array is an 8-bit binary or decimal adder. It is fully expandable and has carry anticipate logic for fast arithmetic operations, a multiplexed B input, a data complement and logical capability, and several special condition outputs. The logic path examined is a portion of the SUMC-CVT 32-bit adder's critical path. It is composed of 11 cascaded standard cells. Depending upon the method used, this works out to be 11 to 13 levels of logic. Figure 24 illustrates this path. Measurements are made from the 50% point of signal at the input to the 50% point of output signal. Data were taken at both the packaged device (64-pin DIP) and wafer probe levels. For a 15-pF output load, packaged device delays averaged 73 ns. The average measured device delay is 73 ns ÷ 12 stages ≈ 6 ns. For an output load of 100 pF, the total delay averages 105 ns. Figure 25 shows typical photos of the input and output waveforms of the 64-pin DIP packaged unit measurements. In terms of the SUMC-CVT program, the 15-pF load measurements are more directly applicable than the 100-pF load case since 15 pF is more typical of the anticipated on-hybrid loading.

From a system point of view, the adder path is extremely critical. With this in mind, calculations based on the 7.0-mil SOS standard cell data sheets were performed. From the Calcomp checkplot of the adder array, the additional loading contributed to the output of each cell in the adder path by the wiring crossovers was precisely accounted for. Figure 24 shows the crossovers on each output in the critical path. It also includes the calculated delays on a cell basis. From these calculations, it is possible to discount the large 9060 delay associated with going off-chip. If this is done, the estimated on-chip delay is 5.4 ns/logic level. Totaling all calculated delays for the path, we arrive at a total estimated delay of 75 ns, which closely correlates to the measured average of 73 ns.

A further investigation was carried out. It centered on the n- and p-type transistor saturation currents assumed for the standard cell data sheet delay curve calculations and those of the actual devices on the fabricated arrays. (The actual device currents were measured on the array's output inverters and were within 10% of those assumed for the delay curve calculations.)

d. 9-Bit 4 x 2 Multiplexer, ATL-031 (172 x 175 mils, ~150 gates)

This array is a flexible multiplexer that may be used as either a 9-bit 4 x 2 multiplexer or an 18-bit 2 x 1 multiplexer. Its mode of operation is determined with four independent control lines. The chip is 100% combinatorial. The longest logic path on the chip is five cells long, or in terms of logic levels, six levels deep. Figure 26 shows a small portion of the array’s logic. The path over which measurements are taken has been highlighted. Dynamic measurements were made between the 50% levels of the input and output signals. Measurements made at the wafer probe level produced a range of delays for the six logic levels from 50 to 80 ns. This included the loading effect of the test equipment cables, which was measured at more than 100 pF. The average total delay was approximately 60 ns. From the standard cell data sheets, the total path delay is calculated to be 50 ns, most of which is due to the output stage. By separate measurements, the cabling alone introduces 12 ns of delay. Taking this into account, we have an average measured delay of \((60 - 12) / 6 = 8\) ns/stage. (This figure drops considerably when the delay of the output buffer is neglected. Under these conditions, we have \((60 - 12 - 30) / 5 = 3.6\) ns/stage.)
Fig. 24. Critical path of 8-bit multiplexed input adder.
Fig. 25. Typical input-output waveforms of 8-bit adder.
Fig. 26. Calculated delays for 9-bit $4 \times 2$ multiplexer.
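
The correction for cabling and output buffer delay in the ATL-031 measurement can be sketched as below. The function name is hypothetical; the 12-ns cable delay and the 30-ns output buffer delay are the values appearing in the two expressions in the text.

```python
def de_embedded_stage_delay(total_ns: float, stages: int, *external_ns: float) -> float:
    """Subtract delays attributed to elements outside the stages of
    interest, then average the remainder over those stages."""
    remaining = total_ns - sum(external_ns)
    if remaining < 0 or stages <= 0:
        raise ValueError("inconsistent delay budget")
    return remaining / stages


CABLE_NS = 12.0          # measured separately, per the text
OUTPUT_BUFFER_NS = 30.0  # value used in the text's second expression

# Six logic levels, cable delay removed.
with_buffer_ns = de_embedded_stage_delay(60.0, 6, CABLE_NS)

# Five internal levels, cable and output buffer delay removed.
internal_only_ns = de_embedded_stage_delay(60.0, 5, CABLE_NS, OUTPUT_BUFFER_NS)
```

The two results reproduce the 8 ns/stage and 3.6 ns/stage figures given for this array.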

e. Adder-Multiplexer Control, ATL-032 (154 × 139 mils, ~166 gates)

This array houses the special random control logic which combines the ROM outputs with data dependent conditions. The array's outputs serve to route the adder inputs and specify the adder operations appropriate to the instruction being executed. Figure 27 shows the logic path used to examine the array's performance. It was chosen primarily because of its length -- 10 standard cells long. In terms of the way the cells are implemented, this path may be considered to be 13 levels of cascaded logic. Measurements are made from the 50% point of the input clock signal to the 50% point of output signal. The "clock" is used to transfer data from the 2820 storage element to the output. The output is connected back to another chip input which provides for flip-flop toggle action on every "clock" pulse. Measured delay data, on packaged units, averages 67 ns for the path when the off-chip loading is 18 pF. It averages 76 ns for the path when an external load of 40 pF is used. For 18-pF load, the average measured device delay is \(67 / 13 = 5.2\) ns.

Calculations for the same path were made using the standard cell data sheets and a Calcomp checkplot of the array's topology. The latter provides a precise count of metal/polysilicon crossovers at each cell's output. The crossovers are considered because their loading effect is not negligible. Figure 27 shows the calculated delays for each individual cell in the measured path. From these calculations it is possible to discount the large delay associated with going off-chip through the 9060 element. When this is done, the average measured on-chip stage delay is \((67 - 9) / 12 = 4.8\) ns. Totaling all the calculated delays for the path, we come up with 65.5 ns for the case of 18-pF external load and 70.5 ns for the case of a 40-pF external load. For both calculations the predicted values are within 10% of the average measured values. Figure 28 shows photographs of the input and output waveforms for loads of 18 pF and 40 pF.
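
The two load conditions also permit a rough estimate of how strongly the path delay depends on off-chip capacitance. The sketch below assumes a linear delay-vs-load relationship between 18 pF and 40 pF, which is an assumption introduced here and not a result from the original measurements; the function name is hypothetical.

```python
def delay_per_pf(load_a_pf: float, delay_a_ns: float,
                 load_b_pf: float, delay_b_ns: float) -> float:
    """Slope of delay vs off-chip load between two points (linear assumption)."""
    return (delay_b_ns - delay_a_ns) / (load_b_pf - load_a_pf)


# Measured averages on packaged units: 67 ns at 18 pF, 76 ns at 40 pF.
measured_slope = delay_per_pf(18.0, 67.0, 40.0, 76.0)

# Data sheet calculations: 65.5 ns at 18 pF, 70.5 ns at 40 pF.
calculated_slope = delay_per_pf(18.0, 65.5, 40.0, 70.5)

# On-chip stage delay with the 9-ns off-chip (9060) contribution removed.
on_chip_stage_ns = (67.0 - 9.0) / 12
```

Under the linear assumption the measured sensitivity is about 0.41 ns/pF, against about 0.23 ns/pF from the data sheets.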

Table 7 is a compilation of the measured and calculated on-chip/off-chip delays for five SUMC-CVT SOS arrays. The delays are averaged over all logic levels in the path, and in some cases averaged over only those cells internal to the array. The latter calculation is done by discounting the large off-chip buffering delays. Very good correlation is noted between the averaged measured and calculated delays.
Fig. 27. SUMC-CVT CMOS/SOS adder control chip measurement path.
Fig. 28. Typical input-output waveforms of adder control: (a) $C_{OFF\text{-CHIP}} = 18 \text{ pF}$; (b) $C_{OFF\text{-CHIP}} = 40 \text{ pF}$.
### TABLE 7. SUMC-CVT CMOS/SOS LSI ARRAY PERFORMANCE MEASUREMENTS*

<table>
<thead>
<tr>
<th rowspan="2">Array Name</th>
<th colspan="2">Total Delay (ns)</th>
<th rowspan="2">Measured Average Stage Delay (ns)</th>
<th rowspan="2">Estimated On-Chip Stage Delay (ns)</th>
<th rowspan="2">Off-Chip Load (pF)</th>
</tr>
<tr>
<th>Measured</th>
<th>Calculated</th>
</tr>
</thead>
<tbody>
<tr>
<td>Floating Point Multiplexer (ATL-026A)</td>
<td>23</td>
<td>27</td>
<td>4.3</td>
<td></td>
<td>12</td>
</tr>
<tr>
<td>Up/Down Counter (ATL-027)</td>
<td>52</td>
<td>46</td>
<td>5.8</td>
<td></td>
<td>10</td>
</tr>
<tr>
<td>8-Bit Adder (ATL-030)</td>
<td>73</td>
<td>75</td>
<td>6.0</td>
<td></td>
<td>15</td>
</tr>
<tr>
<td></td>
<td>105</td>
<td>103</td>
<td></td>
<td>5.4</td>
<td>100</td>
</tr>
<tr>
<td>9-Bit 4 × 2 Multiplexer (ATL-031)</td>
<td>60</td>
<td>50</td>
<td>8.0</td>
<td></td>
<td>100</td>
</tr>
<tr>
<td>Adder-Multiplexer Controls (ATL-032)</td>
<td>67</td>
<td>66</td>
<td>5.2</td>
<td></td>
<td>18</td>
</tr>
<tr>
<td></td>
<td>76</td>
<td>71</td>
<td></td>
<td>4.8</td>
<td>40</td>
</tr>
</tbody>
</table>

* All delay measurements at 10 V.
Extensive leakage data has not been collected to date since many of the arrays examined were not 100% functional. Arrays in this category may well have artificially large "leakage currents" caused by internal floating gates, etc. Measurements to date have verified this fact on non-100%-functional units since many leakage currents varied over a couple of orders of magnitude.

II. CONCLUSIONS AND RESULTS

A low cost, quick turnaround technique for generating custom CMOS/SOS LSI arrays using the standard cell approach was developed, implemented, tested and validated. This was, in essence, the objective of this program. To achieve this result, a series of intermediate objectives and goals had to be, and were, accomplished. These accomplishments and results include the following:

(1) The Automatic Placement and Routing Computer program was modified and enhanced to ensure compatibility with, as well as to optimize the performance of, the self-aligned silicon-gate CMOS/SOS technology.

(2) A basic cell design topology and guidelines were defined based on an extensive analysis that included circuit, layout, process, array topology and required performance considerations -- particularly high circuit speed. A standard cell height of 7 mils and a minimum pad spacing of 1 mil were established. In addition to meeting the principal design consideration of speed, the cell area of CMOS/SOS was dramatically reduced compared to that for CMOS bulk standard cell design. For example, a two-input NOR requires 79.8 square mils in the metal-gate standard cell family and only 21 square mils in the CMOS/SOS standard cell family with virtually the same size devices -- a reduction of almost four to one.

(3) A family of 11 self-aligned, silicon-gate CMOS/SOS standard cell circuits was developed. For each cell type this included the circuit design, topological layout, performance validation through circuit simulation, electrical characterization and documentation in the form of user-oriented data sheets. In addition, the performance of virtually all of the cells was experimentally verified either as a result of measurements taken on the CMOS/SOS test chip (NAS12-2233) or on five LSI arrays implemented for the SUMC-CVT computer system (Contract NAS8-29072).

(4) The silicon-gate CMOS/SOS test chip was designed not only to provide experimental validation that the standard cells functioned properly but also, more critically, to determine that their dynamic performance correlated with the predicted delays based on computer simulation. This performance validation was verified. For example, the average stage delay for a two-input NOR circuit in a serial logic chain containing eight levels of logic was less than 1.6 ns for the devices with 0.25-mil channel lengths and approximately 2.4 ns for devices with a 0.3-mil channel length. In addition, the test chip provided device and characterization data which was used to update the values of the device model parameters used in the computer simulation program. By such means the accuracy of the speed predictions based on computer simulation techniques is increased.

(5) Since some of the cells developed had not been designed when the test chip was laid out and fabricated, they do not appear on the test chip. These cells were experimentally validated by the measured data taken on five CMOS/SOS custom LSI arrays designed for the SUMC-CVT computer system. These custom LSI arrays varied in complexity from a 150-gate 4 x 2 multiplexer array to a 450-gate, multiplexed, 8-bit adder array.

(6) The correlation between the average measured delay and the delay predicted by the computer simulation program was excellent -- well within design tolerances. For example, the difference between the average measured delay and computer-predicted delay for specially selected logic paths on each of the five chips can be seen in the table:

<table>
<thead>
<tr>
<th>Custom Standard Cell CMOS/SOS Array Types</th>
<th>Computer Predicted Delay (ns)</th>
<th>Average Measured Delay (ns)</th>
<th>Measured Average Stage Delay (ns)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Floating Point Multiplexer</td>
<td>27</td>
<td>23</td>
<td>4.3</td>
</tr>
<tr>
<td>Up/Down Counter</td>
<td>46</td>
<td>52</td>
<td>5.8</td>
</tr>
<tr>
<td>8-Bit Adder with Carry Prediction</td>
<td>75</td>
<td>73</td>
<td>6.0</td>
</tr>
<tr>
<td></td>
<td>103</td>
<td>105</td>
<td>7.1</td>
</tr>
<tr>
<td>9-Bit 4 x 2 Multiplexer</td>
<td>50</td>
<td>60</td>
<td>8.0</td>
</tr>
<tr>
<td>Adder-Multiplexer Control</td>
<td>66</td>
<td>67</td>
<td>5.2</td>
</tr>
</tbody>
</table>

As seen in the table, there is generally good correlation between the predicted and measured results. Differences fall within design tolerance. The major significance of the correlation between predicted and measured results is that the circuit speeds achieve the dynamic performance objectives for which they were designed -- the NASA SUMC-CVT computer system program.
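
The correlation can be quantified directly from the tabulated values, as in the sketch below. The choice to express the difference as a percentage of the measured delay is an assumption made here; the load labels on the two adder rows are taken from Table 7; the function name is hypothetical.

```python
# (array type, computer-predicted delay in ns, average measured delay in ns)
ROWS = [
    ("Floating Point Multiplexer", 27.0, 23.0),
    ("Up/Down Counter", 46.0, 52.0),
    ("8-Bit Adder, 15 pF", 75.0, 73.0),
    ("8-Bit Adder, 100 pF", 103.0, 105.0),
    ("9-Bit 4 x 2 Multiplexer", 50.0, 60.0),
    ("Adder-Multiplexer Control", 66.0, 67.0),
]


def percent_difference(predicted_ns: float, measured_ns: float) -> float:
    """Absolute difference as a percentage of the measured delay (assumed basis)."""
    return 100.0 * abs(measured_ns - predicted_ns) / measured_ns


report = [(name, round(percent_difference(p, m), 1)) for name, p, m in ROWS]
worst_name, worst_pct = max(report, key=lambda row: row[1])

for name, pct in report:
    print(f"{name:28s} {pct:5.1f} %")
```

On this basis every path agrees to better than 20%, and the adder and adder-multiplexer control paths agree to within 3%.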
APPENDIX

SILICON GATE BEAM LEAD TECHNOLOGY

Two chip types, a four-bit adder and a twelve-bit adder, were designed as test vehicles to evaluate the SG-BL technology. In addition, a twenty-bit adder hybrid was also designed using five of the four-bit adder chips. Each of the chip designs used standard cells developed under this program. The PR2D placement and routing program, which was successfully used to automatically interconnect MG CMOS standard cells for LSI, was modified to conform to the topological restraints imposed by the SG-BL standard cell technology. Both of the SG-BL test chips made extensive use of the modified PR2D program to place the standard cells and to interconnect them in the required logic pattern. Minimal manual modification was required, and this was confined primarily to the special test circuitry that was included on each chip. The output of the PR2D program was used to generate drive tapes from which the 12-level Gerber artwork was created.

One of the test vehicles used to evaluate the SG-BL process was ATL-049. The ATL-049 consists of a four-bit adder and special test circuitry.

The four-bit adder portion of the chip (shown in Fig. A-1) is a duplicate of the circuitry contained on a MG CMOS chip (ATL-004A). By duplicating this circuitry on the ATL-049, a one-to-one speed comparison between the two technologies could be made. Extensive data taken from the ATL-004A chip was available to make this comparison.

In addition to the four-bit adder, special test circuitry was included on the ATL-049 to further evaluate the SG-BL process and standard cell development. The test circuitry on the ATL-049 included: 1) an inverter with uncommitted sources, 2) two six-stage logic chains (one with .25 mil gate lengths and one with .2 mil gate lengths) consisting of five two-input NOR gates and an inverter, 3) a six-stage logic chain consisting of five EOR gates and an inverter and 4) a six-stage logic chain with intermediate logic outputs bonded out so that a pair delay measurement could be made. Figure A-2 details the test circuitry contained on the ATL-049.

The test vehicle philosophy that was adopted for this program centered on developing confidence in the standard cell designs and the silicon-gate beam leaded process by initially concentrating on the ATL-049 four-bit adder. In compliance with this philosophy, the ATL-049 logic was implemented using the modified PR2D program. Working process masks were created so that the test samples of the ATL-049 could be fabricated and evaluated.
Fig. A-1. Arithmetic and logic unit on SG-BL ATL-049 test chip.
Several batches of ATL-049 were processed and evaluated. Initial testing revealed high chip leakage, low operating voltages and excessive non-functional operation. In response to this data, modifications were made to both the circuit design rules and to the processing procedure. Incorporation of these modifications resulted in a batch of ATL-049 chips from which characteristic circuit data could be taken on the special test circuitry. The data taken from two of the special test chains and on the inverter with uncommitted sources is presented in Figs. A-3 through A-6.

Figures A-3 and A-4 present the data taken on PMOS and NMOS test transistors. Each transistor had a gate width of 2.0 mils. The drain-to-source breakdown voltage of the NMOS devices averaged 17.3 volts with no unit breaking down below 16.5 volts. Average breakdown of the PMOS device exceeded 18 volts. Both test devices had mask gate lengths of 0.25 mil.

Figure A-4 shows the characteristic transistor curves for a PMOS and NMOS test transistor. The average drain current measured for twelve NMOS devices and five PMOS devices was 3.9 mA (1.95 mA/mil gate W) and 1.95 mA (0.98 mA/mil gate W) respectively. These values reflect a biasing condition of $V_{GS} = V_{DS} = 10$ volts. These currents fall within the range of those attainable with the aluminum gate bulk CMOS technology and thus appeared substandard relative to what should be expected from the silicon gate beam leaded technology.
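
The normalized currents follow from the 2.0-mil gate width of the test transistors. A minimal sketch, with hypothetical names:

```python
GATE_WIDTH_MILS = 2.0  # gate width of each test transistor, per the text


def current_per_mil(drain_current_ma: float, width_mils: float = GATE_WIDTH_MILS) -> float:
    """Drain current normalized to gate width (mA per mil)."""
    return drain_current_ma / width_mils


nmos_ma_per_mil = current_per_mil(3.9)    # average of twelve NMOS devices
pmos_ma_per_mil = current_per_mil(1.95)   # average of five PMOS devices
n_to_p_ratio = nmos_ma_per_mil / pmos_ma_per_mil
```

The NMOS devices therefore deliver about twice the current of the PMOS devices per mil of gate width at $V_{GS} = V_{DS} = 10$ volts.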
The logic chain shown in Fig. A-5 consists of five two-input NOR gates plus an inverter. All gates have a mask gate length of 0.25 mil and each of the NOR gates drive one load. Propagation delay measurements were made on this chain for four sets of input conditions. With an input waveform having 15-ns 10 percent to 90 percent edge times, delay times were measured with supply voltages of 10, 12, and 15 V. Also, using an input waveform having 90-ns, 10 percent to 90 percent edge times, the propagation delay was measured with a supply voltage of 10 V. Delay results for these four tests are presented in Fig. A-5.

The results shown in Fig. A-5 indicate that the average stage delay, using a 10-volt supply voltage and 15-ns, 10 percent to 90 percent input pulses, was about 10-12 ns per stage. Increasing the supply voltage to 12 V increased the speed by about 15 percent. The delay measurements at a supply voltage of 15 V averaged 9-10 ns per stage. This was slower than expected and was attributed to increased leakage at a 15-volt supply voltage.

To isolate the sources of delay in this six-stage chain, a computer simulation of the entire logic path was made. Input parameters to the simulation program (such as effective gate length, doping concentrations, mobilities, etc.) were based on information taken from measurements on the test chip, as well as on additional data supplied by SSTC, Somerville. Simulated transistor currents were calibrated to be exactly the same as those measured on the actual chip. Included in the analysis was the resistance associated with the polysilicon interconnect and gates as well as the input protective resistor. The results of the simulation run indicated that of the 11-ns delay associated with each stage, 2-3 ns was attributable to the interconnection resistance-capacitance between stages. An 85% increase in transistor drain current would be required to reduce the average stage delay to 8 ns (including RC delay).
Fig. A-4. PMOS and NMOS drain characteristics.

The logic chain shown in Fig. A-6 was used to measure the pair delay of the 1120 two-input NOR gate. Each of the first four stages of this logic chain drives two loads. Pair delay was measured between the second and fourth stage through outputted inverting buffers. Since each output buffer introduced identical delays, they cancelled one another and provided an accurate measure of the pair delay for the 1120 NOR gate. Data from Fig. A-6 indicate the pair delay to be about 25 ns or 12.5 ns per stage. This value of the stage delay agreed favorably with the 11 ns predicted by computer simulation when the additional loading introduced by the output buffer was taken into account.

Two additional sets of data taken on this logic chain are presented in Fig. A-6. One set gives the delay of two 1120 NOR gates plus an inverter. Comparing this data with the pair delay data indicates that the output inverter introduces 8 to 10 ns of delay. The final set of data in Fig. A-6 presents the delay of the six-stage chain comprising five 1120 NOR gates plus an output inverter. This test chain is similar to the chain evaluated in Fig. A-5. Minor variations in the measured delays between these two chains were justifiable considering the additional interstage loading associated with the logic chain of Fig. A-6.
Fig. A-6. Pair delay and propagation delay of a five two-input NOR plus inverter.

Subsequent work on this program included: (1) defining the logic on the twelve-bit adder chip (ATL-064), placing and routing the logic, and creating the 80× artwork masks; (2) evaluating one 20-bit adder hybrid; and (3) evaluating several additional processing batches of ATL-049 adder chips.

Evaluation of the hybrid and the additional batches of ATL-049 adder chips indicated that the silicon-gate beam leaded process had not stabilized to the point where repeatable functional units could be fabricated. Several completely functional adder units were produced which operated at reduced voltages (5-7 volts); however, it was felt that the circuit performance objectives of this program could not be met using the SG-BL approach.