An Approach for Self-Timed Synchronous CMOS Circuit Design

Alvernon Walker  Parag K. Lala
alvernon@ncat.edu  lala@ncat.edu
Department of Electrical Engineering
North Carolina A&T State University
Greensboro, NC 27411

Abstract

In this letter we present a timing and control strategy that can be used to realize synchronous systems whose performance approaches that of asynchronous circuits or systems. The approach is based on a single-phase synchronous circuit/system architecture with a variable-period clock. The handshaking signals required for asynchronous self-timed circuits are not needed. Dynamic power supply current monitoring is used to generate timing information that is comparable to the completion signal found in self-timed circuits; this timing information is used to modify the circuit clock period. The letter concludes with an example of the proposed approach applied to a static CMOS ripple-carry adder.

I. Introduction

Proper operation of sequential circuits/systems requires a mechanism that controls the logical ordering of circuit switching events and ensures that the circuit's physical timing constraints are met [1,2]. Three general approaches are used to realize this mechanism. In the most prevalent approach, the event control signals are stored in circuit memory elements that are updated simultaneously by one or several globally distributed periodic signals whose period is larger than the propagation delay of the circuit's longest critical path. Circuits designed with this timing strategy are called synchronous circuits. A second approach to event ordering and timing is based on the local generation of timing information via the intrinsic propagation delays of active circuit paths [1,2]; circuits designed with this timing and control approach are called asynchronous circuits. The third approach is based on a special class of asynchronous circuits called self-timed circuits. These circuits are modular and use handshaking (start and done signaling) for intermodule communication and for module process initiation and completion.

The peak performance of a sequential circuit/system, e.g. the number of operands processed per unit time, is determined by the timing and control approach. For example, the performance of a synchronous circuit is determined by the longest critical path propagation delay, because the period of the system/circuit clock must be greater than or equal to the delay along that path [1,2]. The main problem with this approach, with respect to circuit performance, is that the clock period is set to a fixed value large enough to ensure that the worst-case physical timing constraints are met, which sacrifices performance whenever the longest path is not active. This is not true for asynchronous circuits and systems, because the locally generated timing and control signals are determined by the active paths, i.e. the circuits that realize the operator. The main disadvantage of the asynchronous approach is that complex design techniques are required to realize relatively simple operators [1,2]. The performance of self-timed circuits is comparable to that of asynchronous circuits without the associated complexity and design problems. Although self-timed circuits are not as complex as asynchronous circuits, they are far more complex than synchronous circuits because of the redundancy in the datapath that is required to generate the completion or done signal, i.e. a signal that indicates that the output of the module is valid. The complexity of self-timed modules is also a result of the handshaking circuits that are required for intermodule communication [1,2].

In this letter we present a timing and control strategy that can be used to realize synchronous systems whose performance approaches that of asynchronous circuits or systems. The approach is based on a single-phase synchronous circuit/system architecture with a variable-period clock. The handshaking signals required for self-timed circuits are not needed because the global clock in synchronous systems coordinates all circuit activity. Dynamic power supply current (DPSC) monitoring is used to generate timing information that is comparable to the completion signal found in self-timed circuits, and this information is used to modify the circuit clock period. The DPSC controls the current sources in the current-controlled oscillator (ICO) that determines the circuit's clock frequency.
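The cost of a fixed worst-case clock period, compared with a period that tracks the active path, can be illustrated with a minimal sketch. The delay values and function names below are hypothetical and chosen only for illustration; the sketch ignores latch overhead and the clock on time.

```python
def fixed_clock_total_ns(active_delays_ns):
    """Total time if every cycle uses one period sized for the slowest path."""
    worst_case_period = max(active_delays_ns)
    return worst_case_period * len(active_delays_ns)


def variable_clock_total_ns(active_delays_ns):
    """Total time if each cycle lasts only as long as its active path."""
    return sum(active_delays_ns)


# Hypothetical active-path delays (ns) for five consecutive operations.
delays_ns = [10.0, 12.0, 40.0, 10.0, 18.0]

fixed_total = fixed_clock_total_ns(delays_ns)
variable_total = variable_clock_total_ns(delays_ns)
print(f"fixed clock:    {fixed_total} ns")
print(f"variable clock: {variable_total} ns")
```

In this toy case the single slow operation sets the period for all five cycles under the fixed clock, while the variable clock pays for it only once.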

This letter is organized as follows. Section II describes the DPSC and its relationship to transitions at the outputs of gates on the critical path. An example of the proposed technique is given in Section III. Our conclusions are presented in Section IV.

II. DPSC

The current supplied by the power and ground rails in static CMOS logic circuits can be divided into two general classes: quiescent and dynamic. The steady-state current $I_{ddq}$ that flows from the power rail $V_{dd}$ (or into the ground rail) for a given input vector $<V>$ is called the quiescent current. This current is approximately zero in a fault-free circuit [3,4]. The second general class of rail current, the DPSC $I_{dp}(t)$, is produced when an input pair $<V_1,V_2>$ causes a transition at the output of a gate attached to the power/ground rail. This current consists only of the rail currents of gates on the circuit's critical path. The current supplied by the rail voltage $V_{dd}$ during a low-to-high transition at the output of a static CMOS gate is called the p-block DPSC $I_{dpd}(t)$, where the p-block is the array of interconnected p-channel transistors and the n-block is the group of connected n-channel devices in the static CMOS gate, as shown in Fig. 1. The n-block DPSC $I_{dpn}(t)$ is the ground rail current that is produced during a high-to-low transition at the gate output. The DPSC consists of the following supply (or ground) currents: the short-circuit current $I_{shor}(t)$, the displacement current $I_d(t)$ of the load $C_L$, the displacement current $I_{eg}(t)$ of the gate input capacitance $C_{gs}$, and the block device leakage current $I_{leakage}(t)$ [5].
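The decomposition listed above can be written compactly as a sum. This is only a sketch: it assumes that the four components simply add at the rail node, which the text implies but does not state as an equation.

\[ I_{dp}(t) \approx I_{shor}(t) + I_d(t) + I_{eg}(t) + I_{leakage}(t) \]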

![Figure 1 Complex Static CMOS Gate](image-url)
Figure 2 Various Input and Output Combinations: (a) comparable input and output rise and fall times, (b) fast input, slow output, (c) slow input, fast output.

Figure 3 4-Bit Adder, DPSC monitor and ICO realizations.

The relationship between the DPSC and the gate input and output transitions is shown in Fig. 2 for three input and output combinations: (a) comparable input and output rise and fall times, (b) fast input, slow output, and (c) slow input, fast output. This figure shows that the DPSC is non-zero during a transition at the output of a gate, irrespective of the gate's input/output rise and fall time characteristics. Hence, the DPSC can be used to detect transitions at the circuit output(s) without regard to the circuit parameters of the active path.
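Since the DPSC is non-zero only while gate outputs are switching, the end of switching activity can in principle be read off a sampled current waveform. The following minimal sketch assumes a uniformly sampled current trace and a noise threshold; the function name, the sample values and the threshold are all hypothetical and are not taken from the circuit described in this letter.

```python
def completion_index(current_samples, threshold):
    """Return the index of the first sample after the last above-threshold sample.

    A return value of 0 means no switching activity was observed.
    """
    last_active = -1
    for i, value in enumerate(current_samples):
        if abs(value) > threshold:
            last_active = i
    return last_active + 1


# Hypothetical DPSC trace (arbitrary units): two bursts of switching activity.
trace = [0.0, 0.8, 1.2, 0.3, 0.0, 0.9, 0.1, 0.0, 0.0]
done_at = completion_index(trace, threshold=0.05)
print("outputs settled from sample", done_at)
```

Note that a brief gap in the current, such as the zero sample between the two bursts, does not count as completion in this sketch; only the last above-threshold sample matters.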

III. Example

Because of its relationship to gate output transitions, the DPSC can be used to determine whether the monitored circuit has reached steady state. It can therefore be used to indicate that the output of the circuit is valid, i.e. to realize a self-timed circuit. For example, if a four-bit ripple-carry adder is implemented as shown in Fig. 3, and the power supply and ground rail currents of the output inverter of the carry generation circuit are summed, the net DPSC pulse width varies with the number of generated carry bits, as shown in Fig. 4. Fig. 4 shows the variations in the clock period and the DPSC for the four-bit adder inputs \( A \), \( B \) and \( \text{Carry} \). These inputs are shown at the bottom of Fig. 4 with the corresponding clock period. The off time of the clock, i.e. the time during which the logic between the interstage latches is generating its output, is increased if a carry is generated by any of the full adders (FA in Fig. 3). This is done by subtracting the weighted DPSC, shown in Fig. 4, from the charge pump current source that controls the off time of the clock. This reduction in the net current entering the capacitor \( C_1 \) increases the off time. The clock off time in Fig. 4 ranges from 10.2 ns to 43.2 ns (i.e. the clock period minus 6 ns). The change in the clock period from its steady-state value, i.e. 16.2 ns, is determined by the area under the net DPSC. This area is related to circuit timing if the DPSC of the gates along the circuit's critical path is monitored.

![Figure 4 Adder Input Latch Output, 4-Bit Ripple Carry Adder Output and associated DPSC](image-url)

The off time of the ICO in Fig. 3 is

\[ T_{\text{low-time}} = \frac{C_1 V_{\text{th}} + \int \left( I_{\text{FDD}}(t) + I_{\text{FND}}(t) \right) dt}{I_{\text{down}}} \qquad (2) \]

where,

- \( C_1 \) - capacitor charged by the net current,
- \( V_{\text{th}} \) - Schmitt trigger threshold voltage,
- \( I_{\text{FDD}}(t) + I_{\text{FND}}(t) \) - net DPSC,
- \( I_{\text{down}} \) - ICO constant current source that determines the steady-state off time.
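Equation (2) can be evaluated numerically once the net DPSC is available as a sampled waveform. The sketch below is illustrative only: the component values are hypothetical, chosen so that the off time with no DPSC equals the 10.2 ns minimum quoted above, and the triangular current pulse is invented. The integral is approximated with the trapezoidal rule.

```python
def ico_low_time(c1, v_th, i_down, net_dpsc_samples, dt):
    """Evaluate Eq. (2): (C1 * Vth + integral of net DPSC dt) / I_down.

    net_dpsc_samples holds the weighted net DPSC (amperes) sampled every
    dt seconds; the integral is approximated with the trapezoidal rule.
    """
    area = 0.0
    for a, b in zip(net_dpsc_samples, net_dpsc_samples[1:]):
        area += 0.5 * (a + b) * dt
    return (c1 * v_th + area) / i_down


# Hypothetical values, not taken from the circuit in Fig. 3.
C1 = 1e-12        # farads
V_TH = 1.02       # volts
I_DOWN = 100e-6   # amperes

t_idle = ico_low_time(C1, V_TH, I_DOWN, [0.0, 0.0, 0.0], dt=1e-9)
# Invented triangular DPSC pulse peaking at 50 uA, sampled every 1 ns.
t_busy = ico_low_time(C1, V_TH, I_DOWN, [0.0, 50e-6, 0.0], dt=1e-9)
print(f"off time without DPSC: {t_idle * 1e9:.2f} ns")
print(f"off time with DPSC:    {t_busy * 1e9:.2f} ns")
```

The off time grows linearly with the area under the net DPSC, so a longer or larger current pulse stretches the clock period by a proportional amount.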

IV. Conclusion

DPSC monitoring of static CMOS circuits can be used to implement synchronous self-timed circuits if the DPSC of the gates on the critical path is monitored. This approach works because the peak value, pulse width and event duration of the DPSC are determined by the number of transitions, and their position in time, on the longest active signal path, i.e. the active subset of the critical path. The technique can be used to determine when the output of the circuit reaches steady state because transitions occur last on the critical path. The variation in the clock period (i.e. in the off time) is determined by the area under the DPSC, which is related to these three characteristics of the DPSC. The performance, i.e. the number of operands produced per unit time by a functional unit, approaches that of an asynchronous self-timed circuit, and the hardware overhead of this approach is substantially lower than that of asynchronous realizations.

References


