# I. INTRODUCTION

MOS circuits that can be implemented using the universal NAND or NOR gates may contain a large number of serially connected NMOS or PMOS transistors if the fan-in is wide. Refer to Fig. 1 for such a wide fan-in logic circuit with n inputs realized in CMOS logic. The main problem associated with such circuits is the large propagation delay. This is due to the large RC time constant associated with charging or discharging the parasitic capacitances at the output node as well as at the internal nodes. Also, the current-driving capability of the transistors degrades due to the reduction of their effective gate voltages and their drain-to-source voltages, in addition to the threshold-voltage increase due to the body effect. To make matters worse, such gates, when realized in logic-circuit families such as domino CMOS and pseudo-NMOS, suffer from contention current, thus requiring a large area for the pull-down network (PDN) in order to have an acceptable noise margin and speed. Wide fan-in gates appear in applications such as the multi-input exclusive-OR gates required in parity-check and error-correction circuits, some built-in testing circuits, and barrel shifters [1].

For realizing wide fan-in NOR gates, it is better to use the pseudo-NMOS logic-circuit family in which the long chain of the PMOS transistors is substituted by an always-activated PMOS transistor. In this paper, the pseudo-PMOS logic-circuit family is adopted in a similar manner in realizing wide fan-in NAND gates in which the long chain of the NMOS transistors is substituted by an always-activated NMOS transistor. The performance of this family is investigated and compared with the conventional CMOS logic. The remainder of this paper is organized as follows: A survey of the previous work related to the problem at hand is presented in Section II. The pseudo-PMOS logic is presented qualitatively in Section III. The circuit design issues and the comparison of this family with the conventional static CMOS realization are presented quantitatively in Section IV. The effects of the technology scaling and the process variations on this family are discussed in Sections V and VI, respectively. The enhancement in performance is verified by simulation in Section VII. Finally, the paper is concluded in Section VIII.


# II. PREVIOUS WORK

In this section, a review of some of the previous techniques for enhancing the performance of wide fan-in gates is presented. M. M. Khellah et al. [2] proposed a technique to lower both the dynamic-switching power consumption and the time delay of wide fan-in dynamic gates. This technique depends on generating a low-swing signal at the output node by charging and discharging a small dummy capacitor. By virtue of the principle of charge sharing, a small swing is created on the gate output; finally, this swing is amplified to full rail using a suitable sense amplifier. Several techniques for reducing the power consumption of wide fan-in gates can be found in [3, 4, and 5]. Lowering the power consumption by reordering schemes is usually associated with a delay penalty, as reordering usually moves the late-arriving inputs farther away from the gate output.

A novel conditional isolation technique for reducing the evaluation time of wide fan-in domino gates was proposed by W. H. Chiu et al. [6]. This technique also reduces both the subthreshold and the gate-oxide leakage currents simultaneously. According to [6], reductions in total static power by 36%, dynamic power by 49.14%, and delay time by 60.27% compared to the conventional domino gate can be achieved. H. Mostafa et al. proposed adopting novel negative-capacitance circuits in order to reduce the delay variability under process variations [7]. According to this technique, the timing yield was improved from 50% to 100% for a 64-input wide dynamic OR gate at the expense of an excess power overhead.

In [8], K. Mohanram et al. proposed reordering the inputs, exploiting the symmetry of the circuit with respect to its inputs, in order to minimize the switching activity and hence the power consumption. An average reduction of 16% was achieved in power consumption using this scheme. This technique allows for a tradeoff between the complexity of the computation and the quality of the final output. Also, in order to reduce the glitching-power consumption, an extra dimension is added to the complexity of the problem (specifically, the pipelining) in order to obtain the inputs of the circuit at nearly the same instant. A. A. George et al. achieved a better noise immunity and a reduced leakage current without any degradation in speed for wide fan-in domino gates by comparing the worst-case leakage current of the pull-up network (PUN) with a mirrored version of this current [9].

F. Moradi et al. proposed a technique that enhances the performance of wide fan-in domino gates by employing a footer transistor that is initially off in the evaluation phase, thus reducing leakage [10]. Their scheme also reduces the contention between the keeper transistor and the PDN during the evaluation phase. K. Rajasri et al. proposed a 256-bit comparator by adopting a novel technique called the current-comparison domino circuit, thus reducing both the time delay and the leakage-power consumption [11]. Anamika et al. adopted the stacking effect to reduce the leakage-power consumption for wide fan-in circuits [12].

Finally, the reader is referred to [13 and 14] for techniques that depend on novel circuits having the same output as the conventional wide fan-in circuit but with improved performance. In the next section, the pseudo-PMOS logic is presented.


# III. THE PSEUDO-PMOS LOGIC-CIRCUIT FAMILY

The idea of the pseudo-PMOS logic is simply as follows: It is well known from DeMorgan's law that

$$\overline{A_1 A_2 \cdots A_n} = \overline{A_1} + \overline{A_2} + \cdots + \overline{A_n} \tag{1}$$

That is, logic "0" is obtained at the output of a circuit if all the inputs are at logic "1" and logic "1" is obtained at the output if any of the inputs is at logic "0." As is well known, this can be implemented by the series connection of NMOS transistors in the PDN and the parallel connection of PMOS transistors in the PUN. However, the right-hand side of Eq. (1) can be implemented simply using a parallel connection of PMOS transistors in the PUN. Now, refer to Fig. 2 for an illustration of the pseudo-PMOS logic with n inputs. If at least one of the inputs is deactivated, then the corresponding PMOS transistor will conduct. Due to the continuous conduction of the NMOS transistor, $M_N$, there is a voltage division between these two devices. The equivalent resistance of $M_N$ can be adjusted by properly choosing the biasing voltage, $V_B$, or by adjusting its strength through the aspect ratio, $(W/L)_n$, or the threshold voltage, $V_{thn}$. Now, if all the inputs are activated, all the PMOS devices will be deactivated. Thus, the parasitic capacitance at the output node is discharged with no contention from the PMOS parallel network and the output is at logic "0" as it must be. The main advantage of the pseudo-PMOS logic is that increasing the number of the inputs merely increases the parasitic capacitance at the output node, thus not affecting the performance of this family significantly. On the other hand, increasing the number of the inputs in the CMOS logic has a significantly deleterious effect on its performance; a point that was discussed in Section I and is returned to in Section IV.
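The logical equivalence in Eq. (1) can be checked exhaustively for a small number of inputs. The following is a minimal sketch of the logic function only; it does not model the voltage division or any electrical behavior, and the function names are hypothetical.

```python
from itertools import product


def nand(inputs):
    """Left-hand side of Eq. (1): complement of the AND of all inputs."""
    return int(not all(inputs))


def pseudo_pmos_output(inputs):
    """Right-hand side of Eq. (1): OR of the complemented inputs.

    Each input at logic 0 switches on its PMOS transistor, which pulls the
    output high against the always-activated NMOS device.
    """
    return int(any(a == 0 for a in inputs))


def equivalent(n):
    """True when both sides of Eq. (1) agree for all 2**n input vectors."""
    return all(nand(v) == pseudo_pmos_output(v)
               for v in product((0, 1), repeat=n))
```

Calling `equivalent(n)` for a given fan-in enumerates every input combination, so it is only practical for small n.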

Instead of applying a different biasing voltage, $V_B$, a voltage equal to the adopted power supply, $V_{DD}$, can be applied while either increasing the threshold voltage of $M_N$ or lowering its aspect ratio to obtain a larger equivalent resistance at the lower arm of the voltage divider. Lowering the aspect ratio can be implemented by connecting multiple transistors in series, as the aspect ratio of the transistor equivalent to n serially connected transistors, each with aspect ratio (W/L), is (W/nL) [15]. This is done in order to avoid the need to generate a separate voltage. An important note that is worth mentioning here is the proper choice of the threshold voltage of $M_N$, $V_{thn}$. Increasing this voltage certainly slows down the discharging process of $C_L$, but it also makes the equivalent resistance of $M_N$ larger. Thus, the output voltage resulting from the voltage division is larger, with the result that the output-high level and consequently the logic swing are larger. Also, the low-to-high transition is faster. These conflicting effects are returned to in Sections IV and VII.

In order to resolve this contradiction, the scheme of Fig. 3 can be used, in which two cascaded inverters are added. The benefits gained from adding these two inverters are a rail-to-rail voltage swing at the output and reduced rise and fall times of the output waveform. This in turn reduces the short-circuit power consumption in the driven stages. If the previously mentioned parameters are properly chosen, then the voltage at the input of the first inverter, $V_{CL}$, will be relatively high (larger than the threshold voltage of the first inverter, $V_{thinv1}$) in case only one input is deactivated. If more than one input is deactivated, then more than one PMOS device will be activated, with the result that the equivalent resistance of the upper part of the voltage divider decreases. The result is that the voltage at the input of the first inverter becomes larger than that of the previous case and the output voltage is again at logic "1." The price paid, however, is the dc current drawn through the first inverter, the short-circuit power consumption of the two inverters, and the additional propagation delays of the two added inverters. Also, the change of $V_{CL}$ with respect to the threshold voltage of the first inverter due to the process variations affects the reliability of the scheme; a point that is returned to in Section VI. Throughout this paper, the scheme of Fig. 3 is adopted unless otherwise specified. Note also that in order to inhibit the large power consumption in the standby state due to the continuous current drawn from $V_{DD}$ to ground, the signal, $V_B$, must be connected to the standby signal. Thus, the path from $V_{DD}$ to ground becomes open during the standby interval.

Two important notes are in order here. The first one is that the pseudo-PMOS logic family can be used in realizing any logic circuit with series or parallel connections in the PUNs or PDNs. In this case, the PUN is the same as that of the conventional CMOS logic. The pseudo-PMOS logic is obviously not suitable for realizing logic circuits containing serially connected PMOS transistors in their PUNs as it requires a significant area overhead. The second note is that the quantitative analysis of the next section can be applied equally well to the pseudo-NMOS logic after substituting the acronyms associated with the NMOS devices by those of the PMOS ones and vice versa.

# IV. CIRCUIT DESIGN ISSUES

In this section, the circuit design issues of the pseudo-PMOS logic are discussed quantitatively from six aspects. The first one is the proper choice of the strength of the NMOS transistor, $M_N$, and the threshold voltage of the first inverter, $V_{thinv1}$ (if used). The second, third, fourth, and fifth aspects concern the comparisons between the pseudo-PMOS logic and the conventional CMOS logic from the points of view of the area, the average propagation delay, the average power consumption, and the logic swing. Finally, a figure of merit that includes these metrics is defined and adopted in comparing the performance of the pseudo-PMOS logic and conventional CMOS logic.


# a) The Proper Choice of the Strength of the NMOS Transistor

In determining the proper range for the values of $V_{thinv1}$, $(W/L)_n$, $V_{thn}$, and $V_B$, we adopt the worst-case scenario. The worst-case scenario is the assumption of only one deactivated input because it represents the minimum strength for the PMOS parallel combination and thus the highest equivalent resistance. If the pseudo-PMOS logic operates properly under this condition, it can be ensured to operate properly for all possible input combinations.

Refer now to Fig. 4 for this scenario. $M_N$ operates in the saturation region for typical values of the adopted NMOS-transistor parameters in the worst case just described. Since $V_{CL1}$, the final steady-state voltage across $C_L$ (the parasitic capacitance at the input of the first inverter), is chosen to be larger than $V_{thinv1}$, it is expected to be larger than $V_{DD}/2$. Thus, for the typical values of the PMOS-transistor parameters, $M_P$ is expected to operate in the triode region as its $V_D$ (drain voltage) is larger than its $V_G + |V_{thp}|$, where $V_G$ and $V_{thp}$ are the gate and threshold voltages of $M_P$, respectively. If the PMOS device is assumed to operate in the deep-triode region, then after equating the currents of the NMOS and PMOS devices, in which the Shichman-Hodges square-law MOSFET model is adopted [16], we obtain

$$\frac{1}{2} k_n' \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 \left(1 + \lambda_n V_{CL1}\right) = k_p' \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right)\left(V_{DD} - V_{CL1}\right) \tag{2}$$

where $k_n'$, $(W/L)_n$, and $V_{thn}$ are the process-transconductance parameter, the aspect ratio, and the threshold voltage of NMOS devices, and $k_p'$, $(W/L)_p$, and $V_{thp}$ are their PMOS counterparts. $\lambda_n$ is the channel-length modulation parameter of NMOS devices. After simple mathematical manipulations, we readily obtain

$$V_{CL1} = \frac{k_p' \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right) V_{DD} - \frac{1}{2} k_n' \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2}{\frac{1}{2} k_n' \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 \lambda_n + k_p' \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right)} \tag{3}$$

Alternatively, each of the NMOS and PMOS devices, $M_N$ and $M_P$, can be replaced by its equivalent resistance. The equivalent resistance, $R_{MP}$, of $M_P$ in the deep-triode region is [17]

$$R_{MP} = \frac{1}{k_p' \left(\frac{W}{L}\right)_p \left(V_{SG} - |V_{thp}|\right)} = \frac{1}{k_p' \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right)} \tag{4}$$

However, the equivalent resistance of $M_N$, let it be $R_{MN}$, can be written as the ratio between the average drain-to-source voltage and the average drain current, thus

$$R_{MN} = \frac{\frac{1}{2}\left(0 + V_{CL1}\right)}{\frac{1}{2}\left[0 + \frac{1}{2} k_n' \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 \left(1 + \lambda_n V_{CL1}\right)\right]} = \frac{V_{CL1}}{\frac{1}{2} k_n' \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 \left(1 + \lambda_n V_{CL1}\right)} \tag{5}$$

The voltage, $V_{CL1}$, can be found simply from the voltage division between $R_{MN}$ and $R_{MP}$ as follows:

$$V_{CL1} = \frac{R_{MN}}{R_{MN} + R_{MP}} V_{DD} \tag{6}$$

After substituting $R_{MN}$ and $R_{MP}$ into Eq. (6), the expression for $V_{CL1}$ can be obtained. Now, requiring $V_{CL1}$ to be larger than $V_{thinv1}$ results in the following inequality (from which the strength of the NMOS device can be determined):

$$\frac{k_p' \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right) V_{DD} - \frac{1}{2} k_n' \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2}{\frac{1}{2} k_n' \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 \lambda_n + k_p' \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right)} > V_{thinv1} \tag{7}$$

$V_{thinv1}$ in turn can be evaluated from [17]

$$V_{thinv1} = \frac{\sqrt{\dfrac{k_p' (W/L)_{p1}}{k_n' (W/L)_{n1}}}\left(V_{DD} - |V_{thp1}|\right) + V_{thn1}}{\sqrt{\dfrac{k_p' (W/L)_{p1}}{k_n' (W/L)_{n1}}} + 1} \tag{8}$$

where $(W/L)_{n1}$, $(W/L)_{p1}$, $V_{thn1}$, and $V_{thp1}$ are the aspect ratios and the threshold voltages of the constituting NMOS and PMOS transistors of the first inverter, respectively.
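As an illustration of how Eqs. (3), (7), and (8) might be used together, the sketch below evaluates $V_{CL1}$ and $V_{thinv1}$ and checks the design inequality. The function names and all device parameters passed by the caller (for example the process-transconductance values) are assumptions for illustration and are not taken from the paper; only the structure of the formulas follows the text.

```python
import math


def v_cl1(kn, wl_n, kp, wl_p, v_dd, v_b, v_thn, v_thp_abs, lam_n):
    """Steady-state voltage across C_L from Eq. (3), one PMOS conducting."""
    a = 0.5 * kn * wl_n * (v_b - v_thn) ** 2   # NMOS saturation prefactor
    b = kp * wl_p * (v_dd - v_thp_abs)         # PMOS deep-triode conductance
    return (b * v_dd - a) / (a * lam_n + b)


def v_thinv(kn, wl_n1, kp, wl_p1, v_dd, v_thn1, v_thp1_abs):
    """Switching threshold of the first inverter from Eq. (8)."""
    r = math.sqrt(kp * wl_p1 / (kn * wl_n1))
    return (r * (v_dd - v_thp1_abs) + v_thn1) / (r + 1.0)


def satisfies_eq7(v_cl1_value, v_thinv_value):
    """Design condition of Eq. (7): V_CL1 must exceed V_thinv1."""
    return v_cl1_value > v_thinv_value
```

A designer could sweep `wl_n` or `v_thn` and keep only the values for which `satisfies_eq7` holds; the margin by which the condition is met is a separate design choice that this sketch does not address.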

Before leaving this subsection, an important note follows: It is obvious from the qualitative discussion of the pseudo-PMOS logic that increasing the strength of the NMOS device, M N , causes the low-to-high and the high-to-low propagation delays to increase and decrease, respectively, i.e. to change in opposite directions. Thus, it can be concluded that there is an optimum value for the parameters determining this strength such as the threshold voltage and the aspect ratio at which the average propagation delay is at its minimum. This is really the case and this point is confirmed in Subsection C.


# b) The Area Comparison

In comparing the areas of the pseudo-PMOS logic with the CMOS logic, we adopt the approximation that the area of a certain transistor is equal to its channel area [17]. Adopting the convention that the size of the PMOS transistor is twice that of the NMOS one in order to compensate for the mobility difference and that each of the n NMOS transistors in the series connection has an aspect ratio of n in order to compensate for the degradation in delay [17], then the areas of the conventional and the proposed logic-circuit families, $A_c$ and $A_p$, can be approximated by

$$A_c = \left(n^2 + 2n\right) WL \tag{9}$$

$$A_p = \left(2n + 7\right) WL \tag{10}$$

The plots of $A_c$ and $A_p$ versus n for W = L = 45 nm are shown in Fig. 5. It can be concluded from this rough estimation of the area that the area overhead of the two cascaded inverters is justified when the number of the inputs exceeds 2. Had we adopted the version of Fig. 2 for the pseudo-PMOS logic, the area of this family would have been smaller than that of the CMOS logic for all values of n.
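The crossover point stated above follows directly from Eqs. (9) and (10). A minimal sketch, with hypothetical function names and the W = L = 45 nm values quoted in the text as defaults:

```python
def area_cmos(n, w=45e-9, l=45e-9):
    """Approximate area of the n-input CMOS NAND gate, Eq. (9), in m^2."""
    return (n ** 2 + 2 * n) * w * l


def area_pseudo_pmos(n, w=45e-9, l=45e-9):
    """Approximate area of the pseudo-PMOS gate of Fig. 3, Eq. (10), in m^2."""
    return (2 * n + 7) * w * l


def smallest_n_with_area_gain(max_n=64):
    """Smallest fan-in for which the pseudo-PMOS area is below the CMOS area."""
    for n in range(1, max_n + 1):
        if area_pseudo_pmos(n) < area_cmos(n):
            return n
    return None
```

Because the 2n terms cancel, the comparison reduces to n² > 7, which is why the gain appears as soon as n exceeds 2.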

# c) The Average Propagation-Delay Comparison

The average propagation delay according to the pseudo-PMOS logic is defined as

$$t_{pavgp} = \frac{t_{PLHp} + t_{PHLp}}{2} \tag{11}$$

where $t_{PLHp}$ and $t_{PHLp}$ are the low-to-high and the high-to-low propagation delays according to the pseudo-PMOS logic, respectively. To determine $t_{PLHp}$, refer to the circuit shown in Fig. 4. This circuit also represents the worst case from the point of view of the delay, as the charging current of $C_L$ is the smallest one and thus the estimated value of $t_{PLHp}$ is the largest one. The time delay, $t_{PLHp}$, contains three subcomponents, $t_{PLHp1}$, $t_{PLHp2}$, and $t_{PLHp3}$. These are, respectively, the time delay required to precharge $C_L$ to a certain steady-state value that depends on the relative strengths of the activated PMOS device and the always-activated NMOS device, the high-to-low propagation delay of the first inverter, and the low-to-high propagation delay of the second inverter. The first subcomponent can be approximated by [17]

$$t_{PLHp1} = \frac{C_L \, \Delta V_{CL}}{i_{chavg}} \tag{12}$$

where $\Delta V_{CL}$ is the voltage change of $V_{CL}$ and $i_{chavg}$ is the average charging current of $C_L$. To determine $C_L$, we adopt the assumption that the aspect ratio of the PMOS device is twice that of the NMOS one in order to compensate for the difference in their mobilities and assume that the parasitic capacitance associated with each terminal of the minimum-sized NMOS transistor is C [18]; then $C_L$ can be approximated as

$$C_L = \left[2n + 3 + \left(\frac{W}{L}\right)_n\right] C \tag{13}$$

where $(W/L)_n$ is the aspect ratio of $M_N$. $\Delta V_{CL}$ is equal to $0.5 V_{CL1}$ (adopting the 50% criterion) and $i_{chavg}$ can be found from

$$i_{chavg} = i_{MPavg} - i_{MNavg} \tag{14}$$

where $i_{MPavg}$ and $i_{MNavg}$ are the average currents of $M_P$ and $M_N$, respectively. The last two currents can be found as follows:
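$$i_{MPavg} = \frac{i_{MP}\left(\text{at } V_{CL} = 0\right) + i_{MP}\left(\text{at } V_{CL} = V_{CL1}\right)}{2} \tag{15}$$

$$i_{MPavg} = \frac{1}{2}\left\{\frac{1}{2} k_p' \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right)^2 \left(1 + \lambda_p V_{DD}\right) + k_p' \left(\frac{W}{L}\right)_p \left[\left(V_{DD} - |V_{thp}|\right)\left(V_{DD} - V_{CL1}\right) - \frac{1}{2}\left(V_{DD} - V_{CL1}\right)^2\right]\right\} \tag{16}$$

where $\lambda_p$ is the channel-length modulation parameter of PMOS devices. $i_{MNavg}$ is given by

$$i_{MNavg} = \frac{i_{MN}\left(\text{at } V_{CL} = 0\right) + i_{MN}\left(\text{at } V_{CL} = V_{CL1}\right)}{2} \tag{17}$$

$$i_{MNavg} = \frac{1}{4} k_n' \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 \left(1 + \lambda_n V_{CL1}\right) \tag{18}$$

Fig. 5: The approximated areas of the conventional CMOS logic and the proposed scheme, $A_c$ and $A_p$, in m², versus the number of the inputs, n.

© 2017 Global Journals Inc. (US)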
Substituting these two currents into Eqs. (12) and (14) results in

$$t_{PLHp1} = \frac{0.5\left[2n + 3 + \left(\frac{W}{L}\right)_n\right] C \, V_{CL1}}{\frac{1}{2}\left\{\frac{1}{2} k_p' \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right)^2 \left(1 + \lambda_p V_{DD}\right) + k_p' \left(\frac{W}{L}\right)_p \left[\left(V_{DD} - |V_{thp}|\right)\left(V_{DD} - V_{CL1}\right) - \frac{1}{2}\left(V_{DD} - V_{CL1}\right)^2\right]\right\} - \frac{1}{4} k_n' \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 \left(1 + \lambda_n V_{CL1}\right)} \tag{19}$$
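Equation (19) is easier to follow when split into the pieces from which it was built. The sketch below evaluates Eqs. (13), (16), (18), and (19) in turn. The default values of `kn`, `kp`, `wl_n`, `wl_p`, `lam_n`, and `lam_p` are illustrative assumptions, not values reported in the paper; `v_dd`, `v_thp_abs`, `c_unit`, and `v_thn` default to figures quoted elsewhere in the text.

```python
def c_load(n, wl_n, c_unit):
    """Parasitic capacitance at the first-inverter input, Eq. (13)."""
    return (2 * n + 3 + wl_n) * c_unit


def i_mp_avg(kp, wl_p, v_dd, v_thp_abs, lam_p, v_cl1):
    """Average PMOS current, Eq. (16): saturation at V_CL = 0, triode at V_CL1."""
    sat = 0.5 * kp * wl_p * (v_dd - v_thp_abs) ** 2 * (1 + lam_p * v_dd)
    v_sd = v_dd - v_cl1
    tri = kp * wl_p * ((v_dd - v_thp_abs) * v_sd - 0.5 * v_sd ** 2)
    return 0.5 * (sat + tri)


def i_mn_avg(kn, wl_n, v_b, v_thn, lam_n, v_cl1):
    """Average NMOS current, Eq. (18)."""
    return 0.25 * kn * wl_n * (v_b - v_thn) ** 2 * (1 + lam_n * v_cl1)


def t_plhp1(n, v_cl1, *, c_unit=1e-15, kn=300e-6, wl_n=1.0, kp=100e-6,
            wl_p=2.0, v_dd=0.8, v_b=0.8, v_thn=0.58, v_thp_abs=0.32,
            lam_n=0.1, lam_p=0.1):
    """First low-to-high delay subcomponent, Eq. (19), in seconds."""
    i_ch = (i_mp_avg(kp, wl_p, v_dd, v_thp_abs, lam_p, v_cl1)
            - i_mn_avg(kn, wl_n, v_b, v_thn, lam_n, v_cl1))  # Eq. (14)
    if i_ch <= 0:
        raise ValueError("net charging current must be positive")
    return c_load(n, wl_n, c_unit) * 0.5 * v_cl1 / i_ch
```

In this model the fan-in n enters only through `c_load`, so the delay grows linearly with n, which mirrors the qualitative argument of Section III.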
The other two subcomponents are given as follows. The second subcomponent is

$$t_{PLHp2} = \frac{C_{out} V_{out}}{i_{avg}} \tag{20}$$

where $C_{out}$, $V_{out}$, and $i_{avg}$ are the parasitic capacitance at the output of the first inverter, the voltage at the output of the first inverter, and the average discharging current of $C_{out}$, respectively. Keeping in mind that the logic "1" feeding the first inverter is $V_{CL1}$, then

$$i_{avg} = \frac{1}{4} k_n' \left(\frac{W}{L}\right)_{n1} \left(V_{CL1} - V_{thn1}\right)^2 \left(1 + \lambda_n V_{DD}\right) \tag{21}$$

$C_{out}$ and $V_{out}$ are equal to 6C and $V_{DD}$, respectively. So,

$$t_{PLHp2} = \frac{6 C V_{DD}}{\frac{1}{4} k_n' \left(\frac{W}{L}\right)_{n1} \left(V_{CL1} - V_{thn1}\right)^2 \left(1 + \lambda_n V_{DD}\right)} \tag{22}$$

$t_{PLHp3}$ is given by [17]

$$t_{PLHp3} = \frac{2 C_{out}}{k_p' \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right)} \left[\frac{|V_{thp}|}{V_{DD} - |V_{thp}|} + \frac{1}{2} \ln\left(\frac{3 V_{DD} - 4 |V_{thp}|}{V_{DD}}\right)\right] \tag{23}$$
In Eq. (23), $C_{out}$ is the parasitic capacitance at the output of the second inverter and is given by $3C + C_{fan}$, where $C_{fan}$ is the parasitic capacitance due to the fan-out. Now, $t_{PHLp}$ also contains three subcomponents, $t_{PHLp1}$, $t_{PHLp2}$, and $t_{PHLp3}$, which are the time delay required to discharge $C_L$ from $V_{CL1}$ toward 0 V, the low-to-high propagation delay of the first inverter, and the high-to-low propagation delay of the second inverter, respectively. $t_{PHLp1}$ can be found from

$$t_{PHLp1} = \frac{C_L \, \Delta V_{CL}}{i_{disavg}} \tag{24}$$

where $i_{disavg}$ is the average discharging current through $M_N$ and is given by

$$i_{disavg} = \frac{i_{MN}\left(\text{at } V_{CL} = V_{CL1}\right) + i_{MN}\left(\text{at } V_{CL} = 0\right)}{2} \tag{25}$$

$$i_{disavg} = \frac{1}{4} k_n' \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 \left(1 + \lambda_n V_{CL1}\right) \tag{26}$$

So, with $\Delta V_{CL} = 0.5 V_{CL1}$, $t_{PHLp1}$ is given by

$$t_{PHLp1} = \frac{C_L V_{CL1}}{\frac{1}{2} k_n' \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 \left(1 + \lambda_n V_{CL1}\right)} \tag{27}$$

$t_{PHLp2}$ and $t_{PHLp3}$ were estimated in [17] and were found to be

$$t_{PHLp2} = \frac{2 C_{out}}{k_p' \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right)} \left[\frac{|V_{thp}|}{V_{DD} - |V_{thp}|} + \frac{1}{2} \ln\left(\frac{3 V_{DD} - 4 |V_{thp}|}{V_{DD}}\right)\right] \tag{28}$$

and

$$t_{PHLp3} = \frac{2 C_{out}}{k_n' \left(\frac{W}{L}\right)_n \left(V_{DD} - V_{thn}\right)} \left[\frac{V_{thn}}{V_{DD} - V_{thn}} + \frac{1}{2} \ln\left(\frac{3 V_{DD} - 4 V_{thn}}{V_{DD}}\right)\right] \tag{29}$$

respectively.
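Equations (23), (28), and (29) share one functional form, differing only in which device (NMOS or PMOS) drives the transition and in the load capacitance. A minimal sketch of that common form follows; the function name and the guard on the logarithm's argument are additions for illustration and do not come from the paper.

```python
import math


def inverter_delay(c_out, k_prime, wl, v_dd, v_t):
    """Inverter propagation delay in the form of Eqs. (23), (28), and (29).

    k_prime, wl, and v_t describe the transistor that drives the transition;
    pass |V_thp| for a PMOS-driven (low-to-high) edge and V_thn for an
    NMOS-driven (high-to-low) edge.
    """
    if v_t < 0 or 4 * v_t >= 3 * v_dd:
        raise ValueError("formula requires 0 <= v_t < 0.75 * v_dd")
    drive = k_prime * wl * (v_dd - v_t)
    bracket = v_t / (v_dd - v_t) + 0.5 * math.log((3 * v_dd - 4 * v_t) / v_dd)
    return 2.0 * c_out / drive * bracket
```

For example, under these assumptions Eq. (29) would correspond to `inverter_delay(3 * C + C_fan, kn, wl_n2, v_dd, v_thn)` with caller-supplied values for the second inverter.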
As discussed in Subsection A and illustrated in Figs. 6 and 7, there is an optimum value for the aspect ratio and the threshold voltage of M N at which t pavgp is at its minimum. Figs. 6 and 7 show the plots of the average propagation delay of the pseudo-PMOS logic versus the aspect ratio and the threshold voltage of M N , respectively, according to the scheme of Fig. 3 with V thn = 0.25 V, V thp = -0.32 V, V DD = 0.8 V, and C = 1 fF. As shown in these two figures, the optimum average propagation delay occurs at (W/L) n and V thn equal to 5.2 and 0.58 V, respectively.  Now, refer to Fig. 8 for the plots of the average propagation delay versus the number of the inputs according to the analysis and the simulation results adopting the scheme of Fig. 2 and estimating the time delays up to the 50% point.


# Fig. 8:

The average propagation delay versus the number of the inputs according to the analysis and the simulation results.

Now, the average propagation delay according to the CMOS logic, $t_{pavgc}$, can also be defined in a similar way as the average of the low-to-high and the high-to-low propagation delays, $t_{PLHc}$ and $t_{PHLc}$, respectively, as follows:

$$t_{pavgc} = \frac{t_{PLHc} + t_{PHLc}}{2} \tag{30}$$
Toward a simplified evaluation of these two time delays, each NMOS and PMOS transistor in the CMOS-logic circuit is represented by an equivalent resistance, $R_N$ and $R_P$, respectively. In [19], approximate expressions for the equivalent resistances of the NMOS and PMOS transistors are

$$R_N = \frac{\rho_n}{(W/L)_n} \tag{31}$$

and

$$R_P = \frac{\rho_p}{(W/L)_p} \tag{32}$$

respectively, where $\rho_n$ and $\rho_p$ are process-dependent parameters for the NMOS and PMOS devices, respectively. Simulation results reveal that the best estimates for $\rho_n$ and $\rho_p$ are 2.5 kΩ and 18 kΩ, respectively, for the 45 nm CMOS technology. We adopt the worst case in estimating the low-to-high propagation delay, in which only one input is assumed to be at logic "0" and corresponds to the lowermost NMOS transistor, so that all the internal capacitances will be charged. Applying Elmore's delay formula [20 and 21] to the conventional CMOS circuit shown in Fig. 1 results in the following estimations for $t_{PLHc}$ and $t_{PHLc}$:

$$t_{PLHc} = \ln 2 \left[R_P C_1 + \left(R_P + R_N\right) C_2 + \left(R_P + 2 R_N\right) C_3 + \cdots + \left(R_P + (n-1) R_N\right) C_n\right] \tag{33}$$

and

$$t_{PHLc} = \ln 2 \left[n R_N C_1 + (n-1) R_N C_2 + (n-2) R_N C_3 + \cdots + R_N C_n\right] \tag{34}$$

According to the estimation of the parasitic capacitances adopted in this paper, we have: $C_1 = C_{fan} + 3nC$ and $C_2 = C_3 = \cdots = C_n = 2nC$.

# d) The Average Power-Consumption Comparison

The average power consumption is the average of the power consumptions in cases of low-to-high and high-to-low transitions. The power consumption of the pseudo-PMOS logic contains the static and the dynamic power components. The static-power consumption is that associated with the first inverter due to the activation of its two devices in case of low-to-high transition and that due to the current drawn through the activated PMOS devices and the always-activated NMOS device, $M_N$, also in case of low-to-high transition. The input voltage of the first inverter certainly depends on the number of the activated inputs. So, in order to simplify the dc-power estimation, we assume that the first inverter's input is at $V_{DD}/2$ and that this inverter is matched so that its output will also be at $V_{DD}/2$. The dc-power consumption according to this evaluation is certainly overestimated, as the dc current of the first inverter is at its maximum when the inverter's input is at $V_{DD}/2$ [17]. Thus, the range of the number of the inputs over which the pseudo-PMOS logic is better than the CMOS logic is expected to be larger than that estimated. In case of low-to-high transition, the dc current of the first inverter is (where LH indicates low-to-high transition)
$$I_{DCLH} = \frac{1}{2} k_n' \left(\frac{W}{L}\right)_n \left(\frac{V_{DD}}{2} - V_{thn}\right)^2 \left(1 + \lambda_n \frac{V_{DD}}{2}\right) \tag{37}$$

The dc-power consumption of the first inverter in the low-to-high transition is thus

$$P_{DCLH1} = \frac{V_{DD}}{2} k_n' \left(\frac{W}{L}\right)_n \left(\frac{V_{DD}}{2} - V_{thn}\right)^2 \left(1 + \lambda_n \frac{V_{DD}}{2}\right) \tag{38}$$

Similarly, the dc-power consumption through $M_N$ can be written as

$$P_{DCLH2} = \frac{V_{DD}}{2} k_n' \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 \left(1 + \lambda_n V_{CL1}\right) \tag{39}$$

In the other case (high-to-low transition), all the inputs are activated with the result that all the PMOS devices in the parallel connection become equivalent to an open circuit. So, there is no dc power consumption in this transition (HL indicates high-to-low transition),

$$i_{DCHL} = 0 \tag{40}$$

$$P_{DCHL} = 0 \tag{41}$$

With $P_{DCLH} = P_{DCLH1} + P_{DCLH2}$, the average dc power consumption is thus

$$P_{DC} = \frac{P_{DCLH} + P_{DCHL}}{2} = \frac{V_{DD}}{4} k_n' \left(\frac{W}{L}\right)_n \left[\left(\frac{V_{DD}}{2} - V_{thn}\right)^2 \left(1 + \lambda_n \frac{V_{DD}}{2}\right) + \left(V_B - V_{thn}\right)^2 \left(1 + \lambda_n V_{CL1}\right)\right] \tag{42}$$
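The way Eq. (42) combines Eqs. (38), (39), and (41) can be written out explicitly. The sketch below follows the equations as given, including their use of a single $k_n'(W/L)_n$ and $V_{thn}$ for both the first inverter and $M_N$; the function names are hypothetical and any numerical values supplied by a caller are assumptions.

```python
def p_dclh1(kn, wl_n, v_dd, v_thn, lam_n):
    """DC power of the first inverter with its input at V_DD/2, Eq. (38)."""
    half = v_dd / 2.0
    return half * kn * wl_n * (half - v_thn) ** 2 * (1 + lam_n * half)


def p_dclh2(kn, wl_n, v_dd, v_b, v_thn, lam_n, v_cl1):
    """DC power drawn through the always-activated NMOS device, Eq. (39)."""
    return (v_dd / 2.0) * kn * wl_n * (v_b - v_thn) ** 2 * (1 + lam_n * v_cl1)


def p_dc_avg(kn, wl_n, v_dd, v_b, v_thn, lam_n, v_cl1):
    """Average dc power, Eq. (42); the high-to-low case contributes zero."""
    p_lh = (p_dclh1(kn, wl_n, v_dd, v_thn, lam_n)
            + p_dclh2(kn, wl_n, v_dd, v_b, v_thn, lam_n, v_cl1))
    p_hl = 0.0  # Eq. (41)
    return 0.5 * (p_lh + p_hl)
```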
The average dynamic-switching power consumption is the average of that associated with the low-to-high and high-to-low transitions. In the worst-case low-to-high transition, all except one of the inputs are activated; thus the dynamic-switching power consumption associated with the charging of the parasitic capacitances at the input of the first inverter, the output of the second inverter, and the gate terminals can be written as

$$P_{dLHp} = \alpha f V_{DD}^2 C_{out} + \alpha f V_{DD} V_{CL1} C_L + \alpha f V_{DD}^2 \left[2(n-1)\right] C \tag{43}$$

where $\alpha$ is the switching activity and f is the frequency of operation. The corresponding value in case of high-to-low transition is

$$P_{dHLp} = \alpha f V_{DD}^2 C_{out} + \alpha f V_{DD}^2 (2n) C \tag{44}$$

Thus, the average dynamic-switching power consumption associated with the parasitic capacitances at the previously mentioned nodes is

$$P_{davgp} = \frac{P_{dLHp} + P_{dHLp}}{2} \tag{45}$$

Note that all these capacitances have the same switching activities and the same frequencies of operation. The second type of the dynamic-power consumption is the short-circuit power consumption associated with the two inverters, $P_{sc1avg}$ and $P_{sc2avg}$. Assuming that the two inverters are matched, then the average short-circuit power consumption is [21]

$$P_{sc1avg} = \frac{1}{12} k_n' \left(\frac{W}{L}\right)_n \tau_1 f \left(V_{DD} - 2 V_{thn}\right)^3 \tag{46}$$

where $\tau_1$ is the rise or fall time of the voltage that feeds the first inverter and is equal to twice the average of the low-to-high and the high-to-low propagation delays at the input of the first inverter. $P_{sc2avg}$ can be written as

$$P_{sc2avg} = \frac{1}{12} k_n' \left(\frac{W}{L}\right)_n \tau_2 f \left(V_{DD} - 2 V_{thn}\right)^3 \tag{47}$$

where $\tau_2$ is the rise or fall time of the voltage that feeds the second inverter and is equal to twice the average of the low-to-high and the high-to-low propagation delays at the input of the second inverter. Certainly, the short-circuit power consumption is zero in case $V_{DD}$ is smaller than $V_{thn} + |V_{thp}|$, as there is no time interval during which both the NMOS and PMOS devices conduct simultaneously [21].
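The short-circuit expressions of Eqs. (46) and (47), together with the zero-power condition stated above, can be sketched as a single function. The sketch assumes a matched inverter with $V_{thn} = |V_{thp}|$, so the condition $V_{DD} < V_{thn} + |V_{thp}|$ becomes $V_{DD} < 2V_{thn}$; the function name and any numerical inputs are illustrative assumptions.

```python
def p_short_circuit(k_prime, wl, tau, f, v_dd, v_t):
    """Average short-circuit power of a matched inverter, Eqs. (46) and (47).

    tau is the rise or fall time of the input waveform, f the operating
    frequency, and v_t the common threshold magnitude (V_thn = |V_thp|).
    Returns 0 when V_DD <= 2 * v_t, because the NMOS and PMOS devices then
    never conduct simultaneously.
    """
    if v_dd <= 2.0 * v_t:
        return 0.0
    return k_prime * wl * tau * f * (v_dd - 2.0 * v_t) ** 3 / 12.0
```

The cubic dependence on $V_{DD} - 2V_{thn}$ is what makes this component shrink quickly as the supply approaches twice the threshold voltage.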

The average dynamic-switching power consumption of the CMOS logic is also taken as the average of that in cases of low-to-high and high-to-low transitions. Adopting the worst case in the low-to-high transition (illustrated in Subsection C), the power required to charge the capacitances at the gate terminals as well as at the internal nodes is

$$P_{dLHc} = \alpha f V_{DD}^2 \left[2(n-1) C + (n-1) n C\right] + \alpha f V_{DD}^2 \left[C_1 + C_2 + \cdots + C_n\right] = \alpha f V_{DD}^2 \left[C_{fan} + 3nC + C (n-1)(3n+2)\right] \tag{48}$$

During the high-to-low transition, all the inputs are activated and thus the associated power consumption is

$$P_{dHLc} = \alpha f V_{DD}^2 \, n \left[2C + nC\right]$$
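The CMOS-side estimates of Subsections C and D can be evaluated numerically as a rough cross-check. The sketch below applies the Elmore sums of Eqs. (33) and (34) with the node capacitances given in the text and evaluates the closed form of Eq. (48). It assumes the sizing convention of Subsection B (NMOS aspect ratio n, PMOS aspect ratio 2) when converting the process-dependent resistance parameters of Eqs. (31) and (32) into resistances. The names `rho_n` and `rho_p` are hypothetical labels for those parameters, and the defaults reuse the 2.5 kΩ, 18 kΩ, 1 fF, and 10 fF figures quoted in the text.

```python
import math


def node_caps(n, c_unit, c_fan):
    """Node capacitances C_1..C_n: output node first, then internal nodes."""
    return [c_fan + 3 * n * c_unit] + [2 * n * c_unit] * (n - 1)


def elmore_delays(n, c_unit=1e-15, c_fan=10e-15, rho_n=2.5e3, rho_p=18e3):
    """Return (t_PLHc, t_PHLc) from Eqs. (33) and (34), in seconds."""
    r_n = rho_n / n    # Eq. (31) with an assumed NMOS aspect ratio of n
    r_p = rho_p / 2.0  # Eq. (32) with an assumed PMOS aspect ratio of 2
    caps = node_caps(n, c_unit, c_fan)
    t_plh = math.log(2) * sum((r_p + k * r_n) * c for k, c in enumerate(caps))
    t_phl = math.log(2) * sum((n - k) * r_n * c for k, c in enumerate(caps))
    return t_plh, t_phl


def p_dlhc(n, alpha, f, v_dd, c_unit=1e-15, c_fan=10e-15):
    """Worst-case low-to-high dynamic power of the CMOS gate, Eq. (48)."""
    c_total = c_fan + 3 * n * c_unit + c_unit * (n - 1) * (3 * n + 2)
    return alpha * f * v_dd ** 2 * c_total
```

Averaging the two returned delays gives $t_{pavgc}$ of Eq. (30); the quadratic growth of the switched capacitance with n in `p_dlhc` reflects the n-fold upsizing of the stacked NMOS devices.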

# e) The Logic Swing

The logic swing, LS, at the output node is simply equal to the difference between the output high and low levels. Since C L is discharged to 0 V with no contention from the PMOS device, LS is equal to V CL1 . Refer to Fig. 9 for the plots of LS versus V thn according to the analysis and the simulation results (of Fig. 2). Certainly, the logic swing of the CMOS logic is V DD .


# Fig. 9:

The change of the logic swing with the threshold voltage of M N , V thn , according to the analysis and the simulation results.


# f) The Figure of Merit

In order to evaluate the performance of the pseudo-PMOS logic compared to the CMOS logic, we define a figure of merit, FOM, that includes the four previously estimated metrics: the area, the average propagation delay, the average power consumption, and the logic swing. Since these four metrics are preferred to be at their minimum except the logic swing, which is preferred to be at its maximum, the FOM is defined accordingly for the conventional and the proposed logic-circuit families, respectively. Thus, the larger the FOM, the better the performance will be. Refer to Figs. 10, 11, and 12 for the plots of the figures of merit of the CMOS logic and the pseudo-PMOS logic according to the version of Fig. 2 versus n, f, and $V_{DD}$, respectively. The parameters adopted with these plots are n = 8, $C_{fan}$ = 10 fF, and f = 10 MHz. As shown, the performance of the pseudo-PMOS logic is better than that of the CMOS logic when n exceeds 8. This can be attributed to the degradation of the performance of the CMOS logic when n exceeds this value due to the limitations discussed in Section I. The situation is less severe with the pseudo-PMOS logic, as mentioned in Section III, because its degradation is merely due to the increased number of the PMOS transistors and the associated parasitic effects at the output node. The same can be said about Fig. 11, in which the performance of the conventional CMOS logic degrades faster than that of the pseudo-PMOS logic with increasing f. This is certainly due to the need to deal with numerous parasitic capacitances in the conventional CMOS-logic realization. According to Fig. 11, the pseudo-PMOS logic is better than the conventional CMOS logic when f exceeds 15 MHz. According to Fig. 12, it is apparent that the pseudo-PMOS logic exhibits an optimum behavior versus $V_{DD}$ due to the conflicting effects associated with changing $V_{DD}$, with the optimum performance occurring at $V_{DD}$ = 0.7685 V. Specifically, increasing $V_{DD}$ improves the logic swing and the propagation delay, but at the expense of a larger power consumption. In a nutshell, the pseudo-PMOS logic has a smaller area and average propagation delay but a larger power consumption and a slightly smaller noise margin compared with the CMOS logic.

# Fig. 12:

The plot of $FOM_p$ versus the power-supply voltage, $V_{DD}$.

# V. EFFECT OF TECHNOLOGY SCALING

In this section, the effect of technology scaling on the pseudo-PMOS logic is investigated. The following effects are investigated: velocity saturation, mobility degradation, reduction of the V DD /V thn ratio, increased process variations, and channel-length modulation.


# a) Velocity Saturation and Mobility Degradation

These two effects act to reduce the current of the MOS transistor for the same applied voltages. Thus, the time required to develop a certain voltage at the first-inverter input or at the output of the circuit increases. However, this effect is common to both the CMOS logic and the pseudo-PMOS logic. It must be noted that the mobility-degradation effect is more pronounced in NMOS transistors than in PMOS ones. Thus, the sizing of the PMOS transistors is expected to decrease with technology scaling. Since most of the area of the pseudo-PMOS logic is due to the PMOS network, the area advantage becomes more pronounced.


# b) Reduction of the V DD /V thn Ratio

Because reducing V DD degrades the performance, V thn is also reduced with technology scaling, but at a smaller rate. Thus, the ratio V DD /V thn is expected to decrease with technology scaling. This has the effect of reducing the short-circuit power consumption, which enhances the performance of the pseudo-PMOS logic.


# c) Increased Process Variations

The effect of the process variations increases with technology scaling. This has the effect of narrowing the range within which the threshold voltage of the first inverter lies for the proper operation of the pseudo-PMOS logic. This seems to be the most important degradation associated with the pseudo-PMOS logic as it reduces its reliability if the version of Fig. 3 were used. This effect, however, has no counterpart in the CMOS logic.


# d) The Channel-Length Modulation Effect

The Early voltage, which models the dependence of the drain current on the drain-to-source voltage, is proportional to the channel length [17]. So, it decreases with technology scaling, with the result that the drain current increases. This effect is more pronounced in the CMOS logic due to the division of the voltage across the stacked devices, which has no counterpart in the pseudo-PMOS logic. Also, the slope of the voltage-transfer characteristics of the inverters in the transition region decreases. So, for a certain difference that is developed at the inverter input, there is a smaller value for the logic swing at the inverter output. This indicates that a smaller propagation delay is associated with the two cascaded inverters.
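For reference, a common first-order way to write this dependence is the square-law saturation current with an Early-voltage term. The symbols $V_A$, $V'_A$, and $r_o$ follow common textbook usage and are assumptions here rather than notation taken from this paper:

```latex
I_D = \frac{1}{2} k'_n \left(\frac{W}{L}\right)_n \left(V_{GS} - V_{thn}\right)^2 \left(1 + \frac{V_{DS}}{V_A}\right),
\qquad V_A = V'_A \, L,
\qquad r_o \approx \frac{V_A}{I_D}
```

Under this model, a shorter channel length lowers $V_A$, which raises the drain current for a given $V_{DS}$ and lowers the output resistance $r_o$.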


# VI. EFFECT OF PROCESS VARIATIONS

In this section, the effect of the process variations on the reliability of the pseudo-PMOS logic is investigated quantitatively. Specifically, the variations of the aspect ratio and the threshold voltage of the NMOS and PMOS devices composing the voltage divider are taken into account with their effects on the first inverter's input voltage quantified. The equation describing the voltage at the input of the first inverter was derived in Section IV and is repeated here for convenience as follows:

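$$V_{CL1} = \frac{k'_p \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right) V_{DD} - \frac{1}{2} k'_n \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2}{\frac{1}{2} \lambda_n k'_n \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 + k'_p \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right)} \tag{52}$$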
Let the variation in the aspect ratio of the NMOS device be $\Delta(W/L)_n$. Then, after substituting $(W/L)_n + \Delta(W/L)_n$ for $(W/L)_n$ in Eq. (52), neglecting the terms containing $(\Delta(W/L)_n)^2$, using the approximation

$$\frac{1}{1 + x} \approx 1 - x \quad \text{for } x \ll 1 \tag{53}$$

and performing some algebraic manipulations, we get the percentage variation of the voltage, $V_{CL1}$, due to the change of $(W/L)_n$ (let it be $\Delta V_{CL11}/V_{CL1}$) as shown in Eq. (54).

The percentage variations of $V_{CL1}$ due to each of $\Delta V_{thn}$, $\Delta(W/L)_p$, and $\Delta V_{thp}$ (let them be $\Delta V_{CL12}/V_{CL1}$, $\Delta V_{CL13}/V_{CL1}$, and $\Delta V_{CL14}/V_{CL1}$, respectively) can be evaluated in a similar manner and are given by Eqs. (55), (56), and (57).

Refer to Figs. 13, 14, 15, and 16 for the plots of the percentage variations of $V_{CL1}$ due to those in $(W/L)_n$, $V_{thn}$, $(W/L)_p$, and $V_{thp}$, respectively. It is obvious that the variations in the threshold voltages of $M_P$ and $M_N$ have the largest effect on $V_{CL1}$. If these variations cannot be tolerated, the two inverters must be dispensed with and the scheme of Fig. 2 adopted instead.
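$$\frac{\Delta V_{CL11}}{V_{CL1}} = -\Delta\left(\frac{W}{L}\right)_n \left[ \frac{\frac{1}{2} k'_n \left(V_B - V_{thn}\right)^2}{k'_p \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right) V_{DD} - \frac{1}{2} k'_n \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2} + \frac{\frac{1}{2} \lambda_n k'_n \left(V_B - V_{thn}\right)^2}{\frac{1}{2} \lambda_n k'_n \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 + k'_p \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right)} \right] \tag{54}$$

$$\frac{\Delta V_{CL12}}{V_{CL1}} = \Delta V_{thn} \left[ \frac{k'_n \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)}{k'_p \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right) V_{DD} - \frac{1}{2} k'_n \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2} + \frac{\lambda_n k'_n \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)}{\frac{1}{2} \lambda_n k'_n \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 + k'_p \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right)} \right] \tag{55}$$

$$\frac{\Delta V_{CL13}}{V_{CL1}} = \Delta\left(\frac{W}{L}\right)_p \left[ \frac{k'_p \left(V_{DD} - |V_{thp}|\right) V_{DD}}{k'_p \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right) V_{DD} - \frac{1}{2} k'_n \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2} - \frac{k'_p \left(V_{DD} - |V_{thp}|\right)}{\frac{1}{2} \lambda_n k'_n \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 + k'_p \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right)} \right] \tag{56}$$

$$\frac{\Delta V_{CL14}}{V_{CL1}} = \Delta |V_{thp}| \left[ \frac{k'_p \left(\frac{W}{L}\right)_p}{\frac{1}{2} \lambda_n k'_n \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2 + k'_p \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right)} - \frac{k'_p \left(\frac{W}{L}\right)_p V_{DD}}{k'_p \left(\frac{W}{L}\right)_p \left(V_{DD} - |V_{thp}|\right) V_{DD} - \frac{1}{2} k'_n \left(\frac{W}{L}\right)_n \left(V_B - V_{thn}\right)^2} \right] \tag{57}$$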
Fig. 15: The percentage variation of the voltage, V CL1 , versus that of the aspect ratio of M P . 
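The first-order sensitivities can be cross-checked numerically. The sketch below assumes a divider model for V CL1 in which the conducting PMOS device is in the deep-triode region and M N is saturated with channel-length modulation, so that V CL1 = (a·V DD − g)/(λ n ·g + a), with a the PMOS term and g the NMOS saturation current. This model form, the function names, and all parameter values except V DD = V B = 0.8 V and V thn = 0.7 V are assumptions made for illustration.

```python
def v_cl1(p):
    """Divider output under the assumed model (PMOS in deep triode, NMOS saturated)."""
    a = p["kp"] * p["wl_p"] * (p["vdd"] - p["vthp_abs"])        # PMOS triode term
    g = 0.5 * p["kn"] * p["wl_n"] * (p["vb"] - p["vthn"]) ** 2  # NMOS saturation current
    return (a * p["vdd"] - g) / (p["lam_n"] * g + a)


def relative_change(p, name, rel_step):
    """Fractional change of V_CL1 when parameter `name` grows by the fraction rel_step."""
    q = dict(p)
    q[name] = p[name] * (1.0 + rel_step)
    return v_cl1(q) / v_cl1(p) - 1.0


def first_order_wl_n(p, rel_step):
    """First-order estimate of the same change for (W/L)_n, using 1/(1+x) ~ 1 - x."""
    h = 0.5 * p["kn"] * (p["vb"] - p["vthn"]) ** 2
    a = p["kp"] * p["wl_p"] * (p["vdd"] - p["vthp_abs"])
    num = a * p["vdd"] - h * p["wl_n"]
    den = p["lam_n"] * h * p["wl_n"] + a
    return -(rel_step * p["wl_n"]) * (h / num + p["lam_n"] * h / den)


# Hypothetical device parameters (k' in A/V^2, lambda in 1/V).
params = {"kp": 100e-6, "wl_p": 4.0, "kn": 300e-6, "wl_n": 1.0,
          "vdd": 0.8, "vthp_abs": 0.3, "vb": 0.8, "vthn": 0.7, "lam_n": 0.1}

for name in ("wl_n", "vthn", "wl_p", "vthp_abs"):
    print(name, relative_change(params, name, 0.05))
```

For small steps, `relative_change` and `first_order_wl_n` should agree closely; the gap between them grows with the step size because the second-order terms are neglected in the first-order estimate.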


# VII. SIMULATION RESULTS

In this section, the pseudo-PMOS logic (Fig. 2) is verified by simulation adopting the 45 nm CMOS technology with V DD = 0.8 V [22]. Minimum-sized devices and an operating frequency of 500 MHz are assumed. As a compromise between the enhancement of the logic swing and the degradation in the high-to-low propagation delay with increasing V thn , M N is chosen with a threshold voltage equal to 0.7 V and is biased by V B = V DD . In evaluating the low-to-high and the high-to-low propagation delays, the 50% criterion is adopted. The worst-case scenarios are also adopted.
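The 50% criterion can be applied to sampled waveforms as in the following minimal sketch. It assumes that the 50% level is taken as V DD /2 and that each waveform contains a single transition; the function names and the sample waveform are hypothetical and do not come from the simulations reported here.

```python
def crossing_time(times, values, level, rising):
    """First time at which `values` crosses `level` in the given direction (linear interpolation)."""
    for i in range(1, len(times)):
        v0, v1 = values[i - 1], values[i]
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crossed:
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("waveform never crosses the requested level")


def propagation_delay(times, v_in, v_out, vdd, out_rising):
    """50%-to-50% delay of an inverting gate: the output rises when the input falls."""
    half = 0.5 * vdd
    t_in = crossing_time(times, v_in, half, rising=not out_rising)
    t_out = crossing_time(times, v_out, half, rising=out_rising)
    return t_out - t_in


# Made-up waveform samples: time in picoseconds, voltages in volts.
t = [0.0, 10.0, 20.0, 30.0, 40.0]
v_in = [0.8, 0.8, 0.0, 0.0, 0.0]    # one NAND input falls
v_out = [0.0, 0.0, 0.2, 0.6, 0.8]   # NAND output rises
t_plh = propagation_delay(t, v_in, v_out, vdd=0.8, out_rising=True)
print("low-to-high delay (ps):", t_plh)
```

The high-to-low delay follows from the same function with `out_rising=False`, and the average propagation delay is the mean of the two.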

Refer to Figs. 17, 18, and 19 for the low-to-high, the high-to-low, and the average propagation delays, respectively, versus the number of the inputs, n, for the conventional and pseudo-PMOS logic families. The low-to-high propagation delay of the pseudo-PMOS logic is found to be smaller than that of the conventional one for all values of n. The superiority of the pseudo-PMOS logic during the low-to-high transition is attributed to the need to charge all the internal capacitances of the pull-down network in the conventional CMOS stack. Although the contention current of M N slows down the charging of C L in the pseudo-PMOS logic, it does not affect the performance considerably.

On the other hand, the high-to-low transition of the pseudo-PMOS logic is faster than that of the conventional CMOS logic when n exceeds 4. The average propagation delay of the pseudo-PMOS logic is smaller than that of the conventional CMOS logic when n exceeds 3. Finally, note that the degradation in the logic swing compared to the conventional CMOS logic is approximately 63 mV, i.e., only 7.8%, in the worst case.

Figs. 19 and 20 (axis labels): the average propagation delays in seconds and the average power consumption in watts, according to the proposed and conventional schemes, versus the number of the inputs, n. Legend: conventional CMOS, proposed scheme.

Fig. 21: The plots of the average power-delay products according to the conventional CMOS logic and the pseudo-PMOS logic versus the number of the inputs.


Fig. 22: The plots of the average energy-delay products according to the conventional CMOS logic and the pseudo-PMOS logic versus the number of the inputs.
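The two products plotted in Figs. 21 and 22 can be computed from the average power and delay as in the following minimal sketch. It assumes the usual definitions, PDP = P avg × t p and EDP = PDP × t p , with t p taken as the mean of the low-to-high and high-to-low delays; the function names and the sample numbers are hypothetical.

```python
def average_delay(t_plh, t_phl):
    """Mean of the low-to-high and high-to-low propagation delays (seconds)."""
    return 0.5 * (t_plh + t_phl)


def power_delay_product(p_avg, t_p):
    """Energy per switching event (joules): average power times average delay."""
    return p_avg * t_p


def energy_delay_product(p_avg, t_p):
    """PDP weighted once more by the delay (joule-seconds)."""
    return power_delay_product(p_avg, t_p) * t_p


# Made-up operating point, not a result from the paper.
t_p = average_delay(t_plh=40e-12, t_phl=60e-12)
pdp = power_delay_product(p_avg=2e-6, t_p=t_p)
edp = energy_delay_product(p_avg=2e-6, t_p=t_p)
print("t_p =", t_p, "s, PDP =", pdp, "J, EDP =", edp, "J*s")
```

Since the EDP weights the delay twice, a family with a shorter delay but a higher power can win on the EDP at a smaller n than on the PDP.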

The average power consumption, the average power-delay products (PDPs), and the average energy-delay products (EDPs) are plotted versus n in Figs. 20, 21, and 22, respectively, for the conventional and proposed logic families. The average power consumption of the CMOS logic rises with n at a faster rate than that of the pseudo-PMOS logic due to the need to charge the internal capacitances of the CMOS logic circuit. The PDP and the EDP of the pseudo-PMOS logic, however, are smaller than their counterparts of the CMOS logic when n exceeds 6 and 5, respectively.


# VIII. CONCLUSIONS

The pseudo-PMOS logic family was adopted for realizing wide fan-in CMOS circuits containing long stacks of NMOS transistors. The area, propagation delay, power consumption, power-delay product, and energy-delay product of this family were compared with those of the conventional CMOS logic. The pseudo-PMOS logic showed superior performance in terms of the average propagation delay, power-delay product, and energy-delay product when the number of the inputs exceeds 3, 6, and 5, respectively. In fact, the pseudo-PMOS logic had a smaller area and average propagation delay but a larger average power consumption and a slightly smaller noise margin (by about 7.8% in the worst case) compared to the conventional CMOS logic. According to the estimation performed in this paper using a proper figure of merit, the pseudo-PMOS logic is better than the CMOS logic when the number of the inputs exceeds 8.


			© 2017 Global Journals Inc. (US)
			Year 2017 F A Pseudo-PMOS Logic for Realizing Wide Fan-in NAND Gates
		
		
# REFERENCES

* K. Martin, Digital Integrated Circuit Design. New York: Oxford University Press, 2000.
* M. M. Khellah and M. I. Elmasry, "Use of Charge Sharing to Reduce Energy Consumption in Wide Fan-In Gates," Proceedings of the IEEE International Symposium on Circuits and Systems, 31 May -3 1998.
* S. C. Prasad and K. Roy, "Transistor Reordering for Power Minimization under Delay Constraint," ACM Transactions on Design Automation Elect. Syst., vol. 1, no. 2, Apr. 1996.
* E. Musoll and J. Cortadella, "Optimizing CMOS Circuits for Low Power Using Transistor Reordering," Proceedings of European Design and Test Conference, 1996.
* C. Tan and J. Allen, "Minimization of Power in VLSI Circuits Using Transistor Sizing, Input Ordering, and Statistical Power Estimation," Proceedings of International Workshop Low-Power Design, 1994.
* W. H. Chiu and H. R. Lin, "A Conditional Isolation Technique for Low-Energy and High-Performance Wide Domino Gates," IEEE Region 10 Conference, 30 Oct. - 2 Nov. 2007.
* H. Mostafa, M. Anis, and M. Elmasry, "Novel Timing Yield Improvement Circuits for High-Performance Low-Power Wide Fan-In Dynamic OR Gates," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 58, no. 8, Aug. 2011.
* A. A. George and A. R. Sankar, "Current Comparison Based High Speed Domino Circuits," National Conference on Science, Engineering, and Technology (NCSET), vol. 4, no. 6, 2016.
* F. Moradi, D. T. Wisland, H. Mahmoodi, and T. V. Cao, "High Speed and Leakage-Tolerant Domino Circuits for High Fan-in Applications in 70nm CMOS Technology," Proceedings of the 7th International Caribbean Conference on Devices, Circuits, and Systems, Mexico, 28-30 Apr. 2008.
* K. Rajasri, M. Manikandan, A. J. Dhanaseely, and M. Nishanthi, "Low Leakage High Speed Domino Circuit For Wide Fan-in Equality Comparator," International Journal of Advanced Research in Electronics and Communication Engineering (IJARECE), vol. 4, no. 3, Mar. 2015.
* G. N. Anamika and Chiranjeevi, "Low Power Wide Fan In Domino OR Logics," International Journal of Current Engineering and Scientific Research (IJCESR), vol. 3, 2016.
* X. Kavousianos and D. Nikolos, "Novel Single and Double Output TSC Berger Code Checkers," VLSI Test Symposium, 1998.
* C. Metra, M. Favalli, and B. Ricco, "Tree Checkers for Applications with Low Power-Delay Requirements," Proceedings of International Symposium on Defect and Fault Tolerance VLSI Systems, 1996.
* B. Razavi, Design of Analog CMOS Integrated Circuits, Second Edition. New York: McGraw-Hill, 2016.
* H. Shichman and D. Hodges, "Modeling and Simulation of Insulated-Gate Field-Effect Transistor Switching Circuit," IEEE Journal of Solid-State Circuits, vol. 13, no. 3, Sep. 1968.
* A. S. Sedra and K. C. Smith, Microelectronic Circuits, Seventh Edition. New York: Oxford University Press, 2015.
* N. H. E. Weste and D. M. Harris, CMOS VLSI Design: A Circuits and Systems Perspective, Fourth Edition. Massachusetts, USA: Addison-Wesley, 2011.
* D. A. Hodges, H. G. Jackson, and R. A. Saleh, Analysis and Design of Digital Integrated Circuits: In Deep Submicron Technology, Third Edition. Singapore: McGraw Hill, 2004.
* W. C. Elmore, "The Transient Response of Damped Linear Networks with Particular Regard to Wideband Amplifiers," Journal of Applied Physics, vol. 19, Jan. 1948.
* J. E. Ayers, Digital Integrated Circuits: Analysis and Design. Boca Raton, USA: CRC Press, 2005.
* Predictive Technology Model (PTM).
* "Ordering," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 12, Nov. 2004.