## Comparison of 1/4-Micron-Gate Fully-Depleted CMOS/SIMOX and Bulk Gate Arrays for Low-Voltage, Low-Power Applications

Y. Kado, H. Inokawa, K. Nishimura, Y. Okazaki, M. Sato, T. Ohno, T. Tsuchiya, M. Ino, K. Takeya, and T. Sakai

NTT LSI Laboratories, 3-1 Morinosato Wakamiya, Atsugi-shi, Kanagawa, 243-01 Japan

1/4-um-gate ultrathin-film SIMOX and bulk device technologies with 1.4-um-pitch fully-planarized four-level metallization have been applied to a 40-KG CMOS gate array. A 48 x 48-bit multiplier embedded in the CMOS/SIMOX gate array showed superior performance in low-voltage operation compared with its bulk CMOS counterpart. The origin of the speed gain and power reduction in the SIMOX circuits is verified through dynamic analysis using a test structure built on the gate array.

#### 1. INTRODUCTION

The low parasitic capacitance of SOI devices makes them very attractive for low-voltage, low-power applications, because reducing parasitic capacitance is key to enhancing the performance of scaled logic LSIs at low supply voltages of 1-2 V. This paper demonstrates the greatly improved performance of a 48 x 48-bit multiplier embedded in a CMOS/SIMOX gate array compared with its bulk CMOS counterpart. To investigate the origin of the speed gain and power reduction, the supply voltage dependencies of the delay and parasitic capacitance components in the CMOS/SIMOX circuits are compared with those of the bulk circuits.

#### 2. GATE ARRAY FABRICATION

We fabricated 40-KG CMOS gate arrays using 1/4-um-gate ultrathin-film SIMOX [1,2] and bulk device [3] technologies with 1.4-um-pitch fully-planarized four-level metallization. For the SIMOX gate array, a low-dose 6-inch SIMOX wafer with a 90-nm-thick buried oxide layer was used [4]. Fully-depleted N+ poly-Si gate NMOSFETs and P+ poly-Si gate PMOSFETs with 5-nm-thick gate oxides were built on 50-nm-thick SIMOX films. The key device parameters of the SIMOX and bulk devices used in the gate arrays are summarized in Table 1. To compare circuit performance at very low supply voltages of 1 to 2 V, samples with low threshold voltages were selected. A 48 x 48-bit multiplier with a Wallace-tree configuration, together with test structures for estimating the switching speed of logic gates, was implemented in the SIMOX and bulk CMOS gate arrays with the same mask set. KrF-excimer-laser lithography was employed for all patterning steps. A chip photograph of the multiplier fabricated on the gate array is shown in Fig. 1.

## 3. MULTIPLIER PERFORMANCE AT LOW VDD

Table 1. Key Device Parameters

| Parameter        | Unit   | SIMOX (NMOS/PMOS) | Bulk (NMOS/PMOS) |
|------------------|--------|-------------------|------------------|
| Lp               | um     | 0.24/0.24         | 0.24/0.30        |
| Wp               | um     | 8.05/12.25        | 8.05/12.25       |
| Tox              | nm     | 5                 | 5                |
| Tsi              | nm     | 50                |                  |
| Tbox             | nm     | 90                |                  |
| Vth (Vds = 2 V)  | V      | 0.15/-0.25        | 0.14/-0.26       |
| S (Vds = 2 V)    | mV/dec | 65/71             | 76/99            |
| Gm (Vds = 2 V)   | mS/mm  | 247/149           | 345/185          |
| Idsat (Vds = 2 V)| uA/um  | 374/203           | 523/191          |

Fig. 1. Chip photograph of a 48-bit multiplier embedded in a 40-KG CMOS/SIMOX gate array LSI of the sea-of-gates type. Chip size is 4 mm x 4 mm.

Fig. 2. Measured multiplication time and power dissipation for 48-bit multipliers embedded in CMOS/SIMOX and bulk gate arrays.

Figure 2 shows the relationship between the measured multiplication time and power consumption for the multiplier built on gate arrays using CMOS/SIMOX and bulk CMOS technologies. As the supply voltage (VDD) decreases, the speed advantage of the SIMOX circuit over the bulk-Si circuit becomes more pronounced: speed gains of 63 % and 30 % were obtained at VDDs of 1.0 and 1.5 V, respectively. The active power consumption of the SIMOX circuit is smaller than that of the bulk circuit for VDDs up to 2 V. At 1 V, the SIMOX multiplier dissipates only 8.4 mW, about 75 % of the 11.2 mW dissipated by the bulk multiplier. At VDDs below 1.8 V, the power-multiplication-time product is reduced by more than 50 % compared with the bulk circuit.
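As a quick arithmetic check on the figures quoted at VDD = 1 V, the following sketch recomputes the power ratio and the ratio of the power-multiplication-time products. The text does not define "speed gain", so the code evaluates two plausible readings. Both readings, and the function name `pdp_ratio`, are assumptions made for illustration and are not taken from the paper.

```python
def pdp_ratio(power_ratio, speed_gain, gain_is_speedup=True):
    """SIMOX/bulk ratio of the power x multiplication-time product.

    speed_gain is a fraction (0.63 for 63 %). Its definition is an assumption:
      gain_is_speedup=True  -> t_bulk  = (1 + gain) * t_simox
      gain_is_speedup=False -> t_simox = (1 - gain) * t_bulk
    """
    if gain_is_speedup:
        time_ratio = 1.0 / (1.0 + speed_gain)
    else:
        time_ratio = 1.0 - speed_gain
    return power_ratio * time_ratio


# Values quoted in the text for VDD = 1.0 V.
P_SIMOX_MW = 8.4
P_BULK_MW = 11.2
SPEED_GAIN = 0.63

power_ratio = P_SIMOX_MW / P_BULK_MW
ratio_speedup = pdp_ratio(power_ratio, SPEED_GAIN, gain_is_speedup=True)
ratio_reduction = pdp_ratio(power_ratio, SPEED_GAIN, gain_is_speedup=False)

print(f"power ratio (SIMOX/bulk): {power_ratio:.2f}")
print(f"product ratio, speed-up reading: {ratio_speedup:.2f}")
print(f"product ratio, delay-reduction reading: {ratio_reduction:.2f}")
```

Under either reading the product for the SIMOX circuit comes out below half of the bulk value, which is consistent with the stated reduction of more than 50 %.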

## 4. DYNAMIC ANALYSIS

#### A. Delay Time Components

To clarify the origin of the speed gain and power reduction, we analyzed the delay and parasitic capacitance components in the CMOS/SIMOX and bulk circuits. The total delay time (Td) of a loaded logic gate is approximated by $Td = T0 + n\cdot T1 + l\cdot T2$, where T0 is the intrinsic gate delay, n is the fanout number, T1 is the delay increment per unit fanout, l is the wiring length, and T2 is the delay increment per unit wiring length (1 mm). This approximation is valid when the effect of wiring resistance on Td is negligible. Each delay component was obtained by using a test structure built on the gate array, shown in Fig. 3, and comparing the delay per gate as functions of n and l.

Figure 4 compares the VDD dependence of the delay time for a 2-input NAND gate under typical loading conditions for the two technologies. As the supply voltage is lowered, the difference in delay time between the two structures increases further. Separating the delay time into its three components shows that the difference in total delay is mainly due to an increase in the T0 of the bulk device.
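The component extraction described above can be sketched as follows, assuming three delay measurements that differ from a baseline in fanout only and in wire length only. The function names and the numerical values are hypothetical placeholders, not measured data from the test structure.

```python
def total_delay(t0, t1, t2, fanout, wire_mm):
    """Td = T0 + n*T1 + l*T2 (valid when wiring resistance is negligible)."""
    return t0 + fanout * t1 + wire_mm * t2


def extract_components(base, more_fanout, more_wire):
    """Recover (T0, T1, T2) from three (fanout, wire_mm, delay) points.

    more_fanout must differ from base in fanout only,
    more_wire must differ from base in wire length only.
    """
    n0, l0, d0 = base
    n1, l1, d1 = more_fanout
    n2, l2, d2 = more_wire
    if l1 != l0 or n1 == n0:
        raise ValueError("more_fanout must change only the fanout")
    if n2 != n0 or l2 == l0:
        raise ValueError("more_wire must change only the wire length")
    t1 = (d1 - d0) / (n1 - n0)    # slope of delay versus fanout
    t2 = (d2 - d0) / (l2 - l0)    # slope of delay versus wire length
    t0 = d0 - n0 * t1 - l0 * t2   # what remains is the intrinsic delay
    return t0, t1, t2


# Placeholder components in ps, ps/fanout, ps/mm (NOT values from the paper).
TRUE_T0, TRUE_T1, TRUE_T2 = 50.0, 20.0, 100.0
points = [(1, 0.0), (3, 0.0), (1, 1.0)]
measurements = [
    (n, l, total_delay(TRUE_T0, TRUE_T1, TRUE_T2, n, l)) for n, l in points
]

t0, t1, t2 = extract_components(*measurements)
print(t0, t1, t2)
```

With more than three measurement points, a least-squares fit of the same linear model would be the natural extension.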



Fig. 4. Comparison of the VDD dependence of delay time for a 2-input NAND gate under typical loading conditions using CMOS/SIMOX and bulk-CMOS technologies. The 1-mm-long wire consists of 1/3 mm of first-layer metal, 1/3 mm of second-layer metal, and 1/3 mm of third-layer metal. These wires run over unused basic cells because the gate array LSIs are of the sea-of-gates type.

#### B. Effective Parasitic Capacitance Components

To investigate the VDD dependence of the parasitic capacitance components, the effective parasitic capacitance CL is defined as $CL = T\cdot(Idsn + Idsp)/(2\cdot VDD)$. Here, T is a delay component, VDD is the supply voltage, and Idsn and Idsp are the drain saturation currents of the NMOS and PMOS devices when |Vds| = |Vgs| = VDD. The total parasitic capacitance is thus divided into three components, CL0, CL1, and CL2, corresponding to T0, T1, and T2. Figure 5 shows how each delay and parasitic capacitance component varies with VDD for the SIMOX and bulk CMOS circuits.
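The definition of CL can be illustrated with a short sketch. The saturation currents below are derived from the SIMOX Idsat and width entries in Table 1, under the assumptions that the table values were taken at |Vgs| = |Vds| = 2 V and that the gate under test uses those widths. The 20-ps delay is a hypothetical placeholder, not a measured component.

```python
def effective_capacitance_fF(delay_ps, idsn_mA, idsp_mA, vdd_V):
    """CL = T * (Idsn + Idsp) / (2 * VDD).

    With T in ps and currents in mA, the result is in fF
    (1 ps * 1 mA / 1 V = 1 fF).
    """
    return delay_ps * (idsn_mA + idsp_mA) / (2.0 * vdd_V)


# SIMOX entries from Table 1: Idsat in uA/um and width in um.
IDSAT_N_UA_PER_UM, WIDTH_N_UM = 374.0, 8.05
IDSAT_P_UA_PER_UM, WIDTH_P_UM = 203.0, 12.25

idsn_mA = IDSAT_N_UA_PER_UM * WIDTH_N_UM / 1000.0
idsp_mA = IDSAT_P_UA_PER_UM * WIDTH_P_UM / 1000.0

HYPOTHETICAL_DELAY_PS = 20.0  # placeholder, not a value from the paper
cl_fF = effective_capacitance_fF(HYPOTHETICAL_DELAY_PS, idsn_mA, idsp_mA, 2.0)
print(f"effective capacitance: {cl_fF:.2f} fF")
```

Because CL scales with both the delay and the drive current, a circuit with lower saturation current needs a proportionally larger capacitance reduction to reach the same delay.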

The intrinsic gate capacitance (CL0) is composed of the gate oxide capacitance (Cg) and the reverse-biased drain junction capacitance (Cjd) of the MOSFETs themselves. For T0 and CL0, the SIMOX structure exhibits attractive behavior for low-voltage operation. The CL0 of the SIMOX device shows only a slight increase as VDD decreases, because the junction area in the ultrathin-film SOI device is markedly reduced. In contrast, the CL0 of the bulk device is almost inversely proportional to the square root of (VDD + Vbi), where Vbi is the built-in potential of the junction. This means that Cjd is the dominant component in the CL0 of the bulk device. The capacitance ratio [CL0(SIMOX)/CL0(BULK)] falls to 0.58 at VDD = 1.2 V. This reduction is the origin of the 38 % intrinsic gate speed gain in SIMOX devices at a VDD of 1.2 V.
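The square-root dependence noted for the bulk device matches the textbook abrupt-junction depletion capacitance, Cj = Cj0 / sqrt(1 + V/Vbi). The sketch below uses that model, with VDD taken as the reverse bias, to show how much a junction-dominated capacitance grows when VDD drops from 2 V to 1 V. The built-in potential of 0.9 V and the normalized Cj0 are assumed placeholder values, not parameters reported in the paper.

```python
import math


def junction_capacitance(cj0, v_reverse, vbi):
    """Abrupt-junction depletion capacitance: Cj = Cj0 / sqrt(1 + V/Vbi).

    Equivalent to Cj proportional to 1 / sqrt(V + Vbi).
    """
    return cj0 / math.sqrt(1.0 + v_reverse / vbi)


ASSUMED_VBI_V = 0.9   # placeholder built-in potential
ASSUMED_CJ0 = 1.0     # normalized zero-bias capacitance

c_at_2v = junction_capacitance(ASSUMED_CJ0, 2.0, ASSUMED_VBI_V)
c_at_1v = junction_capacitance(ASSUMED_CJ0, 1.0, ASSUMED_VBI_V)
growth = c_at_1v / c_at_2v

print(f"junction capacitance growth from 2 V to 1 V: {growth:.3f}x")
```

The growth factor depends only on Vbi and the two voltages, so the normalization of Cj0 does not affect it.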

In contrast to CL0, the differences in the fanout delay (T1) and the corresponding load capacitance (CL1) between the two structures are small. For wiring, the capacitance ratio [CL2(SIMOX)/CL2(BULK)] is about 0.75 and is nearly constant for VDDs from 1.0 V to 2.2 V. The 25 % reduction in SIMOX is mainly caused by reduced capacitance in the first-layer metal. This reduction is due to the underlying buried oxide layer and the substrate depletion layer below it when the substrate is grounded.

In the 48-bit multiplier we fabricated, the average capacitive load per inner logic gate corresponds to a fanout of 2.85 and 0.414 mm of wiring. The active power reduction in the SIMOX multiplier can be estimated from the ratio CLava(SIMOX)/CLava(Bulk), where CLava is given by $CLava = CL0 + 2.85\cdot CL1 + 0.414\cdot CL2$. This ratio is calculated to be 73-79 % for VDDs from 1.0 V to 1.5 V, in agreement with the measured ratio of 75-83 % shown in Fig. 2.
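The structure of this estimate can be sketched as follows. The bulk component values are hypothetical placeholders chosen only to make the example run, so the printed ratio is not expected to reproduce the 73-79 % reported above. The SIMOX components are scaled using the ratios quoted in the text (0.58 for CL0 at 1.2 V and 0.75 for CL2), and the "small" CL1 difference is modeled as a ratio of 1.0, which is an assumption.

```python
def average_load(cl0, cl1, cl2, fanout=2.85, wire_mm=0.414):
    """CLava = CL0 + fanout * CL1 + wire_mm * CL2."""
    return cl0 + fanout * cl1 + wire_mm * cl2


# Hypothetical bulk components in fF, fF/fanout, fF/mm (NOT from the paper).
BULK_CL0, BULK_CL1, BULK_CL2 = 20.0, 15.0, 150.0

# SIMOX/bulk component ratios: CL0 and CL2 quoted in the text, CL1 assumed.
RATIO_CL0, RATIO_CL1, RATIO_CL2 = 0.58, 1.0, 0.75

bulk_load = average_load(BULK_CL0, BULK_CL1, BULK_CL2)
simox_load = average_load(
    BULK_CL0 * RATIO_CL0, BULK_CL1 * RATIO_CL1, BULK_CL2 * RATIO_CL2
)
load_ratio = simox_load / bulk_load

print(f"CLava(SIMOX)/CLava(Bulk) with placeholder values: {load_ratio:.2f}")
```

The overall ratio is a weighted average of the three component ratios, so it always lies between the smallest and the largest of them. How close it sits to the CL0 ratio depends on how much of the load is intrinsic rather than fanout or wiring.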

These results also suggest that the CMOS/SIMOX structure provides its best performance at low supply voltages, especially in typical logic LSIs such as ASICs and in high-speed circuits with a small capacitive load, such as frequency dividers, prescalers, multiplexers, and demultiplexers. In the former case, wiring capacitance (Cw) is the main component of the load because automated design produces long wire routes, so reducing Cw enhances performance. In the latter case, using large transistors (so that CL0 > CL1 + CL2) is an effective way to increase the speed advantage over bulk circuits.

## 5. CONCLUSIONS

In conclusion, 1/4-um-gate ultrathin-film SIMOX and bulk device technologies with 1.4-um-pitch fully-planarized four-level metallization have been applied to a 40-KG CMOS gate array. Fully functional operation of a 48-bit multiplier embedded in the CMOS/SIMOX gate array was achieved, with a reduction of more than 50 % in the power-multiplication-time product compared with bulk circuits at VDDs below 1.8 V. Dynamic analysis shows that the speed gain and power reduction in the SIMOX circuit originate from reductions in intrinsic gate capacitance and wiring capacitance. These results indicate that 1/4-um-gate fully-depleted CMOS/SIMOX technology is a very attractive alternative for achieving high-performance logic LSIs for low-voltage, low-power applications.

Fig. 5. Comparison of the VDD dependence of delay and parasitic capacitance components for CMOS/SIMOX and bulk-CMOS circuits.

## REFERENCES

[1] T. Ohno et al., '93 Symp. VLSI Tech. Dig., p. 25, 1993.

[2] Y. Kado et al., '94 IEDM Tech. Dig., p. 665, 1994.

[3] M. Miyake et al., IEEE Trans. Electron Devices, vol. ED-36, p. 392, 1989.

[4] S. Nakashima et al., Electron. Lett., vol. 26, p. 1647, 1990.