A 28nm High-k/Metal-gate Symmetric 10T 2RW Dual-port SRAM bitcell design

Tien Yu Lu1, Chun Hsien Huang1, Shou Sian Chen1, Yu Tse Kuo1, Ching Cheng Lung1, Osbert Cheng1, Yuichiro Ishii2, Miki Tanaka2, Makoto Yabuuchi2, Yohei Sawada2, Shinji Tanaka2 and Koji Nii2
1 United Microelectronics Corporation (UMC), Advanced Technology Development Division, Nanke 3rd Rd., Tainan Science Park, Tainan, Taiwan.
2 Renesas Electronics Corporation, 5-20-1, Josuihon-cho, Kodaira, Tokyo, 187-8588, Japan

Abstract

We propose a highly symmetrical 10T 2-read/write (2RW) dual-port (DP) SRAM bitcell in 28-nm high-k/metal-gate planar bulk CMOS. It replaces the conventional 8T 2RW DP SRAM bitcell without any area overhead. It significantly improves robustness against process variations and reduces the asymmetry between the true and bar bitline pairs. Measured data show that the read current (\( I_{\text{read}} \)) and SNM are boosted by 20% and 15 mV, respectively, by introducing the proposed bitcell with enlarged pull-down (PD) and pass-gate (PG) NMOSs. The measured \( V_{\text{min}} \) of the proposed 256-kbit 10T DP SRAM is 0.53 V at the TT process corner and 25°C under the worst access condition with read/write disturbances, an improvement of 90 mV (15%) over the conventional one.

1. Introduction

In deep-submicron SoC devices, robust design under process variations has become essential. Such SoC products require a variety of embedded memories for many kinds of applications. Embedded dual-port (DP) SRAMs [1-8], as well as single-port SRAMs, play an important role as shared caches in many-core high-performance computing for datacenters and as buffer memories for high-speed communication and image processing. However, DP SRAM has inherent disturbance issues whenever an access conflict occurs between the two ports [1,2]. In this paper, we propose a highly symmetrical DP SRAM bitcell design without any area overhead. The minimum operating voltage (\( V_{\text{min}} \)) is improved under the worst access conditions with read/write disturbances, enhancing the robustness against device variations.

2. Dual-Port Cell Design

Fig. 1 shows scaling trends of embedded 2RW DP SRAM bitcell size. The bitcell area has been roughly halved at each node as the technology scales. In advanced 28-nm high-k/metal-gate (HKMG) planar bulk CMOS technology, the proposed 2RW DP bitcell size is 0.315 \( \mu\text{m}^2 \), which is the same as the conv. one [8]. Fig. 2 shows the schematic of the prop. 10T DP SRAM bitcell. Fig. 3 depicts SEM photos and layout plots of both the conv. 8T and prop. 10T bitcells in 28-nm. In the conv. 8T bitcell, the outside BL-A and BLB-B take a longer current transmission path through the metal-gate wiring. These paths have higher resistance than the short current paths of the inside BL-B and BLB-A, so the conv. 8T bitcell suffers from a read current (\( I_{\text{read}} \)) mismatch between the BL pairs. Diffusion rounding and other layout-dependent effects (LDE), such as STI stress, induce an extra \( I_{\text{read}} \) mismatch. This undesirable \( I_{\text{read}} \) mismatch can degrade \( V_{\text{min}} \) or incur a speed overhead because extra sense timing margin is needed. In the prop. 10T bitcell, on the other hand, each pull-down (PD) NMOS is divided into two parallel devices with straight diffusion shapes for a lithography-friendly design, as shown in Fig. 3(b). The gate widths of both the PD and pass-gate (PG) NMOSs can be enlarged compared to the conv. 8T bitcell, which leaves room for fine tuning. The prop. 10T symmetric layout provides short and equidistant current transmission paths for both the A- and B-port BL pairs, reducing the \( I_{\text{read}} \) mismatch.

![Fig. 1 Scaling trends of 2RW DP SRAM bitcell size.](image1)

![Fig. 2 Schematic of prop. 10T 2RW DP SRAM bitcell.](image2)

![Fig. 3 SEM photos and layout plots of 28-nm DP SRAM bitcells: a) Conv. 8T DP bitcell, b) Prop. 10T DP bitcell.](image3)

Fig. 5 plots the distributions of the measured average \( I_{\text{read}} \) and read static noise margin (SNM). Measured data show that \( I_{\text{read}} \) in the prop. bitcell is boosted by 20%. Here, each plotted \( I_{\text{read}} \) is the minimum of the average currents of BL-A, BLB-A, BL-B, and BLB-B. The measured SNM of the prop. 10T bitcell is also improved by 15 mV thanks to the symmetrical layout.
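
The per-bitcell figure of merit described above (the minimum of the four per-bitline average read currents) can be sketched as follows. This is only an illustrative sketch: the function name `port_iread_summary`, the mismatch definition, and every current value in `example_ua` are hypothetical and are not measurements from this work.

```python
from statistics import mean


def port_iread_summary(samples_ua):
    """Summarize read currents for the four bitlines of a DP bitcell.

    samples_ua maps a bitline name to a list of read-current samples (uA).
    Returns (worst bitline, its average current, mismatch in percent),
    where mismatch is (best average - worst average) / best average.
    """
    averages = {bl: mean(vals) for bl, vals in samples_ua.items()}
    worst_bl = min(averages, key=averages.get)
    worst_avg = averages[worst_bl]
    best_avg = max(averages.values())
    mismatch_pct = 100.0 * (best_avg - worst_avg) / best_avg
    return worst_bl, worst_avg, mismatch_pct


# Hypothetical sample values (uA), chosen only to mimic a layout in which
# the outer bitlines (BL-A, BLB-B) see a longer, more resistive path.
example_ua = {
    "BL-A": [18.0, 18.4, 17.6],
    "BLB-A": [20.0, 20.2, 19.8],
    "BL-B": [20.1, 19.9, 20.0],
    "BLB-B": [18.3, 18.1, 18.2],
}

worst_bl, worst_avg, mismatch_pct = port_iread_summary(example_ua)
print(f"worst bitline: {worst_bl}, I_read = {worst_avg:.1f} uA, "
      f"mismatch = {mismatch_pct:.1f} %")
```

Taking the minimum rather than the mean makes the metric sensitive to the weakest bitline, which is the one that limits sense timing.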

Fig. 4 Measured average \( I_{\text{read}} \) of 25 skew wafers for both A- and B-port BL pairs in 28-nm HKMG planar bulk CMOS: (a) conv. 8T DP SRAM bitcell, (b) prop. 10T DP SRAM bitcell.

3. Test Chip Design and Measurement

Fig. 6 is a die photograph of a test chip fabricated in 28-nm HKMG bulk CMOS technology. Eight 32-kbit DP SRAM macros using the prop. symmetric 10T DP bitcell are implemented. Each 32-kbit macro is organized as 1024 words \( \times \) 32 bits with a 4-column multiplexer (mux=4), and the physical macro size is 183.2 \( \mu \)m \( \times \) 85.8 \( \mu \)m. The total capacity of the DP SRAM macros on a die is 256 kbit. We observed full read/write functionality of the 256-kbit DP SRAM at temperatures from -40°C to 125°C.
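
As a quick consistency check on the macro figures quoted above, the following sketch derives the array organization and bit density from the stated word count, word width, column mux, and macro dimensions. The "array efficiency" value is a derived illustration and is not a number reported in this work; the variable names, and the assumption that 1 Mbit means 2^20 bits, are ours.

```python
# Figures quoted in the text.
WORDS = 1024
BITS_PER_WORD = 32
COLUMN_MUX = 4
MACRO_W_UM = 183.2
MACRO_H_UM = 85.8
BITCELL_AREA_UM2 = 0.315
MACROS_PER_DIE = 8

# Array organization implied by the column mux.
bits_per_macro = WORDS * BITS_PER_WORD    # 32 kbit
rows = WORDS // COLUMN_MUX                # wordlines
columns = BITS_PER_WORD * COLUMN_MUX      # bitline-pair columns per port
total_kbit = MACROS_PER_DIE * bits_per_macro // 1024

# Bit density; 1 mm^2 = 1e6 um^2, and 1 Mbit is assumed to be 2**20 bits.
macro_area_um2 = MACRO_W_UM * MACRO_H_UM
density_mbit_mm2 = bits_per_macro / macro_area_um2 * 1e6 / 2**20

# Fraction of the macro area occupied by bitcells (derived, not reported).
array_efficiency = bits_per_macro * BITCELL_AREA_UM2 / macro_area_um2

print(f"{rows} rows x {columns} columns, {total_kbit} kbit per die")
print(f"density = {density_mbit_mm2:.2f} Mbit/mm^2")
print(f"array efficiency = {array_efficiency:.1%}")
```

Under these assumptions the density comes out close to the 2.0 Mbit/mm² listed in Table I, and roughly two thirds of the macro area is bitcell array.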

Fig. 5 Measured average \( I_{\text{read}} \) and average SNM/WRM for 25 skew wafers. (a) \( I_{\text{read}} \) (conv. 8T) and \( I_{\text{read}} \) (prop. 10T) vs. average \( I_{\text{read}} \) (ref. 8T), (b) SNM vs. WRM for both conv. 8T and prop. 10T.

Fig. 6 Photograph of a test chip and layout plot of 32-kbit DP SRAM macro using prop. symmetric 10T DP SRAM bitcell.

Fig. 7 shows cumulative distribution functions (CDFs) of \( V_{\text{min}} \) at 25°C. The total number of measured dies is 15. The median \( V_{\text{min}} \) of the prop. DP SRAM at the TT process corner is 0.53 V under the worst access condition with read/write disturbances, an improvement of 90 mV (15%) over the conv. one, achieved by introducing the new symmetrical 10T DP bitcell. Fig. 8 shows the distribution of the measured standby leakage power of each 256-kbit macro at the typical voltage of 1.0 V and 25°C. There is no difference between the conv. and prop. designs, and no tail failures were observed. Test chip features are summarized in Table I.
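
The quoted \( V_{\text{min}} \) figures can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 90 mV improvement is measured against the conv. median and that the percentage is taken relative to that conv. value; both are our reading of the text, not something it states explicitly.

```python
VMIN_PROP_V = 0.53     # measured median V_min of the prop. 10T macro
IMPROVEMENT_V = 0.090  # quoted improvement over the conv. 8T macro

# Conv. median V_min implied by the two quoted numbers.
vmin_conv_v = VMIN_PROP_V + IMPROVEMENT_V

# Relative improvement, taken with respect to the conv. value (assumption).
relative_improvement = IMPROVEMENT_V / vmin_conv_v

print(f"implied conv. V_min = {vmin_conv_v:.2f} V")
print(f"relative improvement = {relative_improvement:.1%}")
```

With these inputs the implied conv. median is about 0.62 V and the relative improvement is about 14.5%, which rounds to the quoted 15%.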

Fig. 7 Measured \( V_{\text{min}} \) distributions of DP SRAM macros under 1-port, read-disturbance (R-dist.), and write-disturbance (W-dist.) access modes at process-TT, 1.0V, 25°C.

Fig. 8 Measured standby leakage power of 256-kbit DP SRAM macros at process-TT, 1.0V, 25°C.

Table I: Features of the test chip.

<table>
<thead>
<tr>
<th>Bitcell</th>
<th>Process</th>
<th>Capacity</th>
<th>Physical macro size @ 32-kbit (density)</th>
<th>Bitcell size</th>
<th>\( V_{\text{min}} \) @ 256-kbit, 25°C (median)</th>
<th>Leakage @ 256-kbit, 25°C (median)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Conv. 8T</td>
<td rowspan="2">28-nm HKMG planar bulk CMOS</td>
<td rowspan="2">256 kbit (8 \( \times \) 32-kbit macros)</td>
<td rowspan="2">183.2 \( \mu \)m \( \times \) 85.8 \( \mu \)m (2.0 Mbit/mm²)</td>
<td rowspan="2">1.296 \( \mu \)m \( \times \) 0.243 \( \mu \)m (0.315 \( \mu \)m²)</td>
<td>0.62 V</td>
<td>42.8 \( \mu \)A</td>
</tr>
<tr>
<td>Prop. 10T</td>
<td>0.53 V</td>
<td>41.4 \( \mu \)A</td>
</tr>
</tbody>
</table>

4. Conclusions

A highly symmetrical 10T 2RW DP SRAM bitcell was proposed in 28-nm HKMG planar bulk CMOS technology. \( I_{\text{read}} \) and SNM were boosted by 20% and 15 mV, respectively, compared to the conv. 8T bitcell without any area overhead. The mismatch between BL pairs was significantly reduced in both ports. A test chip including a 256-kbit 10T DP SRAM was successfully demonstrated. The measured \( V_{\text{min}} \) was improved by 15% thanks to the symmetrical bitcell layout design.

Acknowledgements

We would like to express our sincere thanks to all the contributors from the Renesas SRAM design team, the Renesas System Design test team, the UMC shuttle service staff, and the UMC SRAM development team.

References