# Fine-Grain Power-Gating Scheme of a CMOS/MTJ-Hybrid Bit-Serial Ternary Content-Addressable Memory

Shoun Matsunaga<sup>1</sup>, Atsushi Matsumoto<sup>1</sup>, Masanori Natsui<sup>1</sup>, Tetsuo Endoh<sup>2</sup>, Hideo Ohno<sup>3</sup>, and Takahiro Hanyu<sup>1</sup>

 <sup>1</sup>Research Institute of Electrical Communication (RIEC), Tohoku University 2-1-1 Katahira, Aoba-ku, Sendai 980-8577, JAPAN
Phone: +81-22-217-5508, E-mail: zhao-yun@ngc.riec.tohoku.ac.jp
<sup>2</sup>Center for Interdisciplinary Research, CIR, Tohoku University Aramaki aza Aoba 6-3, Aoba-ku, Sendai 980-8578, JAPAN
<sup>3</sup>Laboratory for Nanoelectronics and Spintronics, RIEC, Tohoku University 2-1-1 Katahira, Aoba-ku, Sendai 980-8577, JAPAN

## 1. Introduction

The drastic increase in static power dissipation due to leakage current is one of the most serious problems in recent nano-scaled VLSI [1]. One possible solution is to use CMOS/nonvolatile-device-hybrid logic-in-memory circuitry, where nonvolatile storage elements are distributed over a logic-circuit plane, and to cut off the power supply of circuit blocks whenever they are in the standby mode [2]. This is possible because the memory data are stored in nonvolatile devices within the circuit blocks. To take full advantage of such a circuit structure, it is important to use a nonvolatile device with superior capabilities such as short access time, unlimited endurance, scalable write, and dimensions comparable to those of the employed CMOS technology. At present, the only available candidate that could satisfy all of these requirements is a spin-injection-writable magnetic tunnel junction (MTJ) device.

So far, we have presented various CMOS/MTJ-hybrid logic-in-memory circuits [3-5] and confirmed the potential of the proposed architecture for designing low-power, high-performance VLSIs.

In this paper, we present an ultra-low-power ternary content-addressable memory (TCAM) using a fine-grain power-gating scheme as a typical application of the above circuit architecture.

## 2. Impact of CMOS/MTJ-Hybrid Nonvolatile Logic-in-Memory Circuitry

Fig. 1 shows the structure and symbol of an MTJ device, which consists of a synthetic ferrimagnetic (SyF) free layer, an MgO tunnel barrier, and a fixed layer [4, 5]. Depending on the spin (magnetization) direction of the free layer with respect to that of the fixed layer, the MTJ device takes one of two distinct resistance states: a low resistance R<sub>P</sub> when the spin directions are parallel and a high resistance R<sub>AP</sub> when they are anti-parallel. Thus, the MTJ device can be regarded as a variable resistor, which means that it has not only a nonvolatile storage capability but also a switching capability for building a logic device that operates in accordance with the stored data. In the nonvolatile logic-in-memory circuit, a nonvolatile storage function and a logic function are merged into the MTJ device, and the stored logic value does not disappear even if the power supply is cut off. Therefore, no data transfer to or from the MTJ devices is required before power-off or after power-on, which results in quick sleep/wake-up behavior and low power dissipation in the VLSI chip, as shown in Fig. 2.

## 3. CMOS/MTJ-Hybrid Bit-Serial TCAM Based on Fine-Grain Power-Gating Scheme

Fig. 3 shows the overall structure of the proposed bit-serial TCAM. The TCAM consists of word circuits including a TCAM cell array, sense amplifiers (SAs), and accumulators (ACCs), together with peripheral circuits such as search-line, word-line, bit-line, and output drivers. A word-parallel, bit-serial equality-search operation is performed between a search word (input key) and every word stored in the TCAM cell array. The match result of each linear array of TCAM cells is amplified by the corresponding SA. The amplified match result and the previous match result are then accumulated by the corresponding ACC.

Fig. 4(a) shows the proposed word circuit with a fine-grain power-gating capability. Whenever a new input key is applied and a new equality-search operation starts, the output of the word circuit is initialized to a high voltage level by the INIT signal, and the power switches (PMOS transistors) SW<sub>1</sub>, SW<sub>2</sub>, and SW<sub>3</sub> between V<sub>DD</sub> and the match line (ML)/SA in the word circuit are turned on. The equality-search operation is parallel by word and serial by bit-slice: in each cycle, a bit-level equality-search operation is performed on a single bit-slice. As long as each input bit is equal to the corresponding stored bit, the output of the SA remains at a high voltage level. Once a mismatch between an input bit and a stored bit is detected in the sequence of bit-level equality-search operations, the output of the SA goes to a low voltage level. As a result, the output of the ACC also goes to a low voltage level, and the SLEEP signal goes to a high voltage level to turn off the power switches between V<sub>DD</sub> and the ML/SA. From then until the next input key is applied, the power supply of the word circuit (except the ACC) is cut off in order to suppress standby power dissipation. The power switches SW<sub>1</sub>, SW<sub>2</sub>, and SW<sub>3</sub> can be turned off easily because the linear array of the proposed TCAM cells [5] has no power-supply line, as shown in Fig. 4, and the MTJ devices retain their data without power. In a CMOS-based TCAM cell, by contrast, the power supply cannot be cut off even in the standby mode, because the stored data must be maintained in volatile memory elements (SRAMs).

The proposed TCAM cell has two storage elements to store a ternary value B ("0", "1", or "X", where "X" denotes the "don't care" state used for masked equality search). Each value is encoded as two-bit data (b<sub>1</sub>' and b<sub>2</sub>'), as shown in Fig. 4(b). The data are written into each MTJ device in advance by activating the write-enable signal WE, the word-line signals WLs, and the dual-rail bit-line signals BL and BL'.
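The behavior of a single ternary cell can be sketched in software as follows. This is only an illustrative model: the two-bit assignment in `ENCODE` is a hypothetical choice made for this sketch, since the actual (b<sub>1</sub>', b<sub>2</sub>') encoding is defined in Fig. 4(b) and may differ, and the function name `cell_match` is likewise an assumption.

```python
# Hypothetical two-bit encoding of the ternary value B.
# The real (b1', b2') assignment is given in Fig. 4(b) and may differ.
ENCODE = {"0": (0, 1), "1": (1, 0), "X": (0, 0)}


def cell_match(stored, search_bit):
    """Return True when the stored ternary value matches the search bit.

    stored:     one of "0", "1", "X"
    search_bit: 0 or 1
    """
    b1, b2 = ENCODE[stored]
    # Under this assumed encoding, b1 flags "stored 1" and b2 flags "stored 0".
    # A mismatch occurs only when the flagged value opposes the search bit,
    # so "X" (both bits 0) can never report a mismatch.
    mismatch = (b1 == 1 and search_bit == 0) or (b2 == 1 and search_bit == 1)
    return not mismatch
```

With this model, a stored "X" matches both search bits, which is the masked equality search described above.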

Fig. 5 shows an example of the fine-grain power-gating behavior in a 3-bit × 9-word bit-serial TCAM. In the first-bit search operation, the input bit is applied to all TCAM cells on the first bit-slice, and mismatches are detected in three rows of the word circuits. In the second-bit search operation, the input bit is applied to all TCAM cells on the second bit-slice. During this operation, the TCAM cells and SAs of those three rows are kept in the standby mode by power gating, while additional mismatches are detected in two more rows. In the third-bit search operation, the number of circuit blocks in the standby mode increases in the same way. The longer the word length of the proposed TCAM, the more effective the standby-power reduction by fine-grain power gating becomes.
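The search sequence described above can be mimicked with a small behavioral model. The sketch below is not the circuit itself: it only tracks which word circuits stay powered in each cycle. The function name `bit_serial_search`, the stored words, and the key are hypothetical, chosen so that three rows mismatch on the first bit and two more on the second, as in the description of Fig. 5. The actual contents of Fig. 5 are not reproduced here.

```python
def bit_serial_search(words, key):
    """Word-parallel, bit-serial ternary search with per-word power gating.

    words: list of equal-length strings over "0", "1", "X"
    key:   string over "0", "1" of the same length
    Returns (match flags, number of powered word circuits in each cycle).
    """
    acc = [True] * len(words)      # ACC outputs, initialized high by INIT
    powered = [True] * len(words)  # power switches are on when a search starts
    active_per_cycle = []
    for i, k in enumerate(key):    # one bit-slice per cycle
        active_per_cycle.append(sum(powered))
        for w, word in enumerate(words):
            if not powered[w]:
                continue           # gated word circuit: nothing is evaluated
            if word[i] != "X" and word[i] != k:
                acc[w] = False     # mismatch latched in the ACC
                powered[w] = False # SLEEP asserted until the next key
    return acc, active_per_cycle


# Hypothetical 3-bit x 9-word contents and search key.
words = ["001", "010", "0X1", "110", "111", "101", "100", "1X1", "X0X"]
matches, active = bit_serial_search(words, "101")
```

In this example, `active` records 9, 6, and 4 powered word circuits over the three cycles, so the share of gated rows grows as the search proceeds.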

## 4. Evaluations and Conclusions

Table I compares the standby power dissipation of a conventional CMOS-based bit-serial TCAM cell array with that of the proposed one. The CMOS-based TCAM cell array constantly consumes standby power in its volatile storage elements (SRAMs). In contrast, the proposed TCAM cell array suppresses standby power dissipation owing to its nonvolatile storage and fine-grain power management. As a result, the standby power dissipation of the proposed TCAM cell array is estimated to be reduced to 0.012 percent of that of the CMOS-based one, using 45 nm CMOS device parameters, under almost the same switching delay and dynamic power dissipation.

Fig. 1 MTJ device structure and symbol.

Fig. 2 Power gating in a nonvolatile logic-in-memory system.

From these points of view, CMOS/MTJ-hybrid nonvolatile logic-in-memory circuitry with a fine-grain power-gating capability is expected to be one of the most effective methods for overcoming the static power problem.

## Acknowledgements

This work was supported by Research and Development for Next-Generation Information Technology by the Ministry of Education, Culture, Sports, Science and Technology of Japan. The authors wish to thank Kimiyuki Hiyama of Tohoku University for help in discussion.

## References

- [1] T.-C. Chen, IEEE ISSCC, 22–28, 2006.
- [2] C.-H. Hua et al., IEEE MTDT, 129–134, 2005.
- [3] H. Kimura et al., ITC-CSCC, 8C3L-3-1–8C3L-3-4, 2004.
- [4] S. Matsunaga et al., APEX, 1, 9, 091301-1–091301-3, 2008.
- [5] S. Matsunaga et al., APEX, 2, 2, 023004-1–023004-3, 2009.
- [6] D. Kudithipudi et al., IEEE MWSCAS, 783–786, 2007.











Fig. 5 Fine-grain power gating in the proposed bit-serial TCAM.

Table I Comparison of standby power dissipations in 144-bit × 256-word bit-serial TCAM cell arrays.

|               |                                | CMOS<br>(without power gating) | CMOS/MTJ-hybrid<br>(with power gating) |
|---------------|--------------------------------|--------------------------------|----------------------------------------|
| Standby power | Storage elements<br>@TCAM cell | 0.052 W <sup>1)</sup>          | 0 W (@MTJs)                            |
|               | Logic elements<br>@TCAM cell   |                                | 6.356 µW (@MOSs) <sup>2)</sup>         |
|               | Total                          | 0.052 W                        | 6.356 µW                               |

<sup>1)</sup> The average leakage current of 1.4 µA at a power supply of 1.0 V in a 4-transistor-SRAM-based TCAM cell is used [6].

<sup>2)</sup> Gate leakage and sub-threshold leakage currents of the two NMOS transistors in the proposed TCAM cell are considered at 45 nm CMOS from ITRS 2007.
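The figures in Table I can be cross-checked with a short calculation. The sketch below assumes that the CMOS total is simply the per-cell leakage power from footnote 1 multiplied by the number of cells in a 144-bit × 256-word array. This reading is an assumption, but it appears consistent with the reported 0.052 W and with the stated reduction to 0.012 percent. The variable names are illustrative.

```python
# Values taken from Table I and its footnotes.
BITS_PER_WORD = 144
WORDS = 256
I_LEAK_SRAM_CELL = 1.4e-6   # A, average leakage per 4T-SRAM-based TCAM cell
VDD = 1.0                   # V
P_HYBRID_TOTAL = 6.356e-6   # W, total standby power of the proposed array

cells = BITS_PER_WORD * WORDS                # 36864 cells
# Assumption: array standby power = cells x per-cell leakage power.
p_cmos_total = cells * I_LEAK_SRAM_CELL * VDD

# Standby power of the proposed array relative to the CMOS one, in percent.
ratio_percent = 100.0 * P_HYBRID_TOTAL / p_cmos_total

print(f"CMOS standby power:  {p_cmos_total:.4f} W")
print(f"Hybrid / CMOS ratio: {ratio_percent:.3f} %")
```

Under this assumption the CMOS array dissipates about 0.052 W, and the ratio comes out at roughly 0.012 percent.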