# Nonvolatile FPGA Using 2T-1MTJ-Cell-Based Multi-Context Array for Power and Area Efficient Dynamically Reconfigurable Logic

Daisuke Suzuki<sup>1</sup> and Takahiro Hanyu<sup>1, 2</sup>

<sup>1</sup>Center for Innovative Integrated Electronic Systems, Tohoku University, 980-0845, Sendai, Japan

<sup>2</sup>Research Institute of Electrical Communication, Tohoku University, 980-8577, Sendai, Japan

Phone: +81-22-217-5508, Email: show-you@ngc.riec.tohoku.ac.jp

## Abstract

A dynamically reconfigurable nonvolatile field-programmable gate array (FPGA) with a multi-context (MC) cell array structure is proposed using 3-terminal magnetic tunnel junction (MTJ) devices. The use of single-ended circuitry together with a 2-transistor, 1-MTJ (2T-1MTJ) context cell minimizes the area overhead of the context array while providing nonvolatile storage capability. Moreover, because the 2T-1MTJ cell has no power line, the leakage current overhead is also minimized. In fact, the transistor count is reduced by 59% and the leakage power during power-on by 71% compared with an SRAM-based implementation in 40-nm CMOS technology.

## 1. Introduction

An MC-FPGA [1, 2], in which configuration data are switched instantaneously on demand, is one viable candidate for a dynamically reconfigurable logic platform. However, cell-area inefficiency and large power consumption due to leakage current are critical problems in an SRAM-based implementation. To overcome these problems, a nonvolatile MC-FPGA has been presented using c-axis aligned crystalline In-Ga-Zn-O (CAAC-IGZO) devices [3, 4]. However, its configuration memory cell requires a capacitor to retain the stored information, which decreases cell area efficiency. Moreover, because each context cell includes a power line for generating the logic high level, the total leakage current increases with the number of contexts. From this point of view, it is important to minimize both the area and the leakage current overheads to realize power- and area-efficient dynamically reconfigurable logic.

In this paper, a nonvolatile MC-FPGA is proposed that meets these requirements by utilizing 3-terminal MTJ devices [5] together with single-ended circuitry [6, 7].

## 2. 2T-1MTJ-Cell-Based Nonvolatile MC-FPGA

Figure 1 (a) shows the overall structure of the proposed MC-FPGA, which comprises an array of tiles. Since power switches are placed in each tile, all idle tiles can be turned off and wasted leakage current can be eliminated. Figure 1 (b) shows the cross-sectional view and the schematic of the proposed 2T-1MTJ context cell. $M_{CS}$ is used for context selection and $M_{WR}$ is used for writing. Since the 3-terminal MTJ device is stacked over the two transistors, it does not affect the effective area. The 3-terminal MTJ device stores binary data Y as $R_P$ (Y = 0) or $R_{AP}$ (Y = 1). Configuration data are written by the bidirectional write current $I_{WR}$, independent of the MTJ resistance [6].

Figure 2 shows the block diagram of the proposed tile. The configurable logic block (CLB) is composed of CLB slices, each of which comprises a multiplexer (MUX) tree, configuration memories (CMs), and a K-input logic element (LE). The routing block (RB) is composed of an array of routing switches, each of which includes a 1-bit CM with an NMOS pass gate. Power switches are placed in each CLB slice and in the RB. The CM is also used to store whether a function block is turned off [4, 5]. Run-time power gating can also be supported in the proposed tile by driving the PG signal high.

As shown in Fig. 2, the MC-CM and the MC-LE are the fundamental components of the MC-FPGA, so their design is important. Figure 3 and Fig. 4 show the proposed MC-CM and MC-LE, respectively. By utilizing single-ended voltage sensing [6, 7], configuration data can be recalled from just one 3-terminal MTJ device, which minimizes the cell area overhead. Since inactive context arrays are electrically separated from the read current path, they can be instantaneously reconfigured during logic operation using a bit-parallel write scheme [7].
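As a rough illustration of single-ended sensing, the following Python sketch models the read path as a voltage divider between one MTJ and a load resistor. The divider topology, the load value `R_LOAD`, and the mid-supply threshold `V_TH` are assumptions made for illustration only and are not taken from the circuits in Fig. 3 or Fig. 4; the resistance and supply values are those used in the evaluation.

```python
# Hypothetical single-ended read model: the MTJ and an assumed load resistor
# form a voltage divider, and the divider output is compared to a threshold.
V_DD = 1.1        # supply voltage in volts (Table I, footnote *1)
R_P = 8e3         # parallel-state resistance in ohms, stores Y = 0
R_AP = 32e3       # antiparallel-state resistance in ohms, stores Y = 1
R_LOAD = 16e3     # ASSUMPTION: load resistance, geometric mean of R_P and R_AP
V_TH = V_DD / 2   # ASSUMPTION: sensing threshold at mid-supply


def sense_voltage(r_mtj: float) -> float:
    """Voltage across the MTJ in the assumed divider."""
    return V_DD * r_mtj / (r_mtj + R_LOAD)


def recall_bit(r_mtj: float) -> int:
    """Recall the stored bit Y from a single MTJ resistance."""
    return 1 if sense_voltage(r_mtj) > V_TH else 0


for label, r in (("R_P", R_P), ("R_AP", R_AP)):
    print(f"{label}: V = {sense_voltage(r):.3f} V, Y = {recall_bit(r)}")
```

Under these assumptions the two resistance states fall on opposite sides of the threshold, so a single device suffices to recall one configuration bit without a complementary reference cell.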

## 3. Evaluation

For the evaluation, a CLB slice with 4 contexts is designed using 40-nm CMOS technology. $R_P$ and $R_{AP}$ are set to 8 k$\Omega$ and 32 k$\Omega$, respectively. Figure 5 shows the basic behavior of the proposed 6-input MC-LE, in which XOR, XNOR, AND, and NAND functions are stored. We can confirm that these contexts are immediately switched in accordance with CS<sub>0</sub>, CS<sub>1</sub>, CS<sub>2</sub>, and CS<sub>3</sub>.
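The context-switching behavior can be sketched functionally in Python as a 6-input lookup table with one truth table per context. This is a behavioral model only, not the circuit of Fig. 4; the assignment of functions to contexts, the one-hot interpretation of the context-select signals, and the names `truth_table` and `mc_le` are assumptions introduced for illustration.

```python
from itertools import product

K = 6  # number of LE inputs, as in the evaluated MC-LE


def truth_table(func):
    """Return the 2**K configuration bits for a K-input Boolean function."""
    return [func(bits) & 1 for bits in product((0, 1), repeat=K)]


# One truth table per context; the context order is an ASSUMPTION.
CONTEXTS = [
    truth_table(lambda b: sum(b) % 2),        # context 0: XOR (odd parity)
    truth_table(lambda b: 1 - sum(b) % 2),    # context 1: XNOR
    truth_table(lambda b: int(all(b))),       # context 2: AND
    truth_table(lambda b: 1 - int(all(b))),   # context 3: NAND
]


def mc_le(cs, inputs):
    """Evaluate the LE; cs is a one-hot tuple (CS0..CS3), inputs has K bits."""
    if len(cs) != len(CONTEXTS) or sum(cs) != 1:
        raise ValueError("exactly one context-select line must be high")
    if len(inputs) != K:
        raise ValueError(f"expected {K} input bits")
    context = cs.index(1)
    # inputs[0] is treated as the most significant address bit,
    # matching the ordering produced by itertools.product above.
    address = int("".join(str(b) for b in inputs), 2)
    return CONTEXTS[context][address]
```

For example, `mc_le((0, 0, 1, 0), (1, 1, 1, 1, 1, 1))` selects the AND context, and changing only the context-select tuple switches the same inputs to a different function without rewriting any stored bits.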

Table I compares the performance of three MC-CLB slices: an SRAM-based one, a CAAC-IGZO-based one, and the proposed one. While the CAAC-IGZO-based implementation reduces the transistor count compared with the SRAM-based one, it requires a large number of capacitors. In contrast, the effective area of the proposed implementation is determined by the MOS transistors alone, since the MTJ devices are stacked over the CMOS plane. Moreover, its transistor count is the smallest among the three while maintaining comparable performance. Figure 6 summarizes the standby power overhead as the number of contexts increases. In contrast to the other implementations, the proposed CLB slice exhibits the smallest standby power overhead since the 2T-1MTJ-based context cell has no power lines. The leakage power is completely eliminated by turning off the power switch.
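The power-delay products listed in Table I can be re-derived from the active power and delay entries, as in the short Python sketch below. The values are copied from the table; because those entries are already rounded, the recomputed products may differ from the tabulated ones in the last digit. The function names are hypothetical.

```python
# Entries copied from Table I (4-context MC-CLB slices).
TABLE_I = {
    "SRAM": {"power_uW": 28.5, "delay_ns": 0.53},
    "CAAC-IGZO": {"power_uW": 28.2, "delay_ns": 0.57},
    "This work": {"power_uW": 19.1, "delay_ns": 0.59},
}


def power_delay_product(name: str) -> float:
    """Active power times worst-case delay, in uW*ns (1 uW*ns = 1 fJ)."""
    entry = TABLE_I[name]
    return entry["power_uW"] * entry["delay_ns"]


def pdp_relative_to_sram(name: str) -> float:
    """Power-delay product normalized to the SRAM-based slice."""
    return power_delay_product(name) / power_delay_product("SRAM")


for name in TABLE_I:
    print(f"{name}: {power_delay_product(name):.2f} uW*ns, "
          f"{pdp_relative_to_sram(name):.2f}x SRAM")
```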

## Acknowledgements

This work is supported by the JSPS FIRST program, JSPS KAKENHI (25870067), and VDEC.

## References

- [1] S. Trimberger, et al., Proc. IEEE Symp. FCCM, 1997, p. 22.
- [2] N. Miyamoto, et al., Proc. ASSCC, 2008, p. 89.
- [3] T. Aoki, et al., ISSCC Dig. Tech. Pap., 2014, p. 502.
- [4] M. Kozuma, et al., JJAP **53** (2014) 04EE12.
- [5] S. Fukami, et al., IEEE Trans. Magn. **48** (2012) 2152.
- [6] D. Suzuki, et al., JJAP **52** (2013) 04CM04.
- [7] D. Suzuki, et al., JJAP **53** (2014) 04EM03.



Figure 1: Proposed MC-FPGA: (a) Overall structure. (b) 2T-1MTJ-based context cell.



Figure 2: Block diagram of the proposed tile.



Figure 3: Schematic of the proposed 2-bit MC-CM.



Figure 4: Schematic of the proposed 6-input MC-LE.



Figure 5: Basic behavior of the proposed 6-input MC-LE.

Table I: Performance comparison of MC-CLB slices (4 contexts).

|                                          | SRAM | CAAC-IGZO [3,4] <sup>*5)</sup>                  | This work <sup>*5)</sup> |
|------------------------------------------|------|-------------------------------------------------|--------------------------|
| Active power <sup>*1, 2)</sup> [µW]      | 28.5 | 28.2 <sup>*4)</sup>                             | 19.1                     |
| Delay <sup>*1, 3)</sup> [ns]             | 0.53 | 0.57 <sup>*4)</sup>                             | 0.59                     |
| Power-delay product [µW·ns]              | 15.2 | 16.1                                            | 11.3                     |
| Device counts: transistors               | 2884 | 1522                                            | 1276                     |
| Device counts: NV storage                | N/A  | 716 IGZOs / 358 capacitors <sup>*6)</sup>       | 365 MTJs                 |

(\*1) HSPICE simulation under 40-nm CMOS technology ($V_{DD}$ = 1.1 V).

(\*2) Total average power to perform logic operations (frequency: 800 MHz).

(\*3) Worst-case delay of the output node (Z).

(\*4) The stored value of each context cell is set with the HSPICE .ic command.

(\*5) Power switches of the same size are used.

(\*6) The capacitance is 10 fF.



Figure 6: Leakage power comparison of CLB slices.