# Design of an MTJ-Oriented Nonvolatile Lookup Table Circuit with Write-Operation Minimizing

Daisuke Suzuki<sup>1</sup> and Takahiro Hanyu<sup>2</sup>

<sup>1</sup> Frontier Research Institute for Interdisciplinary Sciences, Tohoku University, Sendai, 980-8578 Japan. <sup>2</sup> Research Institute for Electrical Communications, Tohoku University, Sendai, 980-8577 Japan.

Phone: +81-22-217-5508, E-mail: daisuke.suzuki.e6@tohoku.ac.jp

#### Abstract

A nonvolatile lookup table (LUT) circuit with a shift-register (SR) function and a distributed memory (DM) function is proposed using a magnetic tunnel junction (MTJ) oriented circuit design. By incrementing the address of the configuration memory in the LUT circuit, the state of the MTJ device is serially read and written, which realizes the SR function with a minimum number of write accesses. Moreover, since the decoder for the LUT function is also used for both the SR and DM functions, the hardware overhead is quite small. In fact, the transistor count of the proposed LUT circuit is reduced by 60% compared with that of the conventional SRAM-based implementation, with the same degree of power consumption for the SR function.

## 1. Introduction

Although field-programmable gate arrays (FPGAs) are currently used in a wide range of applications, standby power consumption is a critical issue for battery-powered or energy-harvesting applications [1]. A nonvolatile FPGA [2-4] is a promising solution to the standby-power problem. In particular, the magnetic tunnel junction (MTJ) device is a viable candidate owing to its 3D-stacking capability, CMOS compatibility, and virtually unlimited endurance.

Meanwhile, there are two design issues for an MTJ-based lookup table (LUT) circuit, which is the fundamental component of the FPGA. One is the large area overhead of the sense amplifier (SA), which is caused by the small difference in MTJ resistance. To reduce this overhead, we have proposed a logic-in-memory (LIM) structure. By utilizing the LIM structure, the LUT circuit is implemented with 62% fewer transistors than the conventional circuit [5]. The other important issue is the large power consumption of the shift-register (SR) function, which arises from CMOS-oriented circuitry. In this paper, a nonvolatile LUT circuit using an MTJ-oriented design is proposed to minimize write accesses in the SR function.

## 2. Proposed Nonvolatile LUT Circuit

Figure 1 (a) shows an SRAM-based K-input LUT circuit which has a distributed memory (DM) function and the SR function. These functions are embedded to enhance the operation efficiency of the FPGA [6]. A truth table for the K-input logic function, or the input of the DM, is stored in the $2^{K}$ configuration memory cells (M[0], M[1], ..., M[$2^{K}-1$]) by using a decoder. An additional path for the SR function is added to each configuration memory cell. The external shift-register input (D<sub>M</sub>) propagates through the cells when a clock signal (PHI) is applied, and the shift-register output (OUT) serially appears via an NMOS tree. Therefore, a shift register of arbitrary length can be implemented by changing the external logic input X.
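The SRAM-based shift operation described above can be sketched behaviourally as follows. This is only an illustrative model, not the circuit itself: the class name `SramShiftLut`, the shift direction (M[0] receives the external input), and the counter `cell_updates` are assumptions introduced here. The model reflects that every cell in the chain is updated on each clock, while the logic input X only selects which cell is observed.

```python
class SramShiftLut:
    """Behavioural sketch of a K-input SRAM-based LUT in shift-register mode."""

    def __init__(self, k):
        self.k = k
        self.cells = [0] * (2 ** k)   # M[0] .. M[2^K - 1], all cleared
        self.cell_updates = 0         # hypothetical counter of cell updates

    def clock(self, d_in):
        # One PHI pulse: each cell takes its neighbour's value and
        # M[0] takes the external input (assumed shift direction).
        self.cells = [d_in] + self.cells[:-1]
        self.cell_updates += len(self.cells)

    def out(self, x):
        # The NMOS tree selects M[x], so x sets the effective delay.
        return self.cells[x]


lut = SramShiftLut(k=2)
stream = [1, 0, 1, 1, 0, 0, 1]
outputs = []
for bit in stream:
    lut.clock(bit)
    outputs.append(lut.out(2))   # observe the third cell

print(outputs)            # input delayed by two clocks, zero-filled
print(lut.cell_updates)   # 2^K updates per clock
```

In this sketch the selected tap changes only the observed delay; the number of cell updates per clock stays at $2^{K}$ regardless of X.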

In this paper, a three-terminal MTJ (3T-MTJ) device, such as a spin-orbit torque device [7], is used for the LUT circuit design. Note that the proposed circuitry is also applicable to a two-terminal MTJ device. Figure 2 shows the symbol of the 3T-MTJ device. Binary data is written as a resistance value ($R_L$ or $R_H$) by applying a bidirectional write current $I_{WR}$ between T2 and T3. The data is read by applying a read current $I_{RD}$ between T1 and T3.
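As a rough illustration of the separate read and write paths, the 3T-MTJ can be modelled as a two-state resistor. Everything numeric here is a placeholder: the resistance values, the read current, the convention that a positive $I_{WR}$ sets $R_H$, and the mapping of $R_H$ to logic 1 are assumptions made for this sketch and are not taken from the device parameters used in the evaluation.

```python
from dataclasses import dataclass

R_L = 1.0e3   # hypothetical low-resistance state (ohms)
R_H = 2.0e3   # hypothetical high-resistance state (ohms)


@dataclass
class ThreeTerminalMtj:
    resistance: float = R_L
    writes: int = 0

    def write(self, i_wr):
        # Write path T2-T3: the current direction selects the state.
        # Assumed convention: positive current -> R_H, negative -> R_L.
        if i_wr == 0:
            raise ValueError("write current must be non-zero")
        self.resistance = R_H if i_wr > 0 else R_L
        self.writes += 1

    def read_voltage(self, i_rd=1.0e-6):
        # Read path T1-T3: the sensed voltage is proportional to the
        # stored resistance; reading does not change the state.
        return i_rd * self.resistance

    def bit(self):
        # Assumed encoding: R_H represents logic 1.
        return 1 if self.resistance == R_H else 0


mtj = ThreeTerminalMtj()
mtj.write(+1.0e-4)
print(mtj.bit(), mtj.read_voltage())
mtj.write(-1.0e-4)
print(mtj.bit(), mtj.writes)
```

Because the read and write currents use different terminal pairs, the two paths can be sized independently, which is the property the circuit design relies on.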

Figure 3 shows the schematic of the proposed K-input LUT circuit, which is implemented based on single-ended circuitry [7]. The SA and the write driver (WD) are shared among the 3T-MTJ devices. The read access and the write access are individually controlled by the two transistors in the configuration memory cell. By utilizing 3T-MTJ devices, the read and write properties are individually optimized, which makes it possible to enhance the circuit performance [7]. The decoder is used for both the read operation and the write operation.

The proposed LUT circuit has three modes: the LUT mode, the DM mode, and the SR mode. During the LUT mode, SHIFT is set low; the configuration data (D<sub>IN</sub>) is written to the MTJ device in accordance with the address input (ADRS) when WE=1, and the logic operation is performed by a 6-bit logic input (X) when RE=1. During the DM mode, the same control signals as in the LUT mode are applied and $D_{IN}$ is used as the RAM input. During the SR mode, SHIFT is set high and the shift operation is performed using an external logic input (D<sub>MIN</sub>) and a bit-select signal (Y).

Figure 4 shows the SR mode of the proposed 2-input LUT circuit. At each cycle, the state of the corresponding 3T-MTJ device is read by the SA, and then D<sub>MIN</sub> is written to the same 3T-MTJ device. Then, the address is incremented, and the state of the next MTJ device is read and updated in the same way. In this way, an SR function equivalent to that of the SRAM-based LUT is realized with only one write access per shift. Thus, there is no significant power overhead due to the MTJ write access. Moreover, the hardware overhead is also small because the decoder for the LUT function and the DM function can also be used for the SR function.
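The address-incrementing shift operation can be sketched as a read-then-overwrite on a circular address counter. This is a minimal behavioural model under stated assumptions: the class `MtjLut`, its method names, and the `writes` counter are hypothetical, the address is assumed to wrap over all $2^{K}$ cells, and the role of the bit-select signal Y is not modelled.

```python
class MtjLut:
    """Behavioural sketch of the three modes of a K-input MTJ-based LUT."""

    def __init__(self, k):
        self.k = k
        self.mtj = [0] * (2 ** k)   # one stored bit per 3T-MTJ device
        self.addr = 0               # address counter used in SR mode
        self.writes = 0             # hypothetical count of MTJ write accesses

    def write_config(self, adrs, d_in):
        # LUT/DM mode with WE=1: the decoder selects one device to write.
        self.mtj[adrs] = d_in
        self.writes += 1

    def evaluate(self, x):
        # LUT/DM mode with RE=1: the same decoder selects the device to read.
        return self.mtj[x]

    def shift(self, d_min):
        # SR mode, one cycle: read the addressed device, overwrite it
        # with the new input bit, then increment the address.
        out = self.mtj[self.addr]
        self.mtj[self.addr] = d_min
        self.writes += 1
        self.addr = (self.addr + 1) % len(self.mtj)   # assumed wrap-around
        return out


lut = MtjLut(k=2)
stream = [1, 0, 1, 1, 0, 0, 1]
outputs = [lut.shift(bit) for bit in stream]

print(outputs)      # each input bit reappears 2^K cycles later
print(lut.writes)   # one MTJ write per shift cycle
```

In this model the stored data never moves between devices; only the address advances, so the write count grows by one per shift cycle regardless of the LUT size.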

## 3. Evaluation

For the evaluation, the proposed LUT circuit is designed using a 90-nm CMOS technology together with an MTJ device model [8] whose parameters are shown in Table I. Figure 5 compares the transistor counts of the LUT circuits. The reduction in transistor count of the proposed LUT circuit relative to the SRAM-based one increases with the number of inputs because the configuration memory cell is composed of only two transistors. Moreover, no additional decoder is required for the SR function. Figure 6 compares the power consumption during the 1-bit shift operation of the 6-input LUT circuit. By utilizing the proposed method, the number of MTJ write accesses is minimized, resulting in the same degree of power consumption as the SRAM-based LUT circuit.

## 4. Conclusions

A compact nonvolatile LUT circuit with minimized MTJ write access has been proposed. By utilizing address incrementing, the number of MTJ write accesses during the SR operation is minimized with small hardware overhead.

#### Acknowledgements

This research is supported by ImPACT of CSTI and CIES consortium program.

#### References

- [1] F.-Li Yuan, et al., IEEE J. Solid-State Circuits 50 (2015) 137.
- [2] Y. Y. Liauw, et al., ISSCC Dig. Tech. Pap., (2012), p. 406.
- [3] Y. Tsuji, et al., VLSI Tech. Dig. Tech. Pap., (2012), p. 86.
- [4] D. Suzuki, et al., VLSI Circuits Dig. Tech. Pap., (2015), p. 172.



Fig. 1: SRAM-based LUT circuit with shift-register function: (a) configuration memory, (b) overall structure.



Fig. 2: 3T-MTJ device.



Fig. 3: Proposed nonvolatile LUT circuit.

- [5] D. Suzuki, et al., Jpn. J. Appl. Phys. 52 (2013) 04CM04.
- [6] US Patent, 5889413, 1999.
- [7] S. Fukami, et al., Nature Mater. 15 (2016) 535.
- [8] N. Sakimura, et al., IEEE ISCAS (2012), p. 2012.



Fig. 4: Shift-register function in the proposed 2-input LUT circuit based on address incrementing.







Fig. 6: Comparison of power consumption in SR operation.