# A power-gated 32bit MPU with a power controller circuit activated by deep-sleep-mode instruction achieving ultra-low power operation

H. Koike<sup>1,4</sup>, T. Ohsawa<sup>1,4</sup>, S. Miura<sup>1,5</sup>, H. Honjo<sup>1,5</sup>, K. Kinoshita<sup>5</sup>, S. Ikeda<sup>1,2</sup>,

T. Hanyu<sup>1,2</sup>, H. Ohno<sup>1,2</sup>, and T. Endoh<sup>1,3,4</sup>

<sup>1</sup> Center for Innovative Integrated Electronic Systems, <sup>2</sup> Research Institute of Electrical Communication,

<sup>3</sup> Graduate School of Engineering, Tohoku Univ., <sup>4</sup> JST, ACCEL, <sup>5</sup> Green Platform Research Labs, NEC Corp.

468-1 Aramaki aza Aoba, Aoba, Sendai, Miyagi 980-8579, Japan,

Phone: +81-22-796-3410 E-mail: tetsuo.endoh@cies.tohoku.ac.jp

#### Abstract

A spintronics-based power-gated MPU is proposed. It includes a power controller circuit that is activated by a newly supported power-off instruction for deep-sleep-mode. Together, these allow the power-off procedure of the MPU to be executed correctly. A prototype fabricated in a 90 nm CMOS and 100 nm MTJ process operated successfully. Measurement results show that, at an operation duty (active cycles / total cycles) of 10%, the operation energy is reduced to 1/28 of that of a non-power-gated MPU.

#### 1. Introduction

To address the increasing standby power of LSIs in future scaled CMOS, power-gating techniques using nonvolatile devices are being actively researched [1-2]. The perpendicular magnetic tunnel junction (MTJ) is a promising nonvolatile device owing to its fast switching, high endurance, and good scalability [3]. In the parallel processing techniques used in recent MPUs, shown in Fig. 1, some MPUs or processor elements (PEs) will be idle because a 100% operation ratio is impractical. Such idle units should be fully powered off (placed in "deep-sleep-mode"), and we have reported that a power-gated MPU using nonvolatile flip-flops (NV-FFs) is effective for this purpose [4]. It can overcome the performance degradation caused by the entry/exit delay of the power on/off sequence. However, the power-off/on procedure must be controlled so that power gating operates correctly. This paper proposes the function and the power control circuit required for such power-gated MPUs, and evaluates their effectiveness through test chip fabrication and measurements.

#### 2. Proposed power-gated MPU with deep-sleep-mode controlled by power-off instruction

Controlling power gating by software has the advantage that programmers can manage it themselves, which makes it possible to optimize power control for each application. To realize this, it is necessary to define an instruction corresponding to power gating (hereafter, the "poff" instruction) that can be specified in assembler programs. We propose a new power-gated MPU, shown in Fig. 2. This MPU includes the poff instruction and a power control circuit suited to the power-gated MPU [4]. The instruction set of the MPU is listed in Table I.

Fig. 3 shows the timing chart for the proposed power on/off sequence of the MPU. In a nonvolatile MPU, there are two ways of saving data to nonvolatile devices: (1) saving data in every operation cycle, and (2) saving the data of all flip-flops just before power-off. Method (1) has the merit that the exit delay is minimal because power can be cut instantly on demand, but the demerit that write operations to the nonvolatile devices occur frequently. In contrast, method (2) requires a rather complicated procedure, but keeps write operations to the nonvolatile devices to a minimum. Here, we adopt method (1) for the instruction/data memories and register files, on the assumption that SRAM-type STT-MRAM [5] is used, and method (2) for the program counter (PC) and the pipeline register (PR). Thus the memories store nonvolatile data during the "Normal operation" period, while the flip-flops do so during the "NV-FF data save" period in Fig. 3. This achieves an optimized combination of power-off delay and MTJ write frequency, because the PC and PR are accessed at every clock cycle while the memories are not. Special care is needed so that the power controller operates correctly when the power-gated supply line (VDD\_PG) is cut off.
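The sequence described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' controller: the class `PowerGatedMpuModel`, its methods, and the phase names "Power off" and "NV-FF data load" are hypothetical (only "Normal operation" and "NV-FF data save" are named in the text). It counts MTJ write events so that the two saving methods can be compared.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    NORMAL = "Normal operation"      # period named in the text
    NVFF_SAVE = "NV-FF data save"    # period named in the text
    POWER_OFF = "Power off"          # assumed name: VDD_PG is cut off
    NVFF_LOAD = "NV-FF data load"    # assumed name: restore after power-on


@dataclass
class PowerGatedMpuModel:
    """Hypothetical behavioral model of the power on/off sequence."""
    phase: Phase = Phase.NORMAL
    cycles: int = 0              # clock cycles spent in normal operation
    memory_mtj_writes: int = 0   # method (1): memory writes during operation
    ff_save_events: int = 0      # method (2): one batch save of PC/PR per poff
    trace: list = field(default_factory=lambda: [Phase.NORMAL])

    def _enter(self, phase):
        self.phase = phase
        self.trace.append(phase)

    def run(self, cycles, stores=0):
        """Execute `cycles` clock cycles, `stores` of which write to memory."""
        if self.phase is not Phase.NORMAL:
            raise RuntimeError("MPU is not in normal operation")
        self.cycles += cycles
        self.memory_mtj_writes += stores

    def poff(self):
        """Model the poff instruction: save PC/PR once, then cut power."""
        if self.phase is not Phase.NORMAL:
            raise RuntimeError("poff is only valid in normal operation")
        self._enter(Phase.NVFF_SAVE)
        self.ff_save_events += 1
        self._enter(Phase.POWER_OFF)

    def wake(self):
        """Model power-on: restore PC/PR, then resume normal operation."""
        if self.phase is not Phase.POWER_OFF:
            raise RuntimeError("wake is only valid while powered off")
        self._enter(Phase.NVFF_LOAD)
        self._enter(Phase.NORMAL)
```

In this model, running 1000 cycles followed by one `poff()` gives `ff_save_events == 1`, whereas saving the PC and PR in every cycle (method (1)) would have required one save per cycle, that is, 1000 saves.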

#### 3. Chip fabrication and measurements

Fig. 4 shows a photo of the MPU prototype chip, which was designed and fabricated using a 90 nm CMOS and 100 nm MTJ process. The power control circuit is laid out around the MPU main body, providing sufficiently large PMOS switches. The non-power-gated supply line (VDD\_NP) encloses the power controller. The measured operating waveforms are shown in Fig. 5, which demonstrates successful power-off operation after the poff\_signal is received.

The effect of this MPU is estimated using the measurement results. Fig. 6 compares (a) $E_P$, the data retention energy for the power-gated MPU (nonvolatile data saving and loading), and (b) $E_N$, the energy for the non-power-gated MPU (volatile data retention). Both $E_P$ and $E_N$ are shown as a function of standby time, expressed as the number of idle clock cycles. $E_P$ is constant with respect to the number of idle clock cycles, because the MPU is powered down after data saving and no power is dissipated except for saving data to and loading data from the MTJs. Its value does, however, depend on the switching characteristics of the MTJ; in Fig. 6, the write pulse width is 5 ns. The crossover point between $E_P$ and $E_N$ indicates the number of idle clock cycles beyond which the power-gated MPU becomes more energy-efficient than the non-power-gated MPU. The crossover point is about 750 idle clock cycles, as indicated in Fig. 6. Fig. 7 shows the energy reduction ratio (= energy for the power-gated MPU / energy for the non-power-gated MPU) against the operation duty (= active cycles / total cycles). At a 10% duty, the power-gated MPU reduces the energy to about 1/28.
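The break-even reasoning can be written down as a short calculation. The sketch below assumes that $E_N$ grows linearly with the number of idle clock cycles (the text only states that it is a function of standby time) and uses normalized energy units in which one idle cycle of the non-power-gated MPU costs 1, so that $E_P = 750$ matches the crossover point read from Fig. 6. The function names and default parameters are hypothetical.

```python
def non_gated_idle_energy(idle_cycles, e_idle_per_cycle=1.0):
    """E_N: energy to retain volatile data, assumed linear in idle cycles."""
    return idle_cycles * e_idle_per_cycle


def break_even_cycles(e_p, e_idle_per_cycle=1.0):
    """Number of idle cycles at which E_N equals the constant E_P."""
    return e_p / e_idle_per_cycle


def should_power_off(idle_cycles, e_p=750.0, e_idle_per_cycle=1.0):
    """True if power gating costs less energy than staying powered on.

    e_p = 750 in normalized units reflects the crossover of about
    750 idle cycles; it is not a measured energy in joules.
    """
    return non_gated_idle_energy(idle_cycles, e_idle_per_cycle) > e_p
```

Under these assumptions, `should_power_off(1000)` is true and `should_power_off(500)` is false: issuing the power-off instruction only pays off when the expected idle period exceeds the crossover point.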

#### 4. Conclusion

A power-gated MPU with a power control circuit and a power-off instruction was proposed. Its energy reduction effect was evaluated from chip measurement results, which show a significant energy reduction to 1/28 of that of a non-power-gated MPU when the operation duty is 10%.



Fig. 1. Examples of typical parallel processing methods in recent years: (a) Multi-core processor, (b) Single instruction multiple data (SIMD).

#### Acknowledgement

This work was supported in part by JSPS through the FIRST Program, and by JST, ACCEL.

#### References

- [1] K. Nomura et al., J. Appl. Phys., 111, 07E330 (2012).
- [2] M. Natsui et al., ISSCC, pp. 194-195, Feb. 2013.
- [3] S. Ikeda et al., Nature Materials, 9, pp. 721-724 (2010).
- [4] H. Koike et al., A-SSCC, pp. 317-320, Nov. 2013.
- [5] T. Ohsawa et al., IEEE J. Solid-State Circuits, vol. 48, no. 6, pp. 1511-1520, Jun. 2013.

Table I. Instruction codes. The code of poff has reserved bits for further extension of the function.

| Inst. | Code bit: 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 |
|-------|------------------------------------------------------------------|
| add   | 0 0 0 0 0 0 0 0 0 0 0 0 0 0                                      |
| sub   | 0 0 0 0 0 0 0 0 0 1 0 0 1 0                                      |
| and   | 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0                                    |
| or    | 0 0 0 0 0 0 0 0 1 0 1 0 1                                        |
| nor   | 0 0 0 0 0 0 0 0 0 0                                              |
| slt   | 0 0 0 0 0 0 0 0 0 0                                              |
| lw    | 1 0 0 0 1 1 …                                                    |
| sw    | 1 0 1 0 1 1 <- RS -> <- RT -> <- Data memory address ->          |
| beq   | 0 0 0 1 0 0 …                                                    |
| bne   | 0 0 0 1 0 1 …                                                    |
| nop   | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0                          |
| halt  | 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0                          |
| poff  | 1 1 1 1 1 0 …                                                    |







Fig. 3. Timing chart for the power-gating operation of the MPU.



Fig. 5. Operating waveforms.



Fig. 6. Data retention energy comparison.

