# A Compact Bell-Shaped Analog Matching Cell Module for Digital-Memory-Based Associative Processors

Trong Tu Bui and Tadashi Shibata

Department of Frontier Informatics, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba, 277-8561, Japan

Phone: +81-4-7136-3853, Fax: +81-4-7136-3855, email: tubui@else.k.u-tokyo.ac.jp, shibata@ee.k.u-tokyo.ac.jp

## 1. Introduction

Following Moore's law, the number of transistors on a chip will soon reach $10^{10}$, a number equivalent to the total number of neurons in the neocortex of the human brain. This will certainly increase the capacity of digital memory chips. However, it is not easy to make digital CPUs suited to human-like flexible computation, or so-called soft computing.

It has been shown that associative processors can serve as the basis of soft computing, and examples of flexible image perception have been demonstrated using analog as well as digital associative processors [1, 2]. In analog associative processors, bell-shaped I-V characteristics, or resonance characteristics, have been used as the basis for building matching cells. This is because such resonance characteristics can represent the correlation between the input data and the template data, in the sense that the output current peaks when the input voltage coincides with the peak voltage. The resonance characteristics of single-electron transistors have been used to build an associative processor for color classification [3]. Since resonance characteristics are typical of the nonlinear characteristics observed in quantum devices, such associative processors would be one of the most promising system applications in the era of nano devices.

The purpose of this study is to develop a compact matching cell with resonance characteristics using only NMOS transistors, and to build an analog matching cell module that can be integrated with digital memories for compact implementation of associative processors. In addition, a calibration scheme that compensates for the fluctuation due to device mismatches has been developed.

In the rest of the paper, the system architecture is described first, and then the circuits used in the prototype chip design are presented along with the simulation results. It is also shown that a digital memory chip can be converted into an intelligent associative processor by sacrificing only about 15% of the chip real estate for the matching cell module.

## 2. System architecture

## Proposed architecture

The block diagram of the associative processor shown in Fig. 1 consists of two main parts: the digital memory module and the proposed analog matching cell module. The memory module is used to store template data representing past experience or knowledge. The similarity evaluation between the input data and the template data is carried out in parallel by matching circuits in the matching cell module. All data are represented as 64-dimensional vectors [2]. The winner-take-all (WTA) circuit [1, 5] determines the maximum-likelihood template vector and identifies the vector location. Serial digital-to-analog converters (SDACs) are used to convert digital values to analog voltages prior to similarity evaluation.
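The serial D/A conversion mentioned above can be illustrated with a small behavioral model. This is only a sketch: it assumes the converter follows the textbook two-capacitor charge-redistribution scheme with equal capacitors and ideal switches, which the text does not spell out. The names `serial_dac` and `to_bits_lsb_first` are hypothetical.

```python
def to_bits_lsb_first(value, n_bits):
    """Split an unsigned integer into n_bits bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(n_bits)]


def serial_dac(bits_lsb_first, v_ref=1.0):
    """Behavioral model of a two-capacitor serial DAC.

    Assumption: C1 and C2 are equal, so connecting them averages their voltages.
    """
    v2 = 0.0  # voltage on the accumulating capacitor, reset before conversion
    for bit in bits_lsb_first:
        v1 = v_ref if bit else 0.0  # charge C1 to V_ref for a 1, discharge it for a 0
        v2 = (v1 + v2) / 2.0        # share charge between the two equal capacitors
    return v2


# After n bits the output equals v_ref * code / 2**n.
for code in (0, 1, 128, 255):
    print(code, serial_dac(to_bits_lsb_first(code, 8)))
```

Feeding the bits least-significant first means each earlier bit is halved once more per later step, which is what gives the binary weighting with only two capacitors.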





A one-vector matching circuit consists of 64 matching cells (MCs), each of which determines the similarity between one element of the input vector and the corresponding element of the template vector. The matching cell circuit shown in Fig. 2 has bell-shaped *I-V* characteristics and is composed of only eight NMOS transistors. As a result, a very compact matching cell has been realized. The result of the evaluation from each matching cell is given as an output current ($I_{out}$). A larger current indicates greater similarity. The total matching score between the input and the template vector is obtained by taking the wired sum of the $I_{out}$ currents from all element matching cells, as shown in Fig. 1.
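The matching and winner selection described above can be sketched behaviorally as follows. The Gaussian curve is only a stand-in for the bell-shaped characteristic (the actual cell response is given by the circuit in Fig. 2), the `width` and `i_peak` values are arbitrary, and the demo uses 4-element vectors instead of 64 for brevity. All function names are hypothetical.

```python
import math


def cell_current(v_in, v_template, i_peak=1.0, width=0.1):
    """Bell-shaped cell response; a Gaussian is assumed purely for illustration."""
    return i_peak * math.exp(-((v_in - v_template) ** 2) / (2.0 * width ** 2))


def matching_score(input_vec, template_vec):
    """Total score of one template: the sum of its element cell currents."""
    return sum(cell_current(x, t) for x, t in zip(input_vec, template_vec))


def winner_take_all(input_vec, templates):
    """Return (index, score) of the template with the largest total current."""
    scores = [matching_score(input_vec, t) for t in templates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


templates = [
    [0.2, 0.4, 0.6, 0.8],
    [0.5, 0.5, 0.5, 0.5],
    [0.8, 0.6, 0.4, 0.2],
]
index, score = winner_take_all([0.22, 0.41, 0.58, 0.79], templates)
print("winner:", index, "score:", round(score, 3))
```

In the chip all templates are evaluated in parallel and the sum is formed by wiring the output currents together; the loop here only mimics that result.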

The peak height of the output current $I_{out}$ is programmable by varying $V_{ref}$, the voltage connected to the floating gates. The larger $V_{ref}$ is, the larger the peak current becomes. If $V_{ref}$ is set sufficiently low, the transistors operate in the subthreshold regime, which offers the opportunity for very low-power operation. Fig. 3 shows the characteristics of the matching cell obtained by SPICE simulation.

## Serial digital-to-analog converter (SDAC)

The SDAC (shown as D/A in Fig. 1) is elegant in its simplicity: it requires only two capacitors and a few switches. Fig. 4 shows the basic configuration of the SDAC. Due to its small size, the SDAC is a much better choice than other DACs for the proposed architecture.

## Calibration circuitry

Due to variations in the manufacturing process, device parameters vary from one device to another. These variations affect the behavior of the matching circuits, so the similarity evaluation may produce errors. To mitigate this problem, a calibration scheme has been developed, as explained below.

In this scheme, the similarity is determined by the difference between the peak current and the output current at the moment of data matching. In the previous approaches [1, 4], the output current itself was used as the matching result. Fig. 5 illustrates two matching cells having different characteristics caused by device mismatches. Error 1 and error 2 refer to the former method and the latter one, respectively. The figure shows that the differential current method is better. To implement this method, the peak currents are stored in current memories in phase 1 of the operation. In phase 2, the current differences are obtained. Only phase 2 is repeated for each new input vector. This scheme is shown in Fig. 6.
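The two-phase idea can be illustrated with a toy model. This is a sketch under strong assumptions: the mismatch is modeled as a difference in peak height only, the cell response is an assumed Gaussian, and all values are in arbitrary units. Names such as `readout` and `stored_peak` are hypothetical.

```python
import math


def cell_current(v_in, v_template, i_peak, width=0.1):
    """Bell-shaped response; i_peak differs between cells because of mismatch."""
    return i_peak * math.exp(-((v_in - v_template) ** 2) / (2.0 * width ** 2))


v_template = 0.5
peaks = {"cell A": 1.00, "cell B": 0.85}  # hypothetical mismatched peak currents

# Phase 1: each cell's own peak current is measured once and stored.
stored_peak = {name: cell_current(v_template, v_template, p) for name, p in peaks.items()}


def readout(v_in):
    """Phase 2: return (raw current, stored peak minus raw current) per cell."""
    result = {}
    for name, p in peaks.items():
        i_out = cell_current(v_in, v_template, p)
        result[name] = (i_out, stored_peak[name] - i_out)
    return result


for v_in in (0.50, 0.55):
    for name, (raw, diff) in readout(v_in).items():
        print(f"v_in={v_in:.2f} {name}: raw={raw:.3f} diff={diff:.3f}")
```

In this toy model the differential readings of the two cells coincide at a perfect match and stay closer together than the raw currents near the peak; with this readout, a smaller difference indicates a closer match. The model says nothing about mismatch in peak position or width.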

## 3. Simulation results

The proof-of-concept chip was designed in a 2-poly, 3-metal, 0.35 µm CMOS technology and sent for fabrication. The SPICE simulation of the proposed matching cell module with 32 template vectors is shown in Fig. 7. Template vector number 9 is the winner, and the matching time is 1 µs per vector. In phase 1 of the operation, all template vectors are memorized into the matching cell array in turn. In phase 2, the similarities between the input vector and the template vectors are determined, and the address of the winner vector is output. The simulation results verify correct chip operation.

In the layout of the proof-of-concept chip, most of the matching cell area is occupied by capacitors ($C_1$ and $C_2$ in Fig. 2). If we assume that high-k MIM capacitor technology is available in a 0.18 µm process, a capacitance density of 50 fF/µm<sup>2</sup> can be easily obtained by utilizing the higher metal layers. The test layout of a 256-vector matching cell module can then be achieved in an area as small as 0.5 mm<sup>2</sup>. Fig. 8 illustrates the module integrated with SRAM memories, where the matching cell module occupies only 15% of the SRAM area.
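As a rough sanity check on these figures, the arithmetic below derives an average area budget per matching cell and the SRAM area implied by the 15% figure. It assumes the whole 0.5 mm<sup>2</sup> is divided among the 256 × 64 cells, ignoring the WTA, SDACs, and current memories, so the per-cell value is only an upper bound. The variable names are illustrative.

```python
VECTORS = 256            # template vectors in the test layout
CELLS_PER_VECTOR = 64    # one matching cell per vector element
MODULE_AREA_MM2 = 0.5    # matching cell module area from the text
AREA_FRACTION = 0.15     # module area relative to the SRAM area

cells = VECTORS * CELLS_PER_VECTOR
module_area_um2 = MODULE_AREA_MM2 * 1e6          # 1 mm^2 = 1e6 um^2
area_per_cell_um2 = module_area_um2 / cells      # upper bound per cell
implied_sram_area_mm2 = MODULE_AREA_MM2 / AREA_FRACTION

print(f"{cells} cells, at most {area_per_cell_um2:.1f} um^2 each")
print(f"implied SRAM area: {implied_sram_area_mm2:.2f} mm^2")
```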



Fig. 2 (left): Matching cell circuit. Fig. 3 (right): Characteristics of the matching cell.



## 4. Conclusions

The device mismatch problem in analog associative processors, which was not addressed in previous work, has been resolved by a new calibration scheme. Moreover, we implemented the matching cell with an NMOS-only circuit, which is much smaller than the CMOS implementation [1] and much easier to control than the analog-flash-memory implementation proposed in [4]. In addition, the architecture is very compact and compatible with integration into existing digital memories.

Experimental results are not yet available because the prototype chip is still under fabrication, but we hope to have them ready by the time of the conference.

## Acknowledgements

The VLSI chip has been fabricated in the fabrication program of the VLSI Design and Education Center, The University of Tokyo, in collaboration with Rohm Corporation and Toppan Printing Corporation.

## References

- [1] T. Yamasaki and T. Shibata, *Trans. Neural Networks* (2003).
- [2] M. Yagi and T. Shibata, *Trans. Neural Networks* (2003).
- [3] M. Saitoh, H. Harata, and T. Hiramoto, *IEDM* (2004).
- [4] M. Ogawa and T. Shibata, *ESSCIRC* (2001).
- [5] K. Ito, M. Ogawa, and T. Shibata, *Jpn. J. Appl. Phys.* (2002).







Fig. 7: The SPICE simulation result shows that the maximum-likelihood template vector is vector number 9.

Fig. 8: Test layout of a 256-vector matching cell module integrated with digital memories, assuming high-k MIM capacitors. Layout labels: matching array (256 vectors); SRAM (128 vectors), two blocks.