# Nearest-Euclidean-Distance Search Associative Memory with Fully Parallel Mixed Digital-Analog Match Circuitry

M. A. Abedin, Y. Tanaka, A. Ahmadi, T. Koide and H. J. Mattausch

Research Center for Nanodevices and Systems, Hiroshima University, 1-4-2 Kagamiyama, Higashi-Hiroshima 739-8527, Japan Phone: +81-82-424-6265 Fax: +81-82-424-3499 E-mail: abedin@sxsys.hiroshima-u.ac.jp

## 1. Introduction

Calculating the similarity between collections of objects is needed for different applications such as image or voice pattern recognition, codebook-based data compression, routing-table-lookup for network routers, and authentication parts in security system. Associative memory based systems with parallel search capability are advantageous in terms of easier and faster processing with a compact hardware. The associative memory based system performs recognition by calculating the distances between the input pattern and the stored reference patterns. The reference pattern with the minimum distance to the input sample is referred to as the winner. Architectures for fully-parallel winner search according to Hamming distance [1] and Manhattan distance [2] have been proposed based on mixed digital-analog circuits. However, in many real applications the Euclidean distance, which is the exact distance between two vectors, and defined by,

$$D_{e} = \sqrt{\sum_{i=1}^{w} (S_{i} - R_{i})^{2}}$$
(1)

is known to give better results than Hamming and Manhattan distances. In Eq. (1),  $S = [S_1, S_2, \ldots, S_w]$  and  $R = [R_1, R_2, \ldots, R_w]$  are the input and reference vectors, respectively, and  $D_e$  is the Euclidean distance. Most of the circuits proposed for hardware implementation of Euclidean distance calculation [3-5] suffer from a limited operation capability, large dimensions and in particular from the inability to realize a parallel search among multiple reference data.

Here, we present a solution for fully-parallel winner search capability using Euclidean distance, which consists of mixed digital-analog distance calculation circuitry and an analog winner search circuit. It can achieve compact singlechip implementation in conventional CMOS technology as well as short nearest match times up to large distances.

#### 2. Architecture Circuits for Euclidean Distance Search

The block diagram of the proposed associative memory with fully-parallel Euclidean-distance-search capability is shown in Fig. 1. The main functional units are a memory field, which includes search-data storage cells (SC), unit comparators (UC), word comparators (WC), row/column decoder and read/write circuit, and a nearest-match-search circuit consisting of a winner-line-up amplifier (WLA), and a winner-take-all (WTA) circuit.

In general, Euclidean distance calculation requires the square as well as the square root operations. In particular, the square root calculation is a hardware expensive operation. However, pattern matching requires only comparison of the relative magnitude of the distances, and this relative magnitude is unaffected by the square root. Therefore, the circuitry for calculating the square roots can be avoided in the minimum Euclidean-distance search problem.

The memory-field structure of the proposed Euclidean distance search hardware is shown in Fig. 2 in more detail. Here, digital k-bit subtractor and absolute value calculation units compare the W binaries, each with k-bit, in all rows of the memory field, storing the reference data, in parallel with the input data. The digital output of the absolute unitdistance values are then converted into analog currents using the current converters (CC). To realize the CC function the gates of the CC-transistors are connected to the corresponding k-bit output-signal lines of the unit comparator and their drains are connected together to add the analog currents of all CC-transistors. The width of each *CC*-transistor,  $2^{k-1} \times W_0$ , varies depending on its bit position in the binary so as to correctly distinguish the weight of each bit. The analog currents from each CC are then squared using analog current squarer circuits [6]. Fig. 3 shows the simple but efficient circuit which exploits the square-law characteristics of the MOS transistor drain current as a function of the gate voltage, when operated in the saturation region. The advantage of this squarer circuit is that the output current is largely independent of MOS transistor parameters when a long channel length is chosen. Finally, the output currents from all squarer circuits are added to get a Euclidean-distance-equivalent current which constitutes the match line current and is fed to the WLA for further processing. In the nearest match block, the functionality of the WLA circuit is to pre-amplify the match lines and restrict the large variety of possible analog outputs to a small range by self-regulation. The WLA thus amplifies the differences of current signals between winner and losers and regulates the winner signal to a suitable level for further distance amplification by the WTA. The initial job of the WTA circuit is to amplify winner-loser distances by voltage-currentvoltage transformations. In order to reduce the negative effects from fabrication induced miss-match of corresponding transistors in different rows and to improve the reliability for large winner-input distances, 5 stages of a common-source WTA-configuration are used. The final decision circuit in the WTA consists of inverters with an adjusted switching threshold. It generates a "1" for the winner row and a "0" for each loser row.

#### 3. Layout Design and Simulated Performance

The layout in Fig. 4 shows the designed associative memory for nearest Euclidean distance search, which measures 5.24 mm<sup>2</sup> in 0.35 $\mu$ m, 2-poly, 3-metal CMOS technology. One block for handling a 5-bit binary in the

memory field is shown in Fig. 5 and has an area of  $133 \times 28.5$   $\mu$ m<sup>2</sup>. The nearest match unit consumes only 11.1% of the area of the complete layout design. The selector after the nearest match unit has been included for diagnostic purposes. It selects either the final *WTA* results or intermediate signals of the winner search circuit as external outputs.

Fig. 6 shows the simulated nearest-match times as a function of the distance between the winner and input-data word. Here, the data for winner-nearest loser distances of 1 and 4 are plotted. From the simulation results, we obtained very fast winner search times between 120 and 160 nsec at a relatively small average power dissipation of less than 195 mW.

### 4. Conclusion

A fully-parallel associative memory architecture realizing minimum Euclidean distance search has been proposed and its layout has been designed in  $0.35\mu$ m CMOS technology. The 5.24 mm<sup>2</sup> test circuit with 64 reference patterns each with 16 binaries of 5-bit, has high performance, equivalent to a 32-bit processor with 172 GOPS. In this architecture we have used a mixed digital-analog Euclidean-distance calculation circuit with an analog squarer as key element and a fast analog winner search circuit for finding the nearest match. The winner search time and the average



Fig. 1. Block diagram of the mixed digital-analog fully-parallel associative memory for Euclidean distance search.



Fig. 4. Layout of the complete associative-memory for minimum Euclidean distance search, designed in 0.35µm CMOS technology.

power dissipation are less than 160 nsec and 195 mW, respectively. These data are sufficient for applications in the fields of pattern recognition and code-book-based data compression.

## Acknowledgements

This work is supported by Monbukagakusho Scholarship and the 21<sup>st</sup> century COE program "Nanoelectronics for Tera-bit Information Processing", the Ministry of Education, Culture, Sports, Science and Technology, Japanese Government.

# References

- H. J. Mattausch, T. Gyohten, Y. Soda, and T. Koide, *IEEE Journal of Solid-State Circuits*, Vol. 37, pp. 218-227, 2002.
- [2] Y. Yano, T. Koide and H. J. Mattausch, *Extended abstract of the Int. Conf. on Solid State Devices and Materials (SSDM'2002)*, pp. 254-255, 2002.
- [3] P. Heim, F. Krummenacher and E. A. Vittoz, *Electron. Lett.*, vol. 28, pp. 333-334, 1992.
- [4] Y. Tulay and J. S. Marsland, *IEEE Int. Conf. on Neural Networks*, pp. 974-979, 1996.
- [5] M. Freeman, M. Weeks and J. Austin, *IADIS International Conference on Applied Computing*, vol. 2, pp. 329-332, 2005.
- [6] B. D. Liu, C.Y.Huang and H.Y.Wu, *Electron. Lett.*, vol. 30, no. 16, pp. 1287-1288, 1994.



Fig. 2. Concepts for the key-circuit blocks in the proposed associative memory with fully-parallel minimum Euclidean distance search.







Fig. 6. Simulated winner search time with the layout-extracted netlist.