# A Fully-Parallel Self-Learning Analog Support Vector Machine Employing Compact Gaussian-Generation Circuits

Renyuan Zhang and Tadashi Shibata

Department of Electrical Engineering and Information Systems, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan

Phone: +81-3-5841-6656 E-mail: tyoninen@if.t.u-tokyo.ac.jp, shibata@ee.t.u-tokyo.ac.jp

# 1. Introduction

Humans are far superior to traditional VLSI processors in cognitive functions because the brain can learn from samples by itself. To mimic this self-learning process, several VLSI implementations for pattern recognition have been explored based on the Support Vector Machine (SVM) algorithm [1], which was originally developed for software programs. The SVM using a Gaussian function (GF) kernel is one of the most powerful classifiers [2], but computing the GF is very expensive in silicon. Analog circuits have therefore been developed to generate the GF at high speed in a compact area. An analog fully-parallel SVM with very high training speed was proposed in [3]. However, this approach requires a large chip area for the array of GF circuits that carries out the kernel computation, and results were shown only by circuit simulation for four sample vectors of two dimensions. A row-parallel structure [4] reduced the chip size greatly but made training slower, since it requires multiple iterations of sample-serial learning (the chip was built for two-dimensional vectors only). Furthermore, in both approaches it is very difficult to increase the vector dimension, because the error grows multiplicatively when the system is extended to high-dimensional vectors.

A Gaussian function generation circuit extendable to higher dimensions has been developed in this work to build a new type of compact fully-parallel SVM system for the recognition of 64-D image vectors. The learning process is accomplished autonomously in a self-feedback configuration, so no iterative operation is required. In addition, the chip is much smaller than that of the approach in [3]. The chip was designed in a 0.18 $\mu$m CMOS process and is now under fabrication.

# 2. System structure

2.1 SVM algorithm applied in this work

The SVM is a binary classifier, and classification amounts to evaluating the following function for an input vector $\mathbf{X}$:

$$f(\mathbf{X}) = \operatorname{sign}\left[\sum_i \alpha_i \cdot y_i \cdot e^{-(\mathbf{X} - \mathbf{X}_i)^2} + b\right],$$

where $\mathbf{X}_i$ is the *i*-th training sample with class label $y_i \in \{-1,1\}$, and *b* is a constant bias (set to 0 here, since it has a negligible effect on the performance of an SVM with a Gaussian kernel [5]). The self-training process determines the $\alpha$-values by backward propagation. In this work, the gradient-descent algorithm was chosen, using the following updating rule [4]:

$$\alpha_i \leftarrow \min(C, \max(0, 1 - y_i \sum_{j \neq i} \alpha_j y_j e^{-(\mathbf{X}_i - \mathbf{X}_j)^2})),$$

where C is the regularization parameter.
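The decision function and the updating rule above can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the chip's behavior: the function names, the value of `c`, the number of sweeps, the toy 2-D samples, and the choice to apply the rule one sample at a time in repeated sweeps are all hypothetical. The bias *b* is taken as 0, as in the text.

```python
import math


def gaussian_kernel(x, z):
    """Gaussian kernel exp(-||x - z||^2), as written in the decision function."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)))


def train_alphas(samples, labels, c=10.0, sweeps=200):
    """Apply the clipped updating rule repeatedly (assumed: in-place sweeps)."""
    n = len(samples)
    # Precompute all pairwise kernel values once.
    k = [[gaussian_kernel(samples[i], samples[j]) for j in range(n)]
         for i in range(n)]
    alphas = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # Sum over all other samples j != i.
            s = sum(alphas[j] * labels[j] * k[i][j]
                    for j in range(n) if j != i)
            alphas[i] = min(c, max(0.0, 1.0 - labels[i] * s))
    return alphas


def classify(x, samples, labels, alphas):
    """Sign of the kernel expansion with bias b = 0."""
    s = sum(a * y * gaussian_kernel(x, xi)
            for a, y, xi in zip(alphas, labels, samples))
    return 1 if s >= 0 else -1


# Hypothetical toy data: two well-separated clusters in 2-D.
samples = [(0.0, 0.0), (0.2, 0.0), (3.0, 3.0), (3.2, 3.0)]
labels = [1, 1, -1, -1]
alphas = train_alphas(samples, labels)
print(classify((0.1, 0.1), samples, labels, alphas))
print(classify((3.1, 3.1), samples, labels, alphas))
```

The clipping to the interval from 0 to `c` mirrors the `min`/`max` in the rule; the kernel matrix is computed once because it does not change during training.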

2.2 System organization

Let N denote the total number of training samples. The system organization is illustrated in Fig. 1(a), along with the proof-of-concept chip layout in Fig. 1(b). N sets of Euclidean distance calculation circuits are constructed in block I to compute the distances between the vector $\mathbf{X}_i$ and all other samples in parallel. Each cell in block II contains a capacitor (as an analog memory) and an exponential generation circuit. The Euclidean distance values are stored in array II row by row as voltages. As a result, a fully-parallel array of Gaussian kernels is implemented in a small area even for high dimensions (64-dimensional sample vectors in the present work). During the training process, the $\alpha$-values in block III are fed back to block II and learning proceeds in a fully-parallel manner. The training process is therefore accomplished autonomously, which is far faster than the clock-based sample-serial iteration approach in [4]. The $\alpha$ adjusters are built from current-mirror-based adders/subtracters.
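One way to picture the fully-parallel feedback loop in software is a relaxation in which every $\alpha$-value moves toward its clipped target at the same time, using a kernel matrix that was stored beforehand. The sketch below is a behavioral assumption for illustration only: the first-order settling model, the names `kernel_matrix` and `relax_alphas`, the `step` size, and the two-sample example are hypothetical and are not taken from the circuit.

```python
import math


def kernel_matrix(samples):
    """Pairwise Gaussian kernel values, built row by row."""
    n = len(samples)
    return [[math.exp(-sum((a - b) ** 2
                           for a, b in zip(samples[i], samples[j])))
             for j in range(n)] for i in range(n)]


def relax_alphas(k, labels, c=10.0, step=0.1, steps=500):
    """Move all alphas simultaneously toward their clipped targets.

    Assumption: each adjuster settles like a first-order lag, which is
    integrated here with a simple forward-Euler step.
    """
    n = len(labels)
    alphas = [0.0] * n
    for _ in range(steps):
        targets = []
        for i in range(n):
            s = sum(alphas[j] * labels[j] * k[i][j]
                    for j in range(n) if j != i)
            targets.append(min(c, max(0.0, 1.0 - labels[i] * s)))
        # All values are updated together from the same previous state.
        alphas = [a + step * (t - a) for a, t in zip(alphas, targets)]
    return alphas


# Hypothetical example: two samples of opposite class at unit distance.
k = kernel_matrix([(0.0, 0.0), (1.0, 0.0)])
alphas = relax_alphas(k, [1, -1])
print(alphas)
```

For this two-sample case the stationary point of the rule satisfies $\alpha = 1 + e^{-1}\alpha$, so both values settle near $1/(1 - e^{-1})$.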



Fig. 1(a) System organization of our proposed SVM trainer/classifier and (b) its chip layout for 16 training samples with 64 dimensions.



Fig. 2 64-D Gaussian function kernel generation circuit: (a) Euclidean distance calculation circuit and (b) exponential function generation circuit.

The chip was designed in a standard 0.18 $\mu$m CMOS process with a chip size of $2.5 \times 2.5$ mm.

Figure 2 shows the 64-D Gaussian function kernel circuit developed in this work, composed of a Euclidean distance circuit (a) and an exponential generation circuit (b). The former consists of 64 difference-squaring circuits and an I-V converter that collects the current from all 64 squaring circuits. The output voltage $V_{diff}$ is given by

$$V_{diff} = V_0 - \sum \frac{|V_1 - V_2|^2}{\sigma},$$

where

$$V_0 = \frac{V_{dd} + V_{bias} - 2|V_{thP}|}{2} \quad \text{and} \quad \sigma = \frac{(V_{dd} - V_{bias})(\sqrt{K_N} + \sqrt{K_P})^2}{K_N}.$$

$K_N$ and $K_P$ represent the K-factors of the n-type and p-type MOS transistors, respectively. The result is fed to the exponential generation circuit in Fig. 2(b). The width of the Gaussian function can be changed by adjusting $V_{bias}$, which keeps the system flexible when the number of dimensions changes. Finally, the output is given in current mode as $I_{out}$, shown in Fig. 2(b):

$$I_{out} \approx \frac{I_t}{2} e^{-\sum \frac{|V_1 - V_2|^2}{\sigma}} \quad \text{if we set } V_{ref} = V_0.$$

$I_t$ reflects the $\alpha$-values, which are controlled by the $\alpha$ adjusters in block III.
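The expressions for $V_0$, $\sigma$ and $I_{out}$ can be evaluated numerically to see how $V_{bias}$ changes the width of the Gaussian. The sketch below simply codes the formulas as they are written in the text; every numeric value (supply voltage, threshold voltage, K-factors, tail current, input voltages) is a hypothetical placeholder and not a parameter of the actual design.

```python
import math


def v0(vdd, vbias, vthp):
    """V_0 = (V_dd + V_bias - 2|V_thP|) / 2."""
    return (vdd + vbias - 2.0 * abs(vthp)) / 2.0


def sigma(vdd, vbias, kn, kp):
    """sigma = (V_dd - V_bias)(sqrt(K_N) + sqrt(K_P))^2 / K_N."""
    return (vdd - vbias) * (math.sqrt(kn) + math.sqrt(kp)) ** 2 / kn


def i_out(i_t, v1, v2, vdd, vbias, kn, kp):
    """I_out ~ (I_t / 2) exp(-sum |V1 - V2|^2 / sigma), assuming V_ref = V_0."""
    s = sigma(vdd, vbias, kn, kp)
    d = sum((a - b) ** 2 for a, b in zip(v1, v2)) / s
    return 0.5 * i_t * math.exp(-d)


# Hypothetical parameter values, chosen only for illustration.
VDD, VTHP, KN, KP, I_T = 1.8, -0.5, 1e-4, 1e-4, 1e-6

stored = [0.9] * 64                 # stored 64-D sample (volts)
probe = [0.9] * 63 + [1.1]          # input differing in one element

peak = i_out(I_T, stored, stored, VDD, 0.8, KN, KP)
wide = i_out(I_T, stored, probe, VDD, 0.8, KN, KP)
narrow = i_out(I_T, stored, probe, VDD, 1.3, KN, KP)
print(v0(VDD, 0.8, VTHP), sigma(VDD, 0.8, KN, KP))
print(peak, wide, narrow)
```

In this model a larger $V_{bias}$ gives a smaller $\sigma$ and therefore a narrower Gaussian, while the peak for matching inputs stays at $I_t/2$.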

# 3. Simulation results

The robustness against fabrication process variations and the programmability of the proposed Gaussian circuit were verified by HSPICE simulation, as shown in Fig. 3. We randomly varied the device parameters (mainly threshold voltages) of all devices within $\pm 5\%$, then selected five of the 64 dimensions at random and swept them independently. According to the simulation results in Fig. 3, the peak height of the Gaussian characteristic can be scaled linearly by changing $I_t$, with acceptable fluctuations.



Fig. 3 Output current against input voltage of our proposed 64-D GF generation circuit with process variations of  $\pm 5\%$ .



Fig. 4 Nanosim simulation results employing 16 training sample images and 8 test images.

The COIL-20 (Columbia Object Image Library) database was used to verify the training and classification performance. The images from the database were pre-processed and converted into 64-D feature vectors by the Projected Principal-Edge Distribution (PPED) method [6]. We selected eight images from the "obj16" category of COIL-20 as training samples of class "a" and another eight images from the "obj10" category as samples of class "b". After the training process, eight test samples were used to verify classification correctness. The signal in the lower window of Fig. 4 represents the classification results in binary form. According to the Nanosim simulation results, the proposed system classified all eight test images correctly.

# 4. Conclusions

An analog high-dimensional Gaussian function generation circuit that is robust against process variations has been developed in this work. Based on this GF circuit, we built a fully-parallel self-training SVM system with high training speed and compact chip size. According to the simulation results, the proposed SVM system correctly classified all test images taken from an actual database. A $2.5 \times 2.5$ mm VLSI chip for this system is under fabrication, and the measurement results will be presented at the conference.

# References

- [1] R. Genov, et al., IEEE Trans. Neural Networks (2003) 1426.
- [2] V. Vapnik, The Nature of Statistical Learning Theory (1995).
- [3] S.-Y. Peng, B. A. Minch, and P. E. Hasler, ISCAS (2008) 860.
- [4] K. Kang and T. Shibata, IEEE Trans. Circuits and Systems (2010) 1513.
- [5] D. Anguita, et al., IEEE Trans. Neural Networks (2003) 993.
- [6] M. Yagi, et al., IEEE Trans. Neural Networks (2003) 1144.