Enhanced MLP Input-Output Mapping for Degraded Pattern Recognition

Shigueo Nomura and José Ricardo Gonçalves Manzan
Faculty of Electrical Engineering, Federal University of Uberlândia, Uberlândia, MG, Brazil

Abstract

This work proposes a set of approaches for improving multilayer perceptron (MLP) performance on the degraded pattern input-output mapping process. First, unlike the classical one-per-class approach, our strategy calculates Euclidean distances between the MLP output and target vectors. Second, our approach adopts orthogonal bipolar vectors (OBVs) as target values, taking advantage of the larger Euclidean distances these vectors provide compared with conventional ones. The proposed approaches were applied to MLP training and testing with very degraded patterns as input data. Experimental results with the classical approaches in parallel to the proposed ones are presented for MLP performance comparison purposes. The improved MLP with our proposed approaches provided an increase of 9.3% in the degraded pattern recognition rate.

Keywords: Artificial Neural Networks, Degraded Pattern, Euclidean Distance, Multilayer Perceptron, Pattern Recognition, Target Vector

1 Introduction

The multilayer perceptron (MLP) has been successfully applied to pattern recognition tasks [1][2][3]. As a classification system, an MLP requires a good approach for analyzing degraded image data, extracting features from these data, generating a set of relevant information, and improving its performance. Many efforts have attempted to develop a good method followed by feature extraction systems. Surprisingly, it is quite difficult to find investigations focusing on the output space of the MLP input-output mapping process shown in Fig. 1. Classical approaches to MLP multiclass categorization, such as one-per-class, are based on conventional binary or bipolar target vectors whose dimension equals the number of classes. This work proposes to break away from such classical
treatment of the MLP input-output mapping process. The main objective of this work is to show that the proposed approaches can considerably improve degraded pattern recognition performance by MLPs. In summary, the approaches adopt Euclidean distance-based MLP multiclass categorization and the use of orthogonal bipolar vectors (OBVs) as target values for learning. It is known that MLP performance using bipolar target vectors is already better than performance with binary ones [4], but to our knowledge no other work exists adopting the Euclidean distance between OBVs and output vectors.

2 Usual Approaches to Multiclass Categorization

Connectionist algorithms are more difficult to apply to multiclass categorization problems [5]. Multiclass categorization problems [5] correspond to the task of finding an approximate definition for an unknown function f(x) given training examples of the form (x_i, f(x_i)). The unknown function f often takes values from a discrete set of classes c_1, c_2, ..., c_k. We can distinguish two approaches to handling these multiclass categorization tasks.

In the one-per-class approach, the individual functions f_1, f_2, ..., f_k are learned, one for each class. To assign a new case p to one of these classes, each individual function f_i is evaluated on p, and p is assigned the class j corresponding to the function f_j that returns the highest activation [6]. This categorization approach is standard for conventional target vectors.

Distributed output code is an alternative approach pioneered by Sejnowski and Rosenberg [7] in their widely known NETtalk system. In this approach, each class is assigned a unique binary string of length n; these strings serve as target vectors in the MLP. Then n binary functions are learned, one for each bit position in these binary strings. During training on an example from class i, the desired outputs of these n binary functions are specified by the target vector for class i. With Artificial
Neural Networks (ANN), these n functions can be implemented by the n output units of a single network. A new case p is classified by evaluating each of the n binary functions to generate an n-bit string s. This string is then compared to each of the k target vectors, and p is assigned to the class whose target vector is closest, according to some distance measure, to the generated string s.
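The decoding step just described can be sketched as follows; the three classes, the 4-bit target strings, and the choice of Hamming distance as the distance measure are illustrative assumptions:

```python
import numpy as np

def decode_output(s, target_vectors):
    """Assign a case to the class whose target vector is closest to
    the generated n-bit string s, here by Hamming distance."""
    distances = [int(np.sum(s != t)) for t in target_vectors]
    return int(np.argmin(distances))

# Three classes with unique 4-bit target strings (illustrative values)
targets = np.array([[0, 0, 1, 1],
                    [0, 1, 0, 1],
                    [1, 1, 0, 0]])
s = np.array([0, 0, 1, 0])        # string produced by the n output units
print(decode_output(s, targets))  # 0: class 0's string differs in one bit
```

With longer strings than classes, this decoding tolerates a few wrongly activated output units, which is the robustness argument behind distributed output codes.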
Fig. 1: Input-output mapping process of an MLP (M-dimensional input space to N-dimensional output space).

3 Motivation

An MLP model is trained to learn the nonlinear relationship between an M-dimensional input space and an N-dimensional output space, as shown in Fig. 1. In this way, an incoming unknown M-dimensional input vector is transformed into an N-dimensional one, which can be placed at any point of that space. The motivation of this work is to improve MLP multiclass categorization performance as a pattern recognition process by transforming an input vector representing a degraded pattern into an output vector representing a class. Increasing the generalization ability of MLP models leads to improved recognition performance. One goal is to find a topology design strategy that provides a network with a higher ability to generalize. Many traditional investigations have concentrated on improving the representation of input vectors. On the contrary, this work concentrates on an alternative representation of the desired outputs (OBVs as target vectors) to adjust the weights that minimize error during the supervised training process. A further purpose is to influence the multidimensional error-performance surface [8] that is constructed during supervised training. Since the Euclidean distance-based categorization adopted in this work depends on the distance between vectors to classify unknown input data, the MLP performance is expected to improve with the proposed approaches. The orthogonality of the new target vectors can provide a larger output space than conventional vectors do. Figure 2 presents high interference zones between pattern categorization subspaces caused by conventional target vectors. On the other hand, we can verify low interference zones between pattern categorization subspaces in Fig. 3, due to the use of the new target vectors. Our hypothesis is that these approaches can lead to enhanced MLP input-output mapping even when the input is degraded and slightly
different from the training examples.

4 Work Proposal

This work proposes a set of approaches to the MLP input-output mapping process based on the Euclidean distance measure, adopting OBVs as new target vectors. According to Dekel and Singer [9], the objective in multiclass categorization problems is to learn a classifier that accurately assigns labels (target vectors) to instances (input vectors), where the set of labels is of finite cardinality and contains more than two elements. In this MLP application for multiclass categorization, after every presentation of an instance V (an input vector in the M-dimensional space of Fig. 1), the Euclidean distance of the output vector in the N-dimensional space was calculated to the labels (target vectors in output space) of all classes. The class whose label had the smallest Euclidean distance to the output vector was chosen as the winner and assigned to instance V. In addition to the Euclidean distance-based approach, we propose the use of OBVs (defined in the next section) as target vectors.

5 Definition of Vectors

The following vectors represent the target values used in our experiments for the MLP input-output mapping analysis.

5.1 Conventional Bipolar Vectors (CBVs)

The conventional bipolar vector (CBV) with n components representing the p-th of n patterns is defined by Eq. (1):

V_p = (-1, ..., -1, 1, -1, ..., -1)^T,   (1)

where V_p is the CBV representing the p-th pattern, the single +1 occupies the p-th position (preceded by p-1 and followed by n-p components equal to -1), p = 1, 2, ..., n, and n is the number of patterns, equal to the number of components. In the recognition problem for 10 digits, the digit 0 is treated as a 10th digit, defined by Eq. (2):

V_0 = (-1, ..., -1, 1)^T,   (2)

with nine components equal to -1 preceding the single +1.
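A minimal sketch of Eqs. (1)-(2) combined with the Euclidean distance-based winner selection of Section 4; the perturbation applied to the output vector is an illustrative assumption:

```python
import numpy as np

def cbv(p, n=10):
    """Eq. (1): all components -1 except +1 in the p-th position;
    digit 0 is treated as the 10th pattern (Eq. (2))."""
    v = -np.ones(n)
    v[p - 1] = 1.0
    return v

targets = np.stack([cbv(p) for p in range(1, 11)])  # digits 1..9, then 0

def winner(output):
    """Section 4: choose the class whose target vector has the
    smallest Euclidean distance to the MLP output vector."""
    d = np.linalg.norm(targets - output, axis=1)
    return int(np.argmin(d))

noisy = cbv(4) + 0.3   # a slightly perturbed output vector
print(winner(noisy))   # 3, i.e. the index of digit 4
```

Because the winner is the nearest target rather than the single highest output unit, a degraded input that shifts the whole output vector still maps to the correct class as long as no other target becomes closer.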
Fig. 2: High interference zones between categorization subspaces when using conventional target vectors.

Fig. 3: Low interference zones between categorization subspaces when using the new target vectors.

5.2 Orthogonal Bipolar Vectors (OBVs)

The norm of an orthogonal bipolar vector (OBV) in the Euclidean space R^n is given by Eq. (3):

||U|| = √(x_1^2 + x_2^2 + ... + x_n^2) = √n,   (3)

where U = (x_1, x_2, ..., x_n)^T, each x_i is a component equal to +1 or -1 for i = 1, 2, ..., n, and n is the number of components. The usual inner product [10] between two vectors U and V in the Euclidean space R^n is defined by Eq. (4):

U·V = x_1 y_1 + x_2 y_2 + ... + x_n y_n,   (4)

where V = (y_1, y_2, ..., y_n)^T and each y_i is a component equal to +1 or -1 for i = 1, 2, ..., n. Vectors U and V are orthogonal (denoted U ⊥ V) if and only if U·V = 0, as in Eq. (5):

U ⊥ V  ⟺  U·V = 0.   (5)

A simple algorithm [4][11] can be implemented to generate OBVs with various numbers of components satisfying the above conditions. Table 1 presents examples of OBVs with 16 components, representing target values for 10 digits, generated by the mentioned algorithm.

6 Experimental Procedure

This section presents the modeling procedure of an MLP topology for the input-output mapping process of degraded pattern data, as shown in Fig. 4. The model experimentally evaluates the proposed approaches applied to the MLP in degraded pattern recognition tasks. We extracted input data from license plate photos automatically taken by traffic control systems of Uberlândia City in Brazil. These images suffered from problems of luminosity, contrast, focus, resolution, and size, all of which required preprocessing capable of extracting relevant
features for the pattern recognition process. The original preprocessing methods proposed in previous works, such as adaptive contrast enhancement [12], adaptive thresholding [13], and automatic segmentation and extraction of feature vectors [14], were used.
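The orthogonality conditions of Eqs. (3)-(5) can be checked mechanically. As one possible generator of such vectors (an assumption: the actual algorithm of [4][11] is not reproduced here), Sylvester's doubling construction yields mutually orthogonal bipolar vectors whenever n is a power of two:

```python
import numpy as np

def obv_set(n):
    """Generate n mutually orthogonal bipolar (+1/-1) vectors of
    length n, for n a power of two, by Sylvester's doubling
    construction (one possible generator; the paper's own
    algorithm [4][11] may differ)."""
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

V = obv_set(16)
assert np.all(np.abs(V) == 1)                     # bipolar components
assert np.array_equal(V @ V.T, 16 * np.eye(16))   # Eq. (5) off-diagonal
```

The diagonal of V V^T equals n = 16, matching Eq. (3): each vector has norm √16 = 4, while every distinct pair has inner product 0.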
Table 1: Examples of OBVs with 16 components

Digit  OBV
1   (1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1)
2   (1, -1, -1, 1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1)
3   (1, -1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1)
4   (1, -1, -1, 1, 1, -1, -1, 1, 1, -1, -1, 1, 1, -1, -1, 1)
5   (1, -1, 1, -1, -1, 1, -1, 1, -1, 1, -1, 1, 1, -1, 1, -1)
6   (1, -1, 1, -1, -1, 1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1)
7   (1, -1, 1, -1, 1, -1, 1, -1, -1, 1, -1, 1, -1, 1, -1, 1)
8   (1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1)
9   (1, 1, -1, -1, -1, -1, 1, 1, -1, -1, 1, 1, 1, 1, -1, -1)
0   (1, 1, -1, -1, -1, -1, 1, 1, 1, 1, -1, -1, -1, -1, 1, 1)

6.1 Data Representation

A two-dimensional set of pixels representing a pattern is mapped onto an input vector, one pixel per input neuron. For a 20 × 15 image of each segmented pattern, the top row of 20 pixels is associated with the first 20 neurons, the next row of 20 pixels with the next 20 neurons, and so on. Thus, segmented entities (20 × 15) are represented by feature vectors with 300 components, where each component represents either one pixel of the pattern (bipolar value +1) or one pixel of the image background (bipolar value -1). Figure 4 shows the 120-image training set representing digits as input data. In this work, the adopted data representation is bipolar, since categorization may be improved when the input is represented in bipolar form and the bipolar sigmoid is used as the activation function [4]. The reason is that if one factor in the weight-update expression is the activation of the lower unit, then units whose activations are null will not learn [4].

6.2 MLP Topology for Experiments

The multilayer neural network adopted in our experiments has one layer of hidden neurons. There are several propositions for determining the necessary number of hidden neurons for a given problem, but they yield contradictory results and have little practical utility [15]. Some heuristic rules
[15] used to obtain the MLP topology for the experiments are as follows: reduce the number of hidden neurons when the network does not generalize, that is, when the error on the output data during training is small but the error in the test stage is large; increase the number of hidden neurons when the error during training is large or all the weights of the connections between neurons are of the same order of magnitude. In degraded images, the input data contain redundant information which cannot be removed by a suitable coding. These spurious data can be removed when the hidden layer contains fewer neurons than the input layer [15]. In summary, without further information there is no foolproof method for setting the exact number of hidden neurons before the training stage [4]. Usual ANN strategies [8], such as keeping one parameter fixed while varying the remaining ones, defined the appropriate MLP topology. Conventional experiments determined an adequate topology for classifying input digits (20 × 15) represented by the 300-dimensional feature vectors. Our experimental MLP model consists of 300 neurons in the input layer. The adequate number of neurons in the hidden layer is set according to each experiment. The number of neurons in the output layer is defined by the target vector type and its size selected for each experiment.

6.3 Training Stage

The standard backpropagation algorithm [4] is used to train each MLP model. Since all experimental target vectors are bipolar, the adopted activation function is the typical bipolar sigmoid [4], which has a range of (-1, 1). Initial weights are generated as random values between -0.25 and 0.25. The learning rate parameter is set to 0.02. The stop criterion for the training algorithm requires that the maximum value of the average squared error be equal to or less than the tolerance indicated within each graph of Figs. 5-7. The training data set consists of 120 pattern images not belonging to the test data set. It
contains the input patterns used to train the MLP model to classify digits extracted from license plates into ten categories. Each category is represented by 12 input patterns.

6.4 Test Stage

The classification rate is calculated by Eq. (6):
Fig. 4: A sample of the 120 degraded patterns used as input data for the MLP training.

Fig. 5: MLP performance (classification rate in % versus tolerance, ×10^-3) using the classical one-per-class approach. The line denotes the epoch quantity evolution.
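The row-major data representation of Section 6.1 and the bipolar sigmoid used in the training stage (Section 6.3) can be sketched as follows; the example image content is an illustrative assumption:

```python
import numpy as np

def to_bipolar_vector(img):
    """Section 6.1: map a 15 x 20 binary image (1 = pattern pixel,
    0 = background) row by row onto a 300-component bipolar vector."""
    return np.where(img.flatten() == 1, 1.0, -1.0)

def bipolar_sigmoid(x):
    """Section 6.3: bipolar sigmoid activation with range (-1, 1),
    f(x) = 2 / (1 + exp(-x)) - 1, equivalent to tanh(x / 2)."""
    return 2.0 / (1.0 + np.exp(-x)) - 1.0

# A hypothetical segmented pattern: a crude vertical stroke
img = np.zeros((15, 20), dtype=int)
img[2:13, 9:11] = 1
v = to_bipolar_vector(img)
assert v.shape == (300,)
assert set(v.tolist()) == {-1.0, 1.0}
```

The bipolar encoding keeps background pixels at -1 rather than 0, so every input component contributes to the weight updates, which is the learning argument quoted from [4] above.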
Fig. 6: MLP performance (classification rate in % versus tolerance, ×10^-2) using OBVs with 16 components as target vectors. The line denotes the epoch quantity evolution.

Fig. 7: MLP performance (classification rate in % versus tolerance, ×10^-2) using OBVs with 64 components as target vectors. The line denotes the epoch quantity evolution.

cr = (c_1 + c_2 + ... + c_N) / N,   (6)

where cr is the classification rate, N is the number of test patterns, and c_i is defined by Eq. (7):

c_i = 1 if p_i = r_i;  c_i = 0 if p_i ≠ r_i,   (7)

where p_i is a classified pattern (output of the MLP model) and r_i is the corresponding category (correct response). The test data set consisted of 1352 images containing the digits 0, 1, ..., 9 extracted from license plates after preprocessing [12][13][14][16]. A trained MLP model was applied to classify those extracted digits into ten categories.

7 Experiments and Results

The experiments used the CBVs and OBVs defined in Section 5 as target vectors. We evaluated the influence of the Euclidean distance-based approach on MLP training and classification performance improvement using the various target vectors. In other words, the experiments compared the proposed Euclidean distance-based approach with the classical one-per-class categorization approach in terms of influence on MLP performance. An MLP model with a topology of 100 hidden neurons and 10 output neurons was trained and tested using CBVs with the one-per-class approach. Figure 5 shows the MLP classification performance applying the one-per-class approach with CBVs as target vectors. To evaluate the influence of using OBVs as target vectors on the MLP input-output mapping process, we performed the following tasks: MLP model training with OBVs as target vectors, for which different topologies according to the OBV sizes (16 or 64 components) were defined.
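Eqs. (6)-(7) amount to the following sketch; the digit labels are illustrative:

```python
import numpy as np

def classification_rate(p, r):
    """Eq. (6): cr = (c_1 + ... + c_N) / N, where c_i = 1 if the
    MLP response p_i equals the correct category r_i and c_i = 0
    otherwise (Eq. (7)). Returned here as a percentage."""
    p, r = np.asarray(p), np.asarray(r)
    c = (p == r).astype(int)
    return 100.0 * c.sum() / len(c)

# Hypothetical responses for five test patterns
print(classification_rate([3, 1, 4, 1, 5], [3, 1, 4, 1, 9]))  # 80.0
```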
Table 2: Relevant results from the graphs of Figs. 5-7

Target vector type | Categorization approach    | Maximum rate (%)
CBV                | One-per-class              | 72.10
OBV-16             | Euclidean distance-based   | 80.00
OBV-64             | Euclidean distance-based   | 81.40

We then applied the Euclidean distance-based approach to the trained MLP models above. The experimental results using OBVs with 16 and 64 components are shown in Figs. 6 and 7, respectively.

8 Discussion

The graph in Fig. 5 shows that a classification rate of 72.1% is provided by the MLP model trained for 3751 epochs with the one-per-class categorization approach. Figures 6-7 show the results of using OBVs as target values. Table 2 presents the highest classification rate per target vector type and categorization approach. We can note that the classification rate increased by 9.3% through the use of the Euclidean distance-based approach and OBVs as target values. Considering the advantage of adopting the Euclidean distance-based categorization approach, we can verify that the MLP performance improved further with the use of OBVs rather than conventional target vectors.

9 Conclusion

A set of approaches was proposed to improve MLP performance on degraded pattern recognition, searching for a more generalized and well-trained model. Basically, the Euclidean distance-based approach was combined with the use of OBVs as new target vectors for MLP training and testing with very degraded pattern data. We compared the experimental results applying the proposed approaches with those applying the classical one-per-class approach. The comparison showed that the proposed approaches considerably improved the MLP performance on degraded pattern recognition tasks, with an increase of 9.3% in classification rate. In summary, the proposed approaches led to a higher classification rate on very degraded pattern data by MLPs compared to the classical one-per-class approach. In terms of degraded pattern recognition performance, we obtained an enhanced MLP input-output
mapping with the proposed Euclidean distance-based approach using OBVs as new target vectors.

References

[1] A. R. Webb and K. D. Copsey. Statistical Pattern Recognition. John Wiley & Sons, third edition, 2011.
[2] J. Chu, I. Moon, and M. Mun. A real-time EMG pattern recognition system based on linear-nonlinear feature projection for a multifunction myoelectric hand. IEEE Transactions on Biomedical Engineering, 53(11):2232-2239, Nov. 2006.
[3] Y. X. Zhang. Artificial neural networks based on principal component analysis input selection for clinical pattern recognition analysis. Talanta, 73(1):68-75, Aug. 2007.
[4] L. V. Fausett. Fundamentals of Neural Networks: Architectures, Algorithms, and Applications. Prentice Hall, Englewood Cliffs, NJ, 1994.
[5] T. G. Dietterich and G. Bakiri. Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research, 2:263-286, 1995.
[6] N. J. Nilsson. Learning Machines. McGraw-Hill, New York, 1965.
[7] T. J. Sejnowski and C. R. Rosenberg. Parallel networks that learn to pronounce English text. Complex Systems, 1:145-168, 1987.
[8] S. Haykin. Neural Networks: A Comprehensive Foundation. Prentice Hall, New Jersey, second edition, 1999.
[9] O. Dekel and Y. Singer. Multiclass learning by probabilistic embeddings. In Advances in Neural Information Processing Systems, volume 15, pages 945-952. MIT Press, 2002.
[10] B. Noble and J. W. Daniel. Applied Linear Algebra. Prentice Hall, Englewood Cliffs, NJ, second edition, 1977.
[11] S. Nomura, K. Yamanaka, O. Katai, H. Kawakami, and T. Shiose. Improved MLP learning via orthogonal bipolar target vectors. Journal of Advanced Computational Intelligence and Intelligent Informatics, 9(6):580-589, Nov. 2005.
[12] S. Nomura, K. Yamanaka, O. Katai, and H. Kawakami. A new method for degraded color image binarization based on adaptive lightning on grayscale versions. IEICE Transactions on Information and Systems, E87-D(4):1012-1020, 2004.
[13] S. Nomura and K. Yamanaka. New adaptive methods applied to binarization of printed word
images. In N. Younan, editor, Proceedings of the Fourth IASTED International Conference on Signal and Image Processing, pages 288-293, Kauai, USA, 2002.
[14] S. Nomura, K. Yamanaka, O. Katai, H. Kawakami, and T. Shiose. A novel adaptive morphological approach for segmenting characters in degraded images. Pattern Recognition, 38:1961-1975, Nov. 2005.
[15] Neuronal networks: The network, 2012. [Online]. Available: http://www.andreas-mielke.de/nn-en-4.html
[16] S. Nomura and K. Yamanaka. New adaptive approach based on mathematical morphology applied to character segmentation and code extraction from number plate images. In Proceedings of the 6th World Multi-Conference on Systemics, Cybernetics and Informatics, volume IX, Florida, USA, Jul. 2002.

Acknowledgment

The authors would like to thank PROPP-UFU (project 72/2010), CAPES (MINTER UFU-IFTM), and FAU-UFU for supporting this work.