Analysis of Learning Paradigms and Prediction Accuracy using Artificial Neural Network Models

Poornashankar 1 and V.P. Pawar 2

Abstract: The proposed work concerns the prediction of tumor growth through ANN algorithms alongside a statistical approach. These algorithms can uncover unknown and unexpected relations by exploring the data. This paper evaluates the efficacy of neural network algorithms and their performance with respect to various learning parameters for the growth of an epidermoid carcinoma tumor in mice. The experiment analyzes the prediction accuracy and stability of neural networks using Feed Forward, Back Propagation and Self-Organizing Map models, applying different learning rules and activation functions. From the analyses it is observed that, among the three network models, the MLP with Back Propagation, which stabilized in the smallest number of epochs, performs best. Brain-computer interaction through the analysis of algorithms for predicting tumor growth is demonstrated.

Keywords: Artificial Neural Network (ANN), Feed Forward, Kohonen Self-Organizing Map (SOM), Back Propagation, Epochs.

1. INTRODUCTION

Brain-computer interaction through neural networks is a very challenging area of research. Neural network algorithms are used for the correct analysis of data sets. The proposed research work applies neural network algorithms to the prediction of tumor growth in mice. Combining data mining with neural networks is a very effective approach for predicting tumor growth. The

Journal of Neural Systems Theory and Applications, 1(1) January-June 2011 41
combination has proven its predictive power through comparison with other statistical techniques using real data sets [1]. Neural networks use a set of processing elements analogous to neurons in the brain. These elements are interconnected in a network that can identify patterns in data [9]. Algorithms based on neural networks have many applications in knowledge engineering, especially in modeling human decision-making. The utilization of modern methods based on neural network theory will be pursued. The performance of the popular neural networks will be analyzed with respect to learning rate, momentum, error tolerance, number of epochs and activation function.

2. BACKGROUND

Some cancerous tissues can be very aggressive, so it is very important to identify tumors as early as possible. 10-30% of visible abnormalities are usually not detected due to technical or human error. Early detection has been proven to greatly reduce mortality among cancer patients [8]. As evidence, 80% of the cases detected by the American Society were still in an early stage, yet the mortality among them was only 3% in the year 2006, thanks to early detection and improved treatment. Hence it is necessary to computerize the cancer diagnosis system. Prediction of tumor growth through an ANN helps to determine the severity and the rate of growth of cancer tissues. Neural networks are commonly used for predicting tumor growth in the medical field, as they can handle large databases and have the ability to learn and to stabilize after training [1]. An artificial neural network was constructed with the tumor growth data of 4 mice and tested with three different algorithms, viz. Feed Forward networks, Multilayer Back Propagation and Kohonen Self-Organizing Maps. The efficacy of the different neural algorithms has been analyzed with respect to learning parameters and other statistics.
The results obtained reveal that the Multilayer Back Propagation algorithm is better than the other two algorithms for this set of data with respect to time, number of iterations and accuracy.
3. COMPARISON OF NEURAL NETWORK ALGORITHMS

3.1 Simple Feed Forward Networks

Feed forward networks are used in situations where all of the information bearing on a problem can be presented to the neural network at once. Feed forward networks have layers of processing elements. A layer of processing elements makes independent computations on the data that it receives and passes the results to another layer. The next layer may in turn make its own independent computations and pass on the results to yet another layer. Finally, a subgroup of one or more processing elements determines the output from the network. The data flows from the input layer through zero, one or more succeeding hidden layers and then to the output layer. This defines the connection topology and the data flow.

3.2 Multilayer Feed Forward Neural Networks

The multilayer perceptron (MLP) is one of the most widely implemented neural network topologies. Multilayer perceptrons represent the most prominent and well-researched class of ANNs in classification, implementing a feed forward, supervised and hetero-associative paradigm. For static pattern classification, the MLP with two hidden layers is a universal pattern classifier. In other words, the discriminant functions can take any shape, as required by the input data clusters. Moreover, when the weights are properly normalized and the output classes are normalized to 0/1, the MLP achieves the performance of the maximum a posteriori receiver, which is optimal from a classification point of view. In terms of mapping abilities, the MLP is believed to be capable of approximating arbitrary functions. This has been important in the study of nonlinear dynamics and other function mapping problems. MLPs are normally trained with the backpropagation algorithm. In fact, the renewed interest in ANNs was in part triggered by the existence of backpropagation. The backpropagation rule propagates the errors through the network and allows adaptation of the hidden PEs.
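As a rough illustration of the backpropagation rule described above, the sketch below trains a tiny one-hidden-layer MLP in NumPy on a toy curve. The network size, learning rate, epoch count and data are assumptions made for the example and are unrelated to the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy data: a smooth monotone "growth" curve (illustrative only).
X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = X ** 2

# Tiny MLP: 1 input -> 4 hidden (sigmoid) -> 1 linear output.
W1 = rng.normal(0.0, 0.5, (1, 4)); b1 = np.zeros(4)
W2 = rng.normal(0.0, 0.5, (4, 1)); b2 = np.zeros(1)
lr = 0.5  # learning rate (step size)

for epoch in range(3000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = h @ W2 + b2
    # Backward pass: propagate the output error back to the hidden PEs.
    grad_out = (out - y) / len(X)
    grad_h = (grad_out @ W2.T) * h * (1.0 - h)
    # Gradient-descent weight updates.
    W2 -= lr * h.T @ grad_out; b2 -= lr * grad_out.sum(0)
    W1 -= lr * X.T @ grad_h;   b1 -= lr * grad_h.sum(0)

pred = sigmoid(X @ W1 + b1) @ W2 + b2
mse = float(np.mean((pred - y) ** 2))  # small after training
```

The `grad_h` line is the essence of the backpropagation rule: the output error is pushed back through the output weights and scaled by the sigmoid derivative `h * (1 - h)`, which is what allows the hidden processing elements to adapt.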
Two important characteristics of the multilayer perceptron are: its nonlinear processing elements (PEs), which have a nonlinearity that must be smooth (the logistic function and the hyperbolic tangent are the most widely used); and their massive interconnectivity (i.e. any element of a given layer feeds all the elements of the next layer). The multilayer perceptron is trained with error correction learning, which means that the desired response for the system must be known. In pattern recognition this is normally the case, since we have our input data labeled, i.e. we know which data belongs to which experiment. Multilayer neural networks are powerful tools used for a wide range of purposes, such as the diagnosis of malignant breast cancer from digital mammograms [1], resolving multi-font character confusion [4], and modeling human decision making [7].

3.3 The Kohonen Self-Organizing Map

The connection weights are assigned small random numbers at the beginning. The incoming input vectors presented by the sample data are received by the input neurons. The input vector is transmitted to the output neurons via the connections. The output neurons with the weights most similar to the input vector become active. In the learning stage, the weights are updated following Kohonen's learning rule. The weight update occurs only for the active output neurons and their topological neighbours. The neighborhood starts large and slowly decreases in size over time. Because the learning rate is reduced towards zero, the learning process eventually converges. After the learning process, similar items activate the same neuron, so the SOM divides the input set into subsets of similar records. The SOM is a dynamic system which learns abstract structures in a high-dimensional input space using a low-dimensional space for representation. A properly designed SOM can be used to organize high-dimensional clusters in a low-dimensional map.
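Kohonen's update described above can be sketched in a few lines. The following is a minimal 1-D SOM in NumPy; the node count, decay schedules and the two-cluster toy data are assumptions made purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Map 2-D inputs onto a line of 10 output neurons.
n_nodes = 10
W = rng.random((n_nodes, 2))  # small random initial connection weights

# Toy data: two clusters of similar records (illustrative only).
data = np.vstack([rng.normal(0.2, 0.05, (50, 2)),
                  rng.normal(0.8, 0.05, (50, 2))])

n_epochs = 200
for epoch in range(n_epochs):
    lr = 0.5 * (1.0 - epoch / n_epochs)                 # learning rate decays toward zero
    radius = max(1, int(3 * (1.0 - epoch / n_epochs)))  # neighbourhood shrinks over time
    for x in rng.permutation(data):
        # The output neuron with weights most similar to the input becomes active.
        bmu = int(np.argmin(np.linalg.norm(W - x, axis=1)))
        # Kohonen's rule: update only the active neuron and its neighbours.
        for j in range(max(0, bmu - radius), min(n_nodes, bmu + radius + 1)):
            W[j] += lr * (x - W[j])

# After learning, similar inputs activate the same neuron, and the
# two clusters land on different regions of the map.
bmu_a = int(np.argmin(np.linalg.norm(W - data[0], axis=1)))
bmu_b = int(np.argmin(np.linalg.norm(W - data[-1], axis=1)))
```

The shrinking neighbourhood and decaying learning rate mirror the convergence argument in the text: early epochs organize the map globally, late epochs only fine-tune individual neurons.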
Using the SOM, research has been conducted on the diagnosis of mammographic images [1], pattern analysis and machine intelligence [7], and a convolutional neural network classifier for spatial-domain and text images [8].

4. EXPERIMENT

The tumor growth of an epidermoid carcinoma in 4 mice and one patient was recorded over various time-scales, and growth was predicted up to 10 weeks ahead by various neural network algorithms. The objective of this research is to analyze the prediction accuracy of neural networks using feed forward, back propagation and self-organizing maps, applying different learning rules and activation functions. In the first step, the construction of the network was determined, with its structure and weights, and a large number of training samples was selected. Training is done for a specified number of epochs, preferably a large number, until the MSE for the training data approaches zero or reaches a stable point. Different threshold functions are selected to scale down the activations and map them into a meaningful output across each layer's operations. The sigmoid function, a step function and hyperbolic functions have been taken as activation functions and tested with each algorithm. The Performance Measures access point of the Error Criterion component provides four values that can be used to measure the performance of the network for a particular data set. The number of epochs specifies how many iterations (over the training set) will be done if no other criterion kicks in. The Error Change contains the parameters used to terminate training based on the mean squared error. The larger the number of epochs, the more stable the network will be. Training also terminates as a function of the desired error level. The actual data and the expected data are compared using the Mean Squared Error (MSE), Normalized MSE, Mean Absolute Error, Minimum and Maximum Absolute Error and the linear correlation coefficient.
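The comparison measures listed above are straightforward to compute. The NumPy sketch below uses made-up actual/predicted values purely for illustration; in particular, normalizing the MSE by the variance of the actual data is an assumption here, since tools differ in the exact normalization:

```python
import numpy as np

# Toy actual-vs-predicted pair (illustrative values, not the paper's data).
actual    = np.array([100.0, 150.0, 220.0, 310.0])
predicted = np.array([ 95.0, 160.0, 215.0, 300.0])

err = predicted - actual
mse     = float(np.mean(err ** 2))                     # Mean Squared Error
nmse    = mse / float(np.var(actual))                  # Normalized MSE (by variance of actual)
mae     = float(np.mean(np.abs(err)))                  # Mean Absolute Error
min_abs = float(np.min(np.abs(err)))                   # Minimum Absolute Error
max_abs = float(np.max(np.abs(err)))                   # Maximum Absolute Error
r       = float(np.corrcoef(actual, predicted)[0, 1])  # linear correlation coefficient
```

A correlation coefficient near 1 together with a small MSE indicates that the predicted series tracks the actual series closely, which is the criterion used in the experimental results below.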
5. TOOLS AND METHODOLOGY

This experiment has been conducted using the NeuroSolutions tool, version 5.05. This tool can simulate all the neural network topologies using the learning algorithms and generate modular networks. It has the capability to include more neurons and hidden layers, yielding fast, double-precision calculations. It supports both Recall and Learning networks, using a simple protocol for sending the input data and retrieving the network response. In order to analyze the learning parameters and the efficacy of the different neural models, the following steps were followed:

[1] The appropriate neural network algorithm was identified for the prediction task.
[2] The input data file containing the training data, with the appropriate columns, was specified.
[3] The input variables and predictor variables were identified for the data analysis.
[4] The validation procedure was selected for training the network. This procedure identifies the error while training on the data.
[5] Testing was performed to compare the network output with the desired output.
[6] The required exemplars and hidden layers were selected for processing the network.
[7] The number of processing elements and the learning rules were given as input.
[8] The learning parameters, such as step size, momentum and number of epochs, were entered to yield the desired output from the network.
[9] The weights of the processing elements were updated.
[10] The termination criteria for the network were selected, based either on the number of epochs or on the Mean Squared Error (MSE) level, until the network stabilizes.
[11] The performance of the network was analyzed with respect to the learning parameters.
[12] The data predicted using these models and the outcomes of the projected results were plotted on a graph.
[13] The robustness and completeness of every algorithm was analyzed through a comparative statement.

6. EXPERIMENTAL RESULT

The accuracy of the different neural network models has been compared and analyzed based on the MSE and the correlation coefficient between the actual and the expected outputs, to see how close the predictions are to the true outcomes.

Epochs
Mouse No.    Axon        MLP Axon    SOM Tanh
5821         210         71          999
5854         359         88          999
5873         283         104         999
5894         426         76          999

Minimum MSE
Mouse No.    Axon        MLP Axon    SOM Tanh
5821         0.013193    0.013193    0.000106
5854         0.003891    0.003891    0.000301
5873         0.002395    0.002395    0.000129
5894         0.012953    0.012953    0.001290

Normalized MSE
Mouse No.    Axon        MLP Axon    SOM Tanh
5821         0.065526    0.065526    0.000527
5854         0.023482    0.023482    0.001813
5873         0.012408    0.012408    0.000668
5894         0.078197    0.078197    0.007773
Mean Absolute Error
Mouse No.    Axon        MLP Axon    SOM Tanh
5821         104.8796    104.8796    8.327457
5854         78.5534     78.5534     25.38266
5873         46.60806    46.60806    12.30266
5894         146.6102    146.6102    44.63116

Correlation Coefficient
Mouse No.    Axon        MLP Axon    SOM Tanh
5821         0.974091    0.974091    0.999790
5854         0.990774    0.990774    0.999246
5873         0.995560    0.995560    0.999768
5894         0.985308    0.985308    0.996409

[Figure: charts of Epochs, Minimum MSE and Normalized MSE for the three models]
[Figure: charts of Mean Absolute Error for the three models]

For the class of supervised learning, there are three basic decisions that need to be made: the choice of the error criterion, how the error is propagated through the network, and what constraints (static or across time) are imposed on the network output.

7. DISCUSSION

From the above research it can be concluded that the final output of an ANN is affected by the network architecture, the learning algorithm and other parameters, as well as by the input. Feed forward, being the simplest of all the ANN algorithms, stabilized well and yielded good results, but took more time. An MLP with back propagation gives good results in prediction and classification; the main task is to find the best architecture, number of hidden layers and distribution function. The initial weights are important, as they affect the network's ability and performance. Kohonen's SOM takes high-dimensional input,
clusters it, and retains some topological order of the output with more iterations.

7.1 Feed Forward Network

Generalized feed forward networks have connections which can jump over one or more layers. A feed forward network has information flowing in the forward direction; the state of each neuron is calculated by summing the weighted values that flow into it. The weights are usually determined by a training algorithm which adjusts them to achieve the desired response. This feed forward network was generated and trained with three transfer functions, viz. axon, hyperbolic and sigmoid, and tested with different learning rules such as step and momentum. Among the three approaches, the values of MSE, Normalized MSE and Mean Absolute Error were lowest for the feed forward network constructed with the axon transfer function and the step learning rule, indicating a near match of the actual and predicted outputs, supported by the value of the correlation coefficient.

7.2 Multilayer Perceptron with Back Propagation

A multilayer perceptron using back propagation can solve any problem that a generalized feed forward network can solve, but it may require hundreds of times more training epochs than a generalized feed forward network containing the same number of processing elements. The stability of the MLP network with back propagation and the precision of its output were observed using the axon, sigmoid and hyperbolic transfer functions with the step and momentum learning rules. Among the three different approaches, the values of MSE, Normalized MSE and Mean Absolute Error were minimum
indicating a near match of the actual and predicted outputs, supported by the value of the correlation coefficient, for the MLP network constructed with the axon transfer function and the step learning rule.

7.3 Kohonen's Self-Organizing Maps

This network's key advantage is the clustering produced by the SOM, which reduces the input space to representative features through a self-organizing process. The underlying structure of the input space is thereby kept while the dimensionality of the space is reduced. The main advantage of these networks is that they are easy to use and can approximate any input/output map. The key disadvantages are that they train slowly and require a lot of training data (typically three times more training samples than network weights). A SOM network was constructed with the hyperbolic transfer function and the step learning rule. The minimum MSE was reached only at the 1000th epoch; the network needs a larger number of epochs to become stable.

8. CONCLUSION

RDBMS tools are limited by their structure, whereas an ANN offers significant improvement in a dynamic environment due to its learning capability. This research reveals the prediction capacity of artificial neural networks using carcinoma data. The accuracy of the different neural network models has been compared and analyzed based on the MSE and the correlation coefficient between the actual and the expected outputs, to see how close the predictions are to the true outcomes. All of the analyses were performed using the cancer data; there was no significant difference among the results, but the MLP's accuracy seems to be better than that of the other two tested network
models (because of the smaller number of epochs), though further optimization of the MLP is required to obtain encouraging results and improve the performance of the network. The performance of the SOM networks was not satisfactory, and optimization of the training could not be achieved within the specified number of epochs; the results of the pilot study were not encouraging, so this approach is not considered feasible. When neural networks are used in a data warehouse, the output of the process is a trained model which can be used to retrieve various patterns in real time. Based on the training and testing of the proposed model, it can be argued that artificial neural networks can mine the data set with predictive accuracy and improved performance. Neural networks have been shown to be particularly useful in solving problems where traditional artificial intelligence techniques involving symbolic methods have failed or proved inefficient.

REFERENCES

Research Papers

[1] Comparing Artificial Neural Networks to Other Statistical Methods for Medical Outcome Prediction, by Harry B. Burke, M.D., Ph.D., David B. Rosen, Ph.D., and Philip H. Goodman, M.D. IEEE/1994/(pp. 2213-2216).
[2] A Neural Network Made of a Kohonen's SOM Coupled to a MLP Trained Via Backpropagation for the Diagnosis of Malignant Breast Cancer from Digital Mammograms, by T. C. S. Santos-André and A. C. Roque da Silva. IEEE/1999/(pp. 3647-3650).
[3] Some Extensions of a New Method to Analyze Complete Stability of Neural Networks, by Mauro Forti. IEEE/2002/(pp. 1230-1238).
[4] The Training Strategy for Creating Decision Tree, by Zhi-Bo Liu. IEEE/2003/(pp. 3238-3243).
[5] Self-Organized Formation of Topologically Correct Feature Maps, by T. Kohonen. Biological Cybernetics/1982/43/(pp. 59-69).
[6] The Self-Organizing Map, by T. Kohonen. Proceedings of the IEEE/1990/78(9)/(pp. 1464-1480).
[7] Neural Networks in Data Mining, by A. Vesely. Agricultural Economics/2003/(pp. 427-431).
[8] Computerized Breast Cancer Diagnosis with Genetic Algorithms and Neural Network, by Afzan Adam and Khairuddin Omar.

Books

[9] Knowledge Discovery Through Self-Organizing Maps: Data Visualization and Query Processing, Knowledge and Information Systems 4(1), by Wang S. and Wang H., January 2002.
[10] Data Mining, by Pieter Adriaans and Dolf Zantinge.
[11] Data Mining with Neural Networks, by Joseph P. Bigus.
[12] Data Mining Techniques, by Arun K. Pujari.
[13] The Elements of Statistical Learning, by Trevor Hastie, Robert Tibshirani and Jerome Friedman.
[14] Data Mining with Computational Intelligence, by Lipo Wang and Xiuju Fu.
[15] Introduction to Artificial Neural Systems, by Jacek M. Zurada.