Adaptive Multi-layer Neural Network Receiver Architectures for Pattern Classification of Respective Wavelet Images

Pythagoras Karampiperis (1) and Nikos Manouselis (2)

(1) Dynamic Systems and Simulation Laboratory, Dept. of Production Engineering & Management, Technical University of Crete, pythk@dssl.tuc.gr
(2) Advanced eServices for the Knowledge Society Research Unit, Informatics and Telematics Institute (I.T.I.), Center for Research and Technology Hellas (CE.R.T.H.), nikosm@iti.gr

Abstract. A difficult class of signal detection problems is detecting a non-stationary signal in a non-stationary environment with unknown statistics. One of the most interesting approaches uses a neural network to compute the likelihood ratio of the received signal, by training it on different realizations of that signal. The signal detection problem is thus transformed into a pattern classification problem. It remains difficult, however, to determine the optimum internal structure of the neural network so as to achieve maximum receiver performance with minimum network complexity. In this paper, we demonstrate the use of a self-organizing neural network, which optimizes performance by re-configuring its internal structure depending on whether its generalization results are satisfactory. The use of this network structure in the receiver architecture is also compared to a classic neural network approach to the signal detection problem.

1 Introduction

The Time Division Multiple Access (TDMA) scheme used in the Global System for Mobile communications (GSM) network requires the transmission of a training sequence of 26 bits for every 116 information bits, which represents a throughput overhead of about 23%. Several efforts have been made to avoid the use of such training sequences through fixed multi-layer neural network architectures.
In this paper we propose a self-organized multi-layer architecture for the design of receivers for TDMA wireless communications. The new receiver architecture is based on the transformation of the detection problem into an adaptive pattern classification problem. This transformation enables a neural network to function as a powerful tool that learns the underlying dynamics of a time-varying multi-path environment from data representative of that environment. This technique, originally proposed by Haykin et al. [1], can be altered in order

to achieve better performance than conventional Minimum Shift Keying (MSK) receivers for a Rayleigh fading multi-path channel, without the regular transmission of a training sequence.

The neural network architectures most often referenced in the pattern recognition literature [4] are three: the multi-layer perceptron, the Kohonen associative memory and the Carpenter-Grossberg ART network. These networks implement algorithms of the major pattern classification paradigms: the multi-layer perceptron runs a supervised, parameter-learning algorithm whose asymptotic behavior is that of an optimal Bayesian classifier; the Kohonen network performs vector quantization, mapping reference data onto a set of patterns representative of each pattern category; and the Carpenter-Grossberg network is motivated by biological relevance, brought to bear on computer-based pattern recognition, running an unsupervised algorithm with similarities to leader clustering. We mainly focus on the architecture proposed in [1] and enhance it by proposing an evolutionary adaptive receiver based on a self-organizing multi-layered neural network. This architecture differs from those found in the associated literature [1], [2], [4], as it is based on a self-organized network architecture [5], which proves to provide better generalization results in pattern classification problems than similar adaptive architectures.

2 Simulation Parameters

A simplified mobile communication system can be modeled with a digital source, a modulator, the multi-path channel and the receiver under test. The receiver design basically involves three functional blocks: time-frequency analysis, data reduction or feature extraction, and pattern recognition. The aim in simulating the signal waveform for testing is to model the important elements of a signal, without unnecessary complications due to a particular protocol.
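As an illustration, such a simplified chain (source, modulator, multi-path channel, noise) can be sketched as a tapped-delay-line Rayleigh fading channel plus AWGN. The tap delays and powers below are illustrative placeholders, not the actual GSM profile, and all function names are our own:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative tap delays (multiples of 0.1 us) and average powers;
# placeholder values, not the actual GSM urban-area profile.
tap_delays_us = np.array([0.0, 0.2, 0.5, 1.6, 2.3, 5.0])
tap_powers_db = np.array([-3.0, 0.0, -2.0, -6.0, -8.0, -10.0])

fs = 10e6                                    # 10 MHz sample rate (100 ns period)
tap_idx = np.round(tap_delays_us * 1e-6 * fs).astype(int)
tap_pow = 10.0 ** (tap_powers_db / 10.0)

def rayleigh_taps():
    """Draw one realization of the complex tap gains beta_i * exp(j*theta_i)."""
    g = rng.standard_normal(len(tap_pow)) + 1j * rng.standard_normal(len(tap_pow))
    return np.sqrt(tap_pow / 2.0) * g

def channel(x, snr_db=15.0):
    """Apply the tapped-delay-line channel plus additive Gaussian noise to x."""
    h = np.zeros(tap_idx.max() + 1, dtype=complex)
    h[tap_idx] += rayleigh_taps()            # discrete impulse response
    y = np.convolve(x, h)[: len(x)]
    sig_p = np.mean(np.abs(y) ** 2)
    noise_p = sig_p / 10.0 ** (snr_db / 10.0)
    n = np.sqrt(noise_p / 2.0) * (rng.standard_normal(len(y))
                                  + 1j * rng.standard_normal(len(y)))
    return y + n
```

A real simulation would replace the placeholder profile with the prescribed GSM urban-area delays and powers and add the Doppler-shaped time variation.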
The digital modulation scheme under study is a form of Frequency Shift Keying (FSK) called Minimum Shift Keying (MSK). The simulation signal parameters are representative of those in the GSM standard (Gaussian-shaped pre-filtered MSK). The channel used here is in accordance with the GSM channel model. In particular, the multi-path channel is characterized by a time-varying impulse response h(τ, t) given by

h(τ, t) = Σ_i β_i(t) e^{jθ_i(t)} δ[τ − τ_i(t)],    (1)

where β_i(t) and θ_i(t) are the time-varying amplitude and phase of the i-th path arriving at delay τ_i(t). Notice that β_i(t), θ_i(t) and τ_i(t) are in general random variables. However, in order to allow practical simulation, the number of paths is set to be finite in each of the GSM channel models (rural area, hilly terrain and urban area), thereby allowing a tapped-delay-line implementation. More specifically, the channel model consists of L taps (typically L = 12), each of which is determined by a prescribed time delay τ_i(t), average

power P_i, and a Rayleigh-distributed amplitude varying according to a Doppler spectrum S(f). Throughout this paper, we use the urban area GSM model parameters. The channel output is corrupted by additive noise that is assumed to be Gaussian, with zero mean and variance σ². The received signal is then led into the receiver, whose function is to detect the transmitted signal a_k, which is multi-path (frequency-selective) faded and noise corrupted. Since the GSM channel model tap delays are multiples of 0.1 µs, a sample rate of 10 MHz (100 ns sampling period) was used in the simulation. Taking 36 samples per bit yielded a bit period of 3.6 µs, thus a bit rate of about 278 kHz, representative of the GSM bit rate. The GSM system uses a training sequence to characterize the channel impulse response for a Viterbi receiver that considers a group of 5 consecutive bits at a time. Similarly, the receiver described in this paper operates on a sliding window of 5 consecutive bits.

3 Signal Transformation

A noisy received signal can be represented in such a way that the signal components belonging to different classes (e.g. symbol 1 or symbol 0) have representations that are as distinct as possible. The idea used in this simulation is to transform the one-dimensional received signal into a two-dimensional image with time and frequency as coordinates. Since we are considering an FSK signal, the information is conveyed by a change in instantaneous frequency with time, so it is natural to examine time-frequency analysis methods for this transformation. Of the two methods that were studied, the Wigner-Ville transformation and wavelet analysis, the latter was chosen since it produced better receiver performance. The wavelet used in the receiver is the Morlet wavelet, whose computation was carried out using the fast Fourier transform (FFT) algorithm.
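A minimal sketch of this FFT-based computation, assuming the standard frequency-domain form of the Morlet wavelet with centre frequency w0 (the function name and parameter values are our own):

```python
import numpy as np

def morlet_scalogram(x, scales, w0=6.0):
    """Scalogram (squared wavelet amplitude) of signal x via the FFT.

    The Morlet wavelet is applied in the frequency domain: at scale s its
    transform is approximately exp(-(s*omega - w0)^2 / 2) for omega > 0.
    """
    n = len(x)
    X = np.fft.fft(x)
    omega = 2.0 * np.pi * np.fft.fftfreq(n)
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # analytic Morlet wavelet: keep positive frequencies only
        psi_hat = np.exp(-0.5 * (s * omega - w0) ** 2) * (omega > 0)
        coeffs = np.fft.ifft(X * psi_hat)
        out[i] = np.abs(coeffs) ** 2         # squared amplitude = scalogram
    return out
```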
The squared amplitude of the Morlet wavelet transform, known as a scalogram, was used as the overall output of the time-frequency analyzer. Since the wavelet image is highly redundant and considerably large, it is necessary to compress the image so that the design of the pattern classifier is eased, while ensuring that the significant features contained in the image are extracted. Although principal components analysis (PCA) is widely used as a tool for feature extraction, it proves unsatisfactory for the task at hand, since it is not particularly sensitive to changes in the instantaneous frequency, which is a major characteristic of the GMSK (Gaussian MSK) signal. Instead, a similar method is used, referred to as the energy profile. This method computes the energy values for a set of frequencies within the duration of one bit. Specifically, 5 scalogram values, corresponding to scale bins 3 to 7, were used for each time index n in computing the energy profile. Each bit's duration is divided into 4 segments, with each segment being associated with 9 samples. Then, with 5 bits, 4 time segments per bit and 5 frequency bins per bit, we have a total of 5x4x5 = 100 energy values, each one being the result of adding 9 pertinent scalogram values. The motivation behind the use of multiple scalograms in computing the energy profile is to exploit the contextual information contained in a corresponding number of adjacent data bits.
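The reduction described above can be sketched directly from these counts (5 scale bins, 5 bits of 36 samples, 4 segments of 9 samples per bit); the function name is our own:

```python
import numpy as np

def energy_profile(scalogram):
    """Reduce a scalogram to the 100-value energy profile described above.

    scalogram: array of shape (n_scales, 180) covering a 5-bit window
    (36 samples per bit). Scale bins 3..7 (5 bins) are kept; each bit is
    split into 4 segments of 9 samples, and the 9 scalogram values in
    each (bin, segment) cell are summed.
    """
    sel = scalogram[3:8, :]                  # 5 frequency (scale) bins
    # reshape the time axis: 5 bits x 4 segments x 9 samples
    cells = sel.reshape(5, 5, 4, 9)          # (bins, bits, segments, samples)
    energies = cells.sum(axis=-1)            # sum the 9 samples per cell
    return energies.reshape(-1)              # 5 * 5 * 4 = 100 energy values
```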

4 The Network Receiver Architecture

The purpose of pattern classification is to recognize the binary symbols 1 and 0 by classifying the patterns in the respective wavelet images. In previous approaches [1], neural networks were used, consisting of multi-layer perceptrons trained with the back-propagation algorithm. Most of these approaches used static combinations of multi-layer perceptrons. In order to improve the generalization capability of the pattern classifier, we propose a self-organizing, multi-layer neural network capable of adapting to the nature of the problem. Reformulating the signal detection problem as an adaptive pattern classification problem provides improved detection of a non-stationary target signal embedded in a non-stationary background. Pattern classification deals with assigning, using supervised learning, an unknown input pattern to one of several pre-specified classes, based on one or more properties that characterize the given class, as defined in the previous paragraph.

4.1 Network structure

The neural network used is a growing multi-layer perceptron, which begins from a basic structure of one node and one hidden layer and then alters itself until it reaches the optimum structure for the given problem. The self-organization process consists of two phases: a growing one and a shrinking one (Figure 1). In the growing phase of self-optimization, two basic principles must always hold:

- Every hidden layer has the same number of hidden nodes as the rest of the hidden layers.
- There are two ways of growing: horizontal (incrementing the number of hidden layers) or vertical (incrementing the number of hidden nodes).

The growth rule of the network is the optimization of the generalization error. At every step of the growth algorithm, the following potential steps are examined:

1. Calculate the generalization error using the current structure.
2. Calculate the generalization error after horizontal growth.
3. Calculate the generalization error after vertical growth.

Then the following conditions are examined:

- If horizontal growth proves better than both vertical growth and the current structure, grow horizontally and return to the beginning.
- If vertical growth proves better than both horizontal growth and the current structure, grow vertically and return to the beginning.
- If the current structure proves better than both vertical and horizontal growth, optimization stops.
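The growth loop above can be sketched as follows; `train_fn` and `gen_error` are hypothetical callables standing in for the actual training and validation procedures:

```python
def grow_network(train_fn, gen_error, layers=1, nodes=1):
    """Self-organizing growth phase (sketch).

    train_fn(layers, nodes) trains an MLP with `layers` hidden layers of
    `nodes` nodes each; gen_error(model) returns its generalization error.
    Both are assumed to be supplied by the caller.
    """
    while True:
        e_cur = gen_error(train_fn(layers, nodes))         # current structure
        e_h = gen_error(train_fn(layers + 1, nodes))       # horizontal growth
        e_v = gen_error(train_fn(layers, nodes + 1))       # vertical growth
        if e_h < e_v and e_h < e_cur:
            layers += 1                                    # grow horizontally
        elif e_v < e_h and e_v < e_cur:
            nodes += 1                                     # grow vertically
        else:
            return layers, nodes                           # current structure best
```

The returned structure is then handed to the pruning phase.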

Fig. 1. A 2x4 network first grows and is then pruned in order to end up with the best possible architecture (panels: starting network, best symmetric network, final best network).

This is the growing algorithm that starts from the simple one-node, one-hidden-layer perceptron and ends in an MLP network that gives optimum generalization. After the network structure is chosen, pruning techniques are used to remove certain nodes from this structure while minimizing the generalization error. In our application we have used Optimal Brain Damage (OBD) as the pruning technique, but the growing phase of the algorithm clearly does not depend on the chosen pruning algorithm.

4.2 Training algorithm

The network consists of neurons with an activation function of the form Φ(U) = a tanh(bU), and locating the values of the elements of the network requires employing the back-propagation algorithm. The feed-forward error back-propagation (BP) learning algorithm is the best-known procedure for training artificial neural networks (ANNs). BP is based on searching an error surface (error as a function of the ANN weights) using gradient descent for point(s) with minimum error. Each iteration of BP comprises two sweeps: a forward activation to produce a solution, and a backward propagation of the computed error to modify the weights. There has been much research on improving BP's performance. The Extended Delta-Bar-Delta (EDBD) variation of BP attempts to escape local minima by automatically adjusting step sizes and momentum rates through the use of momentum constants. To further reduce the possibility of becoming trapped in a local minimum, we use an extension of EDBD which assumes that every node has a different activation function and every synaptic weight has its own learning rate.
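As a rough sketch of one such per-parameter update, assuming the a·tanh(bU) activation and a single output neuron (the function and dictionary names are our own, not from the paper):

```python
import numpy as np

def neuron_update(x, d, w, a, b, r_w, r_a, r_b, prev):
    """One epoch's corrections for a neuron with y = a * tanh(b * u):
    per-parameter learning rates r_*, with a momentum term (1 - r_*)
    applied to the previous correction stored in `prev`."""
    u = np.dot(w, x)
    t = np.tanh(b * u)
    y = a * t
    e = d - y                                    # output error
    delta = e * a * b * (1.0 - t ** 2)           # local gradient e * phi'(u)
    dw = r_w * delta * x + (1.0 - r_w) * prev["w"]
    da = r_a * e * t + (1.0 - r_a) * prev["a"]   # dy/da = tanh(b*u) = y/a
    db = r_b * e * a * u * (1.0 - t ** 2) + (1.0 - r_b) * prev["b"]  # dy/db
    return dw, da, db
```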
So we consider the following quantities as free parameters in each neuron:

- w_i: the weight of every synaptic connection,
- a, b: the activation function parameters,
- r_{w_i}: the learning rate of w_i,
- r_a: the learning rate of a,

- r_b: the learning rate of b.

Fig. 2. Model for a neuron: the inputs X_0 = 1, X_1, ..., X_m, weighted by w_0, w_1, ..., w_m, are summed into U, which passes through Φ(U) to produce the output Y.

In order to avoid becoming trapped in a local minimum, we adopt a momentum constant equal to (1 − r_x) for every free parameter x of the network. The corrections of the free parameters at epoch n thus become:

Δw_i(n) = r_{w_i} δ X_i + (1 − r_{w_i}) Δw_i(n−1),    (2)

Δa(n) = r_a δ Y / (a Φ'(U)) + (1 − r_a) Δa(n−1),    (3)

Δb(n) = r_b δ U / b + (1 − r_b) Δb(n−1),    (4)

where r_{w_i} (0 ≤ r_{w_i} < 1) is the learning rate parameter of weight w_i, r_a (0 < r_a < 1) is the learning rate parameter of a, and r_b (0 < r_b < 1) is the learning rate parameter of b, per node.

Back-propagation is applied to the training set, with the average squared error as the cost function to be minimized. As a stopping criterion for training we use the generalization error. The average squared error is calculated from:

E_av = (1 / 2TM) Σ_{n=1}^{T} Σ_{k∈C} (D_k(n) − Y_k(n))²,    (5)

where:

- T is the total number of examples in the training set,
- C is the set of all neurons in the output layer,

- M is the number of outputs.

Fig. 3. Parameter values (B value, W value) and learning rates (B learn, W learn) for node [0,0] of the best network (Eb/No = 15), over the network iterations.

Every adjustable (free) network parameter of the cost function has its own learning rate parameter. Defining

R_x(n) = Δx(n) Δx(n−1),    (6)

the learning rates are adapted as follows:

Rule 1. If R_x(n+1) > 0 and R_x(n) > 0, then increase r_x (r_x = r_x + 0.001).
Rule 2. If R_x(n+1) < 0 and R_x(n) < 0, then decrease r_x (r_x = r_x − 0.001).

As an initial value for every r_x we use 0.5; this value can then change from epoch to epoch, for each adjustable parameter. In Figure 3, some network parameters of a specific node and their corresponding learning rates are presented.

4.3 Initial data processing

Input data should be initially processed in three stages (Figure 4):

1. Mean removal: the mean value for each input node is removed, in order to centralize the original data values.
2. Decorrelation: the training-set input data should be uncorrelated, so we apply Principal Components Analysis at this stage.
3. Scaling: normalization is carried out in order to make the covariances of the decorrelated input variables approximately equal.
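The three stages above can be sketched as a single conditioning function; the function name is our own, and the eigenvalue floor is an assumed numerical safeguard:

```python
import numpy as np

def preprocess(X):
    """Three-stage input conditioning: mean removal, PCA decorrelation,
    and scaling to approximately equal covariances.

    X: (n_samples, n_features) training input matrix.
    """
    # 1. Mean removal: centre each input node's values.
    Xc = X - X.mean(axis=0)
    # 2. Decorrelation via PCA: project onto the covariance eigenvectors.
    cov = np.cov(Xc, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    Xd = Xc @ eigvec
    # 3. Scaling: divide each component by its standard deviation.
    Xs = Xd / np.sqrt(np.maximum(eigval, 1e-12))
    return Xs
```

After this pipeline the covariance matrix of the transformed inputs is approximately the identity.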

Fig. 4. Input data in the initial conditions and at the three subsequent stages of mean removal, decorrelation and scaling.

The training data set is initially split into estimation and validation subsets such that 70% corresponds to training data and 30% to validation data. Once the growth algorithm has been executed, an optimal network structure is chosen in which each hidden layer contains the same number of nodes as all the others. It then becomes possible to calculate the number W of free parameters and to re-define the validation subset according to the following formula:

V = [1 − (√(2W − 1) − 1) / (2(W − 1))] T_f,    (7)

where V is the validation set and T_f the full set of the training data. Only after this split do we apply the pruning techniques, in order to increase generalization on this new validation set.

4.4 Initialization

The synaptic weights w_i of each neuron are drawn from a uniform distribution with mean μ_w = 0 and variance σ_w² = 1/m, where m is the number of synaptic connections of the neuron. The other initial values are: learning rate r_x = 0.5 for every adjustable network parameter x, parameter a = 1.7159 and parameter b = 2/3, for every node.
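A sketch of this initialization (the function name and return structure are our own): a uniform distribution on (−c, c) has variance c²/3, so c = √(3/m) gives the required variance 1/m.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_neuron(m):
    """Initial free parameters for a neuron with m synaptic connections:
    weights uniform with mean 0 and variance 1/m, a = 1.7159, b = 2/3,
    and learning rate 0.5 for every adjustable parameter."""
    half_width = np.sqrt(3.0 / m)            # Var of U(-c, c) is c^2/3 = 1/m
    w = rng.uniform(-half_width, half_width, size=m)
    return {"w": w, "a": 1.7159, "b": 2.0 / 3.0, "r": 0.5}
```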

Fig. 5. The bit-error rate (Eb/No = 15) for several network structures with different numbers of hidden layers.

Fig. 6. Testing error of the best network: the bit-error rate (Eb/No = 15) over the network iterations that transform the initial network into the best network found.

Fig. 7. Performance of the proposed network (Nnet), the Haykin et al. network (Haykin) and the PMSK receiver: bit-error rate versus Eb/No.

5 Results

We evaluated the architecture based on the proposed receiver structure against the original neural network receiver structure tested by Haykin et al. in [1]. In Figure 5, the performance for several different numbers of hidden layers is shown, measured in bit-error rate. Figure 6 depicts the bit-error rate for the best network as a function of the number of iterations of the training process of that network. In Figure 7 the bit-error rate for different values of Eb/No is presented for the Haykin et al. network, the proposed structure and the classic PMSK receiver. It is clear that the use of the proposed network architecture enhances the receiver structure originally proposed by Haykin et al. [1] by substantially improving its performance. Moreover, Figure 7 shows that when the value of Eb/No exceeds 25, performance becomes comparable even to that of the conventional PMSK receiver.

6 Conclusions

The transformation of a signal detection problem into a pattern classification problem is a technique often found in the literature, and one that provides very good results. In this paper we studied the neural-network-based receiver structure proposed by Haykin et al. and substituted the classic MLP architecture with a specially designed adaptive architecture [5]. The results have shown that this self-organizing architecture greatly improves performance in such pattern classification problems, and especially in the classification of respective wavelet images. Future research concerns addressing other pattern classification problems with the proposed architecture and extending it to other fields in which neural networks are used, such as time-series prediction.

References

1. S. Haykin, J. Nie, B. Currie, "Neural network-based receiver for wireless communications", Electronics Letters, Vol. 35, Issue 3, February 1999, pp. 203-205.
2. S. Haykin, D.J. Thomson, "Signal detection in a nonstationary environment reformulated as an adaptive pattern classification problem", Proceedings of the IEEE, Vol. 86, Issue 11, November 1998, pp. 2325-2344.
3. D.T. Pham, S. Sagiroglu, "Training multilayered perceptrons for pattern recognition: a comparative study of four training algorithms", International Journal of Machine Tools & Manufacture, Vol. 41, 2001.
4. A. Mitiche, M. Lebidoff, "Pattern classification by a condensed neural network", Neural Networks, Vol. 14, 2001, pp. 575-580.
5. P. Karampiperis, N. Manouselis, T. Trafalis, "Architecture selection for neural networks", to appear in Proc. of IEEE World Congress on Computational Intelligence, Hawaii, May 2002.