Automatic Digital Modulation Classification Using Genetic Programming with K-Nearest Neighbor

The 2010 Military Communications Conference - Unclassified Program - Waveforms and Signal Processing Track

Muhammad Waqar Aslam, Zhechen Zhu and Asoke K. Nandi
Department of Electrical Engineering & Electronics, The University of Liverpool, Brownlow Hill, Liverpool, L69 3GJ, U.K.
{m.w.aslam, z.zhu, a.nandi}@liverpool.ac.uk

Abstract

Automatic modulation classification is an intrinsically interesting problem with various civil and military applications. A generalized digital modulation classification algorithm is developed and presented in this paper. The proposed algorithm uses Genetic Programming (GP) with K-Nearest Neighbor (K-NN) and is used to identify BPSK, QPSK, 16QAM and 64QAM modulations. Higher order cumulants are used as input features for the algorithm. A two-stage classification approach is used to improve the classification accuracy. The high performance of the method is demonstrated using computer simulations and in comparisons with existing methods.

Keywords - Automatic digital modulation classification, Genetic Programming, K-nearest neighbor, Higher order cumulants, Software defined radio, Detection and estimation

I. INTRODUCTION

Automatic modulation classification (AMC) is an intermediate step between signal detection and demodulation. It is a rapidly evolving area in various digital signalling systems currently developed or planned for civilian and military communication applications [1]. With the recent developments in software-defined radio (SDR) [2]-[4], automatic modulation classification has gained more attention than ever. A software implementation is generally more flexible than hardware dedicated to a specific task. The idea of software radio is to have a transceiver which can dynamically adapt to the communications channel and user applications.
An automatic modulation classifier can be used at the front end of an SDR, allowing it to handle multiple modulation types. Another emerging technology for dynamic spectrum access is cognitive radio [5]. The idea in cognitive radio is to allow a secondary user to reuse or share radio spectrum originally allocated to a primary user when it is underutilized by the primary user. To avoid interference with the primary user, the secondary user should have accurate knowledge of the signal used by the primary user. For this reason the performance of cognitive radio can be greatly enhanced by using a reliable modulation classification scheme [5]-[7].

A comprehensive survey of the modulation classification techniques employed in the literature has been presented in [8]. The first two steps of the demodulation process are signal preprocessing and modulation classification. For the second step two types of approaches have been used: the maximum likelihood (ML) based approach [9]-[11] and the pattern recognition (PR) approach [12]-[21]. The former treats AMC as a hypothesis testing problem and the decision is based on likelihood ratio tests. The fundamental assumption made in this approach is that the conditional probability density functions of the received signals are available. It presents an attractive solution in terms of minimizing the probability of false classification but suffers from computational complexity, which restricts its applicability to cases where the number of modulation schemes is limited. Hong and Ho [9] used the Bayes method to classify BPSK and QPSK without a priori information about the received signal level. Wong and Nandi [10] presented the idea of a minimum distance classifier to reduce the complexity of the ML classifier. They also used a blind source separation algorithm to rectify the carrier phase offset problem.
Wei and Mendel [11] also used the ML approach for classification of digital amplitude-phase modulations.

In the PR approach the modulation classification system is composed of two subsystems. The first is a feature extraction subsystem, which extracts some key features from the received signal. The second is a pattern recogniser, which processes those features and determines the modulation type of the received signal. Swami and Sadler [12] proposed a method for digital modulation classification using fourth order cumulants. Their method was robust to the presence of carrier phase and frequency offsets. Xi and Wu [13] used higher order statistics for blind channel estimation and pattern recognition. They presented results of modulation classification in the presence of a fading channel. Headley, Reed and Silva [14] used a two-stage process for AMC. In the first stage, local radios make a decision, which is then sent to a fusion centre that makes a global decision about the modulation type. Wong and Nandi [17] used artificial neural networks (ANN) and genetic algorithms for recognition of various digital modulation types. They used two types of ANNs: multi-layer perceptron (MLP) and resilient backpropagation (RPROP). They also used the Bayes method with higher order cumulants for classification of BPSK, QPSK, 16QAM and 64QAM [18]. The results were benchmarked against ML and support vector machine (SVM) classifiers. Dobre, Bar-Ness and Su [21] presented an algorithm based on higher order cyclic cumulants for automatic recognition of QAM signals.

Muhammad Waqar Aslam thanks the University of Azad Jammu & Kashmir, Pakistan, for the financial support.
978-1-4244-8179-8/10/$26.00 ©2010 IEEE

In this paper we propose a new machine-learning-based method using genetic programming (GP) for the classification of BPSK, QPSK, 16QAM and 64QAM. Machine learning has

been used in the past for AMC but GP has not previously been tried for this problem. Wong and Nandi [17] used a neural network and a genetic algorithm for AMC. Zhang, Ciesielski and Andreae [30] used GP for multi-class object detection. Zhang, Jack and Nandi [31] used GP and KNN for a multi-class classification problem, and we have adopted part of this approach in this research. Zhang and Nandi [32] also proposed a method using GP for fault classification in machines. They proposed the idea of bundled-gp for feature selection.

In this study we use a combination of GP and K-Nearest Neighbor (KNN). Fitness evaluation of individuals in GP is done using KNN. A two-stage classification approach is used. A tree is generated at the first stage for classifying the four modulation types into three groups, and then another tree is generated for classification of the remaining modulations. The GPLab toolbox has been used for all the training and test experiments (GPLab is available at gplab.sourceforge.net).

The paper is organized as follows. Section II gives the signal model of the received signal. The basic structure of GP and the complete model of the classifier are explained in Section III. Details of experimental results and simulations are given in Section IV. Section V discusses and compares our results with other results in the literature, and conclusions are drawn in Section VI.

II. SIGNAL MODEL

For any modulated signal that is transmitted, the baseband waveform at the receiver can be written as

$$ y(n) = A e^{j(2\pi f_o n T + \theta_n)} \sum_{l=-\infty}^{\infty} x(l)\, h(nT - lT + \epsilon_T T) + g(n) \tag{1} $$

where $y(n)$ is the complex baseband envelope of the received signal, $x(l)$ is the input symbol sequence, $A$ is the unknown amplitude, $f_o$ is the constant carrier frequency offset, $T$ is the symbol spacing, $\theta_n$ is the phase jitter, $h(\cdot)$ represents the residual baseband channel effects, $\epsilon_T$ is the timing error, and $g(n)$ is additive white Gaussian noise (AWGN).

III. METHOD

A. Genetic Programming

Genetic Programming is a machine learning methodology for developing computer programs that is inspired by biological evolution [22]. In most classification applications, these programs are expressions represented by tree structures. The tree structure, as shown in Fig. 1, has a single output computed from the arrangement of inputs at the terminals and functions at the nodes. This single output is used for classification in different ways according to the classification strategy.

The entire GP process starts with a randomly generated initial generation. The evolution of these tree structures is directed by the fitness evaluation, which measures the performance of a tree in solving a specific problem. During reproduction, trees with better fitness are selected for different operations, including crossover and mutation, to generate new trees. Crossover takes two parent trees, selects one branch from each, and swaps the selected branches to make two new child trees. An example of the crossover operation is shown in Fig. 1. Mutation, on the other hand, takes only one parent tree and one branch from it. Instead of using an existing branch from another tree, mutation replaces the original branch with a randomly generated new branch. With such a reproduction approach, the new trees are expected to inherit some of the useful traits from the last generation and to achieve better fitness through gradual partial changes. At the end of the evolution, the best tree produced during the whole evolution process is selected as the final solution.

Fig. 1. Crossover operation with two parent trees and two child trees.

GP has been applied to some classification problems [23]-[26]. In [23] a survey is given on the application of genetic programming for classification purposes.
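The crossover and mutation operations described in Section III-A can be illustrated with a minimal Python sketch. It is not the GPLab implementation: the nested-list tree encoding, the helper names (`subtree_paths`, `replace_subtree`, `random_tree`) and the 0.3 early-termination probability are assumptions made only for illustration.

```python
import random

# Assumed encoding: a tree is a terminal name (str) or a list
# [function_name, child_1, child_2, ...].

def subtree_paths(tree, path=()):
    """Return the index path of every node in the tree, root included."""
    paths = [path]
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            paths.extend(subtree_paths(child, path + (i,)))
    return paths

def get_subtree(tree, path):
    """Follow an index path down to the node it addresses."""
    for i in path:
        tree = tree[i]
    return tree

def replace_subtree(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    copy = list(tree)
    copy[path[0]] = replace_subtree(tree[path[0]], path[1:], new)
    return copy

def crossover(parent_a, parent_b, rng):
    """Pick one branch in each parent and swap them, giving two children."""
    path_a = rng.choice(subtree_paths(parent_a))
    path_b = rng.choice(subtree_paths(parent_b))
    branch_a = get_subtree(parent_a, path_a)
    branch_b = get_subtree(parent_b, path_b)
    return (replace_subtree(parent_a, path_a, branch_b),
            replace_subtree(parent_b, path_b, branch_a))

def random_tree(rng, terminals, functions, depth):
    """Grow a random tree; functions is a list of (name, arity) pairs."""
    if depth == 0 or rng.random() < 0.3:  # 0.3 is an arbitrary choice
        return rng.choice(terminals)
    name, arity = rng.choice(functions)
    return [name] + [random_tree(rng, terminals, functions, depth - 1)
                     for _ in range(arity)]

def mutate(parent, rng, terminals, functions, depth=2):
    """Replace one randomly chosen branch with a freshly generated one."""
    path = rng.choice(subtree_paths(parent))
    return replace_subtree(parent, path,
                           random_tree(rng, terminals, functions, depth))
```

Because crossover only swaps branches, the two children together contain exactly as many nodes as the two parents, and neither parent is modified.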
A previous study [27] by Loveard and Ciesielski suggested that GP has two advantages in classification. First, the more time is given for evolution, the better the performance that can be achieved. Second, because of the randomness in parent selection and reproduction, different runs can generate different final solutions, which makes GP classifiers well suited to a voting strategy to improve performance.

In this paper, GP is used as a feature generator for the K-Nearest Neighbor classifier. During program evolution, different tree structures are formed from different combinations of the existing high order cumulant features, and each tree outputs a single value that is used as a new feature. While one can simply use the existing features for KNN classification, a well-chosen combination of selected features is expected to give better performance. GP automates this complex selection process. Once GP returns the best combination of features in the form of a tree, the tree is tested with the KNN classifier. Any other classifier could also be used for testing the tree, but KNN has been chosen here because of its simplicity. In addition, since KNN is already used in fitness evaluation, it can be reused for testing the tree without complicating the process.

B. Feature extraction

The pattern recognition approach has been used for AMC in this paper. Fourth and sixth order cumulants of received signals have been used as the underlying features before.
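To make the idea of a tree acting as a feature generator concrete, the following Python sketch evaluates an expression tree over a dictionary of cumulant features and returns one complex value. The nested-list encoding, the `evaluate` helper and the reduced function set are assumptions for illustration; in particular, returning 0 from the protected logarithm for a zero input is an assumed behaviour, not something the paper specifies.

```python
import cmath

def protected_log(x):
    # Assumed behaviour: a zero input is passed through as 0 instead of
    # raising an error; any other input gets the complex natural log.
    return 0j if x == 0 else cmath.log(x)

# A small, illustrative subset of possible node functions.
FUNCTIONS = {
    "plus": lambda a, b: a + b,
    "minus": lambda a, b: a - b,
    "times": lambda a, b: a * b,
    "abs": lambda a: abs(a),
    "mylog": protected_log,
}

def evaluate(tree, features):
    """Evaluate a tree (terminal name or [function, child, ...])
    and return the generated feature as a complex number."""
    if isinstance(tree, str):
        return complex(features[tree])
    name, *children = tree
    args = [evaluate(child, features) for child in children]
    return complex(FUNCTIONS[name](*args))

# Example: C40 + C42 * C63 using the theoretical BPSK cumulant values.
bpsk = {"C40": -2, "C42": -2, "C63": 16}
new_feature = evaluate(["plus", "C40", ["times", "C42", "C63"]], bpsk)
```

The real and imaginary parts of `new_feature` would then serve as the two coordinates of the sample in the feature plane used by the KNN classifier.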

Cumulants are made up of moments of received signals, so various moments have been used as features too. For a complex-valued stationary signal the cumulants can be defined as shown below [19]:

$$
\begin{aligned}
C_{40} &= \mathrm{cum}(y(n),y(n),y(n),y(n)) = M_{40} - 3M_{20}^{2} \\
C_{41} &= \mathrm{cum}(y(n),y(n),y(n),y^{*}(n)) = M_{41} - 3M_{20}M_{21} \\
C_{42} &= \mathrm{cum}(y(n),y(n),y^{*}(n),y^{*}(n)) = M_{42} - |M_{20}|^{2} - 2M_{21}^{2} \\
C_{60} &= \mathrm{cum}(y(n),y(n),y(n),y(n),y(n),y(n)) = M_{60} - 15M_{20}M_{40} + 30M_{20}^{3} \\
C_{61} &= \mathrm{cum}(y(n),y(n),y(n),y(n),y(n),y^{*}(n)) = M_{61} - 5M_{21}M_{40} - 10M_{20}M_{41} + 30M_{20}^{2}M_{21} \\
C_{62} &= \mathrm{cum}(y(n),y(n),y(n),y(n),y^{*}(n),y^{*}(n)) = M_{62} - 6M_{20}M_{42} - 8M_{21}M_{41} - M_{22}M_{40} + 6M_{20}^{2}M_{22} + 24M_{21}^{2}M_{20} \\
C_{63} &= \mathrm{cum}(y(n),y(n),y(n),y^{*}(n),y^{*}(n),y^{*}(n)) = M_{63} - 9M_{21}M_{42} + 12M_{21}^{3} - 3M_{20}M_{43} - 3M_{22}M_{41} + 18M_{20}M_{21}M_{22}
\end{aligned} \tag{2}
$$

$M_{pq}$ represents the moment of a signal, which is defined as

$$ M_{pq} = E\left[ y(n)^{p-q}\, y^{*}(n)^{q} \right] \tag{3} $$

C. Two-stage Genetic Programming

In [12] Swami and Sadler used similar high order cumulant features to classify BPSK, QPSK and QAM (> 4) and obtained good performance. However, their classification methods do not distinguish well between 16QAM and 64QAM. This indicates the difficulty of separating 16QAM from 64QAM. In another study [18], Wong and Nandi used similar features for classification between BPSK, QPSK, 16QAM and 64QAM. Their results also suggest that BPSK, QPSK and QAM (> 4) are easier to classify, while classification between 16QAM and 64QAM was not very successful. If GP were used in a similar way to handle the 4-class classification directly, it would suffer from the same problem in discriminating 16QAM from 64QAM. Therefore, in order to achieve the best performance for all classes, the GP classification process in this paper is divided into two stages. In the first stage, signals are classified into 3 classes: BPSK, QPSK and QAM (including both 16QAM and 64QAM). The third class output is then fed into the second stage to be further classified into 16QAM and 64QAM.
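The cumulant features can be estimated from a block of received samples by replacing the expectation in the moment definition with a sample mean. The Python sketch below follows the standard fourth and sixth order cumulant formulas for a zero-mean complex signal; the function names `moment` and `cumulants` are illustrative and are not taken from the paper.

```python
def moment(y, p, q):
    """Sample estimate of M_pq = E[y^(p-q) * conj(y)^q]."""
    return sum(v ** (p - q) * v.conjugate() ** q for v in y) / len(y)

def cumulants(samples):
    """Estimate the fourth and sixth order cumulants of a zero-mean
    complex sequence from its sample moments."""
    y = [complex(v) for v in samples]
    pairs = [(2, 0), (2, 1), (2, 2), (4, 0), (4, 1), (4, 2), (4, 3),
             (6, 0), (6, 1), (6, 2), (6, 3)]
    M = {pq: moment(y, *pq) for pq in pairs}
    M20, M21, M22 = M[2, 0], M[2, 1], M[2, 2]
    M40, M41, M42, M43 = M[4, 0], M[4, 1], M[4, 2], M[4, 3]
    return {
        "C40": M40 - 3 * M20 ** 2,
        "C41": M41 - 3 * M20 * M21,
        "C42": M42 - abs(M20) ** 2 - 2 * M21 ** 2,
        "C60": M[6, 0] - 15 * M20 * M40 + 30 * M20 ** 3,
        "C61": M[6, 1] - 5 * M21 * M40 - 10 * M20 * M41
               + 30 * M20 ** 2 * M21,
        "C62": M[6, 2] - 6 * M20 * M42 - 8 * M21 * M41 - M22 * M40
               + 6 * M20 ** 2 * M22 + 24 * M21 ** 2 * M20,
        "C63": M[6, 3] - 9 * M21 * M42 + 12 * M21 ** 3 - 3 * M20 * M43
               - 3 * M22 * M41 + 18 * M20 * M21 * M22,
    }

# Noise-free, unit-energy constellations with equiprobable symbols.
bpsk = cumulants([1, -1])
qpsk = cumulants([1, 1j, -1, -1j])
```

For these ideal constellations the estimates reproduce the well-known theoretical values, for example C40 = -2 and C63 = 16 for BPSK, and C40 = 1, C42 = -1 and C63 = 4 for QPSK. With noisy, finite-length signals the estimates scatter around such values, which is why the number of samples matters in the experiments.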
In this way, GP is allowed to use different features and to generate different trees in the two stages, and it can therefore better solve the problem, especially for discriminating between 16QAM and 64QAM. An illustration of the whole process is shown in Fig. 2.

Fig. 2. Block diagram of proposed system.

D. Multi-class classification with K-Nearest Neighbor

Genetic Programming is easy to implement for binary classification because of its single-output structure. However, as this problem involves four different modulation types, a multi-class classifier is needed. One method, explored by Kishore et al. [28] and Muni et al. [29], divides the n-class classification into n 2-class problems to realize multi-class classification with GP. This method inherits the simplicity of 2-class GP classifiers. However, as the whole process involves n of these 2-class classifiers, the total resources required are n times those needed for one 2-class classifier. As the number of classes grows, the complexity of the classifier increases as well. This makes the method suitable only for a small number of classes.

Another method of using GP for multi-class classification was developed by Zhang et al. in [30]. They set multiple thresholds between different classes and used GP to generate outputs for the different classes that fit in the spaces between the thresholds. Because the thresholds are problem-dependent, users need to set them manually for every problem. To achieve better performance from the GP classifier, a large number of tests are needed, before GP is used to optimize the output, to estimate the best order of the thresholds and the spacing between them. This is a time-consuming process and it is still difficult to achieve optimum performance.

In this paper, a fully automated GP classification scheme is developed, based on the method developed by Zhang et al. in [31].
It uses K-Nearest Neighbor (KNN) as a simple classification scheme within GP. Compared to the other methods mentioned above, GP with KNN has the advantage that its complexity is not affected much by the number of classes involved, and the optimization process is fully automated because there is no need to estimate and set thresholds manually for different classes.

This method uses GP to generate a single new feature from the input features. This new feature is then used in the KNN classifier, both for assisting the fitness evaluation and for carrying out the classification. When constructing the KNN classifier, the new features for a number of samples from each class are first calculated and used as reference points. These points are distributed in a feature space. In this GP setting the input features are complex-valued, so with the selected functions GP is able to generate a complex-valued feature. The feature space is therefore realized as a two-dimensional plane, with the real part of the new feature as one dimension and the imaginary part as the other. Examples of the reference feature space can be found in Fig. 3 and Fig. 4.
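A minimal Python sketch of KNN classification on this two-dimensional feature plane is given below. It assumes Euclidean distance between complex features, computed as `abs(a - b)`, and a simple majority vote; the function name `knn_classify` and the example reference points are hypothetical.

```python
from collections import Counter

def knn_classify(feature, references, k):
    """Classify one complex feature by majority vote among the k
    nearest reference points.

    references is a list of (complex_feature, class_label) pairs.
    Distance is Euclidean in the (real, imaginary) plane.
    """
    nearest = sorted(references, key=lambda ref: abs(feature - ref[0]))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tie, Counter.most_common keeps first-seen order, so the label
    # of the closest tied neighbour wins.
    return votes.most_common(1)[0][0]

# Hypothetical reference points forming two well-separated clusters.
refs = [
    (0.1 + 0.1j, "BPSK"), (0.2 + 0.0j, "BPSK"), (0.0 + 0.2j, "BPSK"),
    (2.0 + 2.0j, "QPSK"), (2.1 + 2.0j, "QPSK"), (2.0 + 2.1j, "QPSK"),
]
label_a = knn_classify(0.15 + 0.05j, refs, k=3)
label_b = knn_classify(1.9 + 2.0j, refs, k=3)
```

The cost of this classifier depends on the number of reference points rather than on a per-class decision structure, which is consistent with the observation that complexity is not affected much by the number of classes.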

To classify an unseen test sample, the $k$ nearest neighboring reference points are first found, with the distance $d_{AB}$ defined by

$$ d_{AB} = \sqrt{(A - B)(A - B)^{*}} \tag{4} $$

where $A$ is the new feature for the input test sample and $B$ is the new feature for a reference sample. Among these $k$ nearest reference points, the class with the largest number of points is returned as the classification result. The performance of this KNN classifier indicates the quality of the new feature and the fitness of the individual that produced it.

Fig. 3. Feature distribution for first stage 3-class classification. The data is from BPSK, QPSK, 16QAM and 64QAM, each obtained from signals with 512 samples and under 10 dB SNR.

Fig. 4. Feature distribution for second stage 2-class classification. The data is from 16QAM and 64QAM, each obtained from signals with 512 samples and under 10 dB SNR.

E. Fitness evaluation

Unlike conventional fitness evaluation, the tree output of each individual is not compared directly with a target value from the training input. Instead, the output is used as a new feature for the KNN classifier, with some of the training data used as reference samples and the remaining training data used for evaluating trees. The classification results from the KNN classifier are obtained as described earlier in this section. Once the classification is finished, the result is returned to the fitness calculation function to be checked against the correct class information. The numbers of correct and incorrect classifications are counted for the fitness calculation. The fitness $f$ is given by

$$ f = \sum_{i=1}^{n} w_i e_i \tag{5} $$

where $n$ is the number of classes to be classified and $e_i$ is the number of classification errors for class $i$. Because this is a multi-class classification, errors from different classes are recorded separately and can be assigned different penalty weights $w_i$. By setting different $w_i$, the program can adjust its classification performance for different classes. The larger the penalty given to a class, the more the evolution is biased towards classifying that class correctly. Ultimately, individuals with smaller $f$ values, which indicate better classification performance and better fitness, have an increased chance of taking part in the evolution of the next generation via the different operations.

F. GP parameters

For the GP programs used in this paper, 100 generations with 25 individuals in each generation were generated. However, a run can be terminated early if any individual reaches perfect fitness. The operators used are crossover with 90% probability and mutation with 10% probability. While training the first classifier, the misclassification penalty is set to 1.0 for every class, and 16QAM and 64QAM are treated as the same class. In the second classifier, the penalties for BPSK and QPSK are set to 0, so that they are not involved in the fitness evaluation and the classifier evolution can continue exclusively for the classification of 16QAM and 64QAM. The functions used and some other parameters are listed in Table I. Different input training data, with signals under different SNRs, were used in different runs. Ultimately, the best tree set from all runs was selected as a general classifier for all channel conditions. To achieve the best performance, however, the reference signal samples must still come from the same channel condition as the unseen signals to be classified.

IV. EXPERIMENTS AND RESULTS

The parameters used for the experiments are given in Table I. The population size used for all the experiments is 25. The number of training experiments was also 25, so 625 trees were generated. Out of these 625 trees, the best tree was picked for evaluation using the test data. This best tree was tested for different SNRs (4 dB, 10 dB, 12 dB, 20 dB) and different numbers of samples (512, 1024 and 2048). All the testing has therefore been done on a single tree.
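The weighted error count used as the fitness measure in Section III-E can be sketched in a few lines of Python. The function name `fitness` and the example labels are hypothetical, and the second-stage weights of 1.0 for 16QAM and 64QAM are an assumption, since only the BPSK and QPSK penalties are stated for that stage.

```python
def fitness(true_labels, predicted_labels, weights):
    """Weighted sum of per-class classification errors (lower is better).

    weights maps each class label to its penalty weight; an error is
    charged to the true class of the misclassified sample.
    """
    errors = {label: 0 for label in weights}
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth != predicted:
            errors[truth] += 1
    return sum(weights[label] * errors[label] for label in weights)

# Second-stage style weighting: BPSK and QPSK errors are ignored.
# The 1.0 weights for the two QAM classes are assumed values.
stage2_weights = {"BPSK": 0.0, "QPSK": 0.0, "16QAM": 1.0, "64QAM": 1.0}
uniform_weights = {"BPSK": 1.0, "QPSK": 1.0, "16QAM": 1.0, "64QAM": 1.0}

truth = ["BPSK", "QPSK", "16QAM", "64QAM", "64QAM"]
guess = ["BPSK", "16QAM", "64QAM", "64QAM", "16QAM"]

f_stage2 = fitness(truth, guess, stage2_weights)
f_uniform = fitness(truth, guess, uniform_weights)
```

In this toy example three samples are misclassified. With uniform weights all three count, while with the second-stage weights the QPSK error is ignored, so the evolution is driven only by the 16QAM and 64QAM errors.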

All the training experiments and tests have been carried out in the MATLAB/GPLab environment. For each value of SNR and number of samples, 10,000 realizations of test data were produced. These 10,000 realizations were tested with the best tree and the results are summarised in Table II.

TABLE I
GP PROGRAM PARAMETERS
Parameter | Standard Value
Number of Generations | 100
Population Size | 25
Function Pool | {plus, minus, times, reciprocal, negator, abs, sqrt, sin, cos, tan, asin, acos, tanh, mylog*}
Terminal Pool | Moments and Cumulants
Genetic Operator | {crossover, mutation}
Operator Probability | {0.9, 0.1}
Tree Generation | ramped half-and-half
Initial Maximum Depth | 28
Selection Operator | Lexictour
Elitism | replace
*mylog is a protected logarithm function which ignores the input if it is zero.

The performance of the GP classifier with 2048 samples and at different SNRs is given in Table III in the form of confusion matrices. One can see that for BPSK and QPSK it shows 100% accuracy except at 4 dB, while the results for 16QAM and 64QAM are also promising. It is also clear from the results that classification of BPSK and QPSK is relatively easy compared to 16QAM and 64QAM, and for this reason the classification process was divided into two stages.

TABLE II
ACCURACY OF PERFORMANCE RESULTS OF GP TREES WITH 1 STANDARD DEVIATION IN 4-CLASS CLASSIFICATION FOR DIFFERENT SNRS AND NUMBER OF SAMPLES
SNR | 512 samples | 1024 samples | 2048 samples
4 dB | 74.2 ± 0.5% | 80.7 ± 0.6% | 89.4 ± 0.4%
12 dB | 95.3 ± 0.4% | 98.7 ± 0.2% | 99.9 ± 0.1%
20 dB | 97.0 ± 0.2% | 99.6 ± 0.1% | 100.0 ± 0.0%

V. DISCUSSIONS

In [18] Wong and Nandi presented results for the same four modulations using Naïve Bayes, SVM and ML classifiers. They reported accuracies of 90.2%, 94.4% and 97.79% at 512, 1024 and 2048 samples respectively using the Naïve Bayes classifier. For the same numbers of samples the performance achieved with SVM was 91.2%, 94.8% and 97.9% respectively, while the performance of the ML classifier was 75%.
The results are summarised in Table IV, and they show that our classifier outperforms the other three classifiers at all these settings. The low standard deviation also shows the robustness of our method. It is important to note that these performance improvements are significant. For example, with 1024 samples GP-KNN offers an average performance improvement of 3.4% which is 39% of the ideal improvement.

TABLE III
PERFORMANCE OF GP TREE IN 4-CLASS CLASSIFICATION AT DIFFERENT SNRS AND AT 2048 SAMPLES
SNR | Modulation Type | BPSK | QPSK | 16QAM | 64QAM
4 dB | BPSK | 10000 | 0 | 0 | 0
4 dB | QPSK | 0 | 9699 | 0 | 301
4 dB | 16QAM | 0 | 0 | 8118 | 1882
4 dB | 64QAM | 0 | 12 | 2062 | 7926
12 dB | BPSK | 10000 | 0 | 0 | 0
12 dB | QPSK | 0 | 10000 | 0 | 0
12 dB | 16QAM | 0 | 0 | 9971 | 29
12 dB | 64QAM | 0 | 0 | 13 | 9987
20 dB | BPSK | 10000 | 0 | 0 | 0
20 dB | QPSK | 0 | 10000 | 0 | 0
20 dB | 16QAM | 0 | 0 | 9999 | 1
20 dB | 64QAM | 0 | 0 | 3 | 9997

Swami and Sadler [12] presented results for classification of 16QAM and 64QAM using only $C_{40}$. Their achieved performance was 90%, but there were two limitations with this method. Firstly, the conditions were assumed to be noise-free, and secondly, the number of samples available was more than 10,000. Our method achieved 100% performance at a comparatively difficult SNR of 20 dB and using only 2048 samples. Dobre, Bar-Ness and Su [21] reported an accuracy of about 70% using 2000 samples at 10 dB SNR for classification of 16QAM and 64QAM. For the same conditions we achieved an accuracy of 99.8%. The same authors reported 75% classification accuracy between 4QAM and 16QAM using 500 samples at an SNR of 10 dB. We achieved 100% performance under similar conditions.

TABLE IV
ACCURACY OF PERFORMANCE RESULTS IN 4-CLASS CLASSIFICATION AT AN SNR OF 10 dB
No. of samples | Naïve Bayes [18] | SVM [18] | ML [18] | GP-KNN (this paper)
512 | 90.2% | 91.2% | 75.0% | 93.8 ± 0.3%
1024 | 94.4% | 94.8% | 75.0% | 97.8 ± 0.2%
2048 | 97.9% | 97.9% | 75.0% | 99.6 ± 0.1%

Xi and Wu [13] achieved 94% accuracy for classification between 4QAM, 16QAM and 64QAM. They used 2000 samples and the SNR was 10 dB. Our achieved accuracy is 99.4% under the same scenario. Mirarab and Sobhani [19] used higher order cumulants for classification of a large set of modulation types, and this set also included the four modulations considered here.
They reported a performance of 70% at an SNR of 10 dB and 500 samples for their large modulation set. We have achieved 93.8% under the same conditions for our relatively smaller modulation set.

VI. CONCLUSIONS

The challenge for all existing AMC algorithms is to find the best possible features and then to find the best relationship among those features. GP has an inherent capability to select good features and ignore useless ones. In this paper GP with KNN has been used to reduce the input dimensions by finding the best features. It then tries different relationships among those features and finds the best possible relationship under the given conditions. The method has been used for discriminating BPSK/QPSK/16QAM/64QAM signals. The method is divided into two stages, and the results show that this two-stage scheme is very useful, especially for classification of 16QAM and 64QAM. The results demonstrate that GP with KNN produces better performance than the published methods it has been compared with.

REFERENCES

[1] E. E. Azzouz and A. K. Nandi, Automatic Modulation Recognition of Communication Signals. Boston, MA: Kluwer, 1996.
[2] K. E. Nolan, L. Doyle, P. Mackenzie, and D. O'Mahony, "Modulation scheme classification for 4G software radio wireless networks," in Proc. IASTED International Conference on Signal Processing, Pattern Recognition, and Applications, pp. 25-31, 2002.
[3] K. E. Nolan, L. Doyle and D. O'Mahony, "Signal space based adaptive modulation for software radio," in Proc. IEEE WCNC, vol. 1, pp. 51-515, 2002.
[4] N. Alyaoui, H. Ben Hnia, A. Kachouri, M. Samet, "The modulation recognition approaches for software radio," 2nd International Conference on Signals, Circuits and Systems (SCS), pp. 1-5, 2008.
[5] B. Ramkumar, "Automatic modulation classification for cognitive radios using cyclic feature detection," IEEE Circuits and Systems Magazine, vol. 9, pp. 27-45, 2009.
[6] W. Zhiqiang, E. Like, V. Chakravarthy, "Reliable modulation classification at low SNR using spectral correlation," Consumer Communications and Networking Conference (CCNC), pp. 1134-1138, 2007.
[7] B. Ramkumar, T. Bose, M. S. Radenkovic, "Combined blind equalization and automatic modulation classification for cognitive radios," IEEE 13th Digital Signal Processing Workshop and 5th IEEE Signal Processing Education Workshop (DSP/SPE), pp. 172-177, 2009.
[8] O. A. Dobre, A. Abdi, Y. Bar-Ness, and W. Su, "A survey of automatic modulation classification techniques: classical approaches and new trends," IET Commun., vol. 1, pp. 137-156, 2007.
[9] L. Hong and K. C. Ho, "Classification of BPSK and QPSK signals with unknown signal level using the Bayes technique," in Proc. IEEE ISCAS, vol. 4, pp. IV.1-IV.4, 2003.
[10] M. L. D. Wong and A. K. Nandi, "Semi-blind algorithms for automatic classification of digital modulation types," Digital Signal Processing, vol. 18, pp. 29-227, 2008.
[11] W. Wei and J. M. Mendel, "Maximum-likelihood classification for digital amplitude-phase modulations," IEEE Trans. Commun., vol. 48, pp. 189-193, 2000.
[12] A. Swami and B. M. Sadler, "Hierarchical digital modulation classification using cumulants," IEEE Trans. Commun., vol. 48, pp. 416-429, 2000.
[13] S. Xi and H. C. Wu, "Robust automatic modulation classification using cumulant features in the presence of fading channels," in Proc. of the IEEE Wireless Comm. and Networking Conf. (WCNC), vol. 4, pp. 294-299, 2006.
[14] W. C. Headley, J. D. Reed, and C. R. C. M. da Silva, "Distributed cyclic spectrum feature-based modulation classification," in Proc. IEEE Wireless Comm. and Netw. Conf., pp. 12-124, 2008.
[15] H. Wijanto, Sugihartono, S. Tjondronegoro and Kuspriyanto, "The performance improvement of automatic modulation recognition using simple feature manipulation, analysis of the HOS, and voted decision," World Academy of Science, Engineering and Technology, pp. 763-768, 2009.
[16] L. Hong, "Low-complexity identifier for M-ary QAM signals," IEEE SOUTHEASTCON '09, pp. 164-168, 2009.
[17] M. L. D. Wong and A. K. Nandi, "Automatic digital modulation recognition using artificial neural network and genetic algorithm," Sig. Proc., vol. 84, pp. 351-365, 2004.
[18] M. L. D. Wong and A. K. Nandi, "Naïve Bayes classification of adaptive broadband wireless modulation types with higher order cumulants," in Proc. of the International Conference on Signal Processing and Communication Systems, 2008.
[19] M. R. Mirarab, M. A. Sobhani, "Robust modulation classification for PSK/QAM/ASK using higher order cumulants," in Proc. of the Sixth International Conference on Information, Communications and Signal Processing, 2007.
[20] O. A. Dobre, Y. Bar-Ness, and W. Su, "Higher-order cyclic cumulants for high order modulation classification," in Proc. IEEE MILCOM, pp. 112-117, 2003.
[21] O. A. Dobre, Y. Bar-Ness, and W. Su, "Robust QAM modulation classification algorithm based on cyclic cumulants," in Proc. IEEE WCNC 2004, vol. 2, pp. 745-748, 2004.
[22] J. R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, 1992.
[23] P. G. Espejo, S. Ventura, F. Herrera, "A survey on the application of genetic programming to classification," IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 4, pp. 121-144, 2010.
[24] J. Eggermont, A. E. Eiben and J. I. van Hemert, "A comparison of genetic programming variants for data classification," in Proc. of the Third Symposium on Intelligent Data Analysis (IDA), 1999.
[25] C. Gathercole and P. Ross, "Dynamic training subset selection for supervised learning in genetic programming," in Parallel Problem Solving from Nature III, pp. 312-321, 1994.
[26] H. Gray, "Genetic programming for classification of medical data," in Late Breaking Papers at the Genetic Programming Conference, pp. 291-297, 1997.
[27] T. Loveard and V. Ciesielski, "Representing classification problems in genetic programming," in Proc. of the Congress on Evolutionary Computation, vol. 2, pp. 17-177, 2001.
[28] J. K. Kishore, L. M. Patnaik, V. Mani, and V. K. Agrawal, "Application of genetic programming for multicategory pattern classification," IEEE Transactions on Evolutionary Computation, vol. 4, pp. 242-258, 2000.
[29] D. P. Muni, N. R. Pal, and J. Das, "A novel approach to design classifiers using genetic programming," IEEE Transactions on Evolutionary Computation, vol. 8, pp. 183-196, 2004.
[30] M. Zhang, V. B. Ciesielski and P. Andreae, "A domain independent window approach to multiclass object detection using genetic programming," EURASIP Journal on Applied Signal Processing, no. 8, pp. 841-859, 2003.
[31] L. Zhang, L. B. Jack and A. K. Nandi, "Extending genetic programming for multi-class classification by combining k-nearest neighbor," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 349-352, 2005.
[32] L. Zhang and A. K. Nandi, "Fault classification using genetic programming," Mechanical Systems and Signal Processing, vol. 21, pp. 1273-1284, 2007.
[33] A. K. Nandi and E. E. Azzouz, "Algorithms for automatic modulation recognition of communication signals," IEEE Transactions on Communications, vol. 46, pp. 431-436, 1998.
[34] H. C. Wu, M. Saquib, Y. Zhifeng, "Novel automatic modulation classification using cumulant features for communications via multipath channels," IEEE Transactions on Wireless Communications, vol. 7, pp. 398-315, 2008.