Color Feature Extraction of Oil Palm Fresh Fruit Bunch Image for Ripeness Classification

NORASYIKIN FADILAH
Universiti Sains Malaysia
School of Electrical & Electronic Eng.
14300 Nibong Tebal, Pulau Pinang
MALAYSIA
ikinfad@gmail.com

JUNITA MOHAMAD-SALEH
Universiti Sains Malaysia
School of Electrical & Electronic Eng.
14300 Nibong Tebal, Pulau Pinang
MALAYSIA
jms@usm.my

Abstract: The color of the oil palm fresh fruit bunch (FFB) has been used as a ripeness indicator in the oil palm sector. This parameter is assessed manually by human vision, which makes oil palm FFB grading subjective and can lead to misjudgment. Thus, this paper presents the development of an automated classification system for oil palm FFB using an artificial neural network (ANN), focusing on the comparison of two feature extraction techniques that could improve the classification performance. The system used image processing techniques to extract the color features of the oil palm FFB and an artificial neural network to classify the oil palm FFB into the following four ripeness categories: unripe, underripe, ripe and overripe. Principal component analysis and stepwise discriminant analysis were used to reduce the color features, and the reduced features were fed to the ANN for classification. The results showed that reducing the color features using stepwise discriminant analysis improved the classification accuracy by more than 10%.

Key Words: feature extraction, artificial neural network, principal component analysis, stepwise discriminant analysis, oil palm fresh fruit bunch

1 Introduction

Many of the automated systems for assessment of agricultural products employ computer vision in place of human vision to make the quality assessment task faster and more accurate [1][2][3].
One of the applications is automating the ripeness classification of fruits, either in the plantation to assist harvesting decisions or in the factory to sort the fruits before packaging. Most of these applications use color as the parameter to determine the ripeness class. The process of color recognition for fruit ripeness classification involves extracting useful information about the spectral properties of the fruit surface and finding the best classification method for recognizing the ripeness of the fruit.

The oil palm industry is one of the major agricultural industries in Malaysia. The most common species of oil palm is Elaeis guineensis of the nigrescens type. The oil palm fresh fruit bunch (FFB) is the main product harvested from the oil palm plantation to produce oil with a variety of uses. Its surface color varies from deep violet to orange depending on the ripeness category. To get the optimum oil yield, it is crucial that the FFBs are harvested at the optimum ripeness stage [4]. Presently, the harvesting decision for oil palm FFB is made manually by human graders, who assess the FFB quality using their vision and subjective judgment based on the number of loose fruits under the oil palm tree and the color of the FFB fruit surface [4][5]. These methods of judgment might be inaccurate due to differing human perception or lack of skill.

Several studies have addressed oil palm ripeness identification. Jamil et al. [6] developed an intelligent oil palm FFB grading system by training on the RGB values of 45 FFB images using a neuro-fuzzy technique. This technique yielded 73.3% correct classification. Meanwhile, May and Amaran [7] used fuzzy logic to classify oil palm fruits using the same attributes, which yielded 86.67% correct classification. However, RGB values are only suitable for constant lighting environments since they are affected by changing light intensities [8].
Thus, studies by [9], [10] and [11] used hue value as the parameter to determine the ripeness of oil palm fruits and FFBs. It was shown that there was a good correlation between hue value and ripeness stage [11]. This paper discusses the techniques involved in the development of automated ripeness classification system of oil palm FFB using artificial neural network (ANN) to assist the harvester in making a decision ISBN: 978-960-474-368-1 51
whether to cut the FFB from the tree. The techniques involved in the proposed system include image processing, color feature extraction, and classification.

2 Methodology

In this section, the stages involved in the development of the oil palm FFB ripeness classifier are discussed. The steps taken for this work include image acquisition, image segmentation, color feature extraction and classification.

2.1 Image Acquisition

The oil palm FFB samples used in this study were sourced from Felda Agricultural Services Sdn Bhd, Jengka, Pahang. Since this work focused on the preharvest stage, the image samples were taken from the tree top using a digital IP camera mounted on top of a pole. A total of 752 images of oil palm FFBs were taken at random over 5 days, between 9am and 4pm. This time window was chosen for its clear visibility and because it fell within the harvesters' working hours. To determine the ripeness category of each sample, a trained grader assessed the ripeness using the manual technique. All the images obtained were stored in a computer for further analysis.

2.2 Image Segmentation

An oil palm FFB image (see Figure) shows two distinct regions, consisting of spikes and fruits. In this work, only the fruit region was used to extract the color for ripeness determination. Therefore, the spike and fruit pixels were separated using the k-means clustering algorithm used by Jaffar et al. [12]. An example of a segmented image is shown in Figure 1.

2.3 Color Feature Extraction

The hue color space has been shown to be a better discriminator for oil palm fruit color than RGB or CIExy values [13]. Hence, in this work, hue values for all fruit pixels were calculated as in equation 1, where r, g and b represent the red, green and blue components of the image, respectively. This results in h values in the range [0, 360]. To reduce the number of distinct hue values, a hue histogram of 100 bins was obtained for each image.
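The hue computation and the 100-bin histogram described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the arccos form of equation 1, equal-width bins over [0, 360] (the paper does not state how the bins were formed), and a placeholder hue of 0 for grey pixels where r = g = b and the hue is undefined. The function names `hue_degrees` and `hue_histogram` are hypothetical, and the bins are indexed from 0 here, whereas the text numbers hues from 1 to 100.

```python
import math


def hue_degrees(r, g, b):
    """Hue angle in degrees for one RGB pixel, using the arccos form of equation 1."""
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        # r == g == b: hue is undefined for grey pixels (assumption: return 0).
        return 0.0
    ratio = max(-1.0, min(1.0, num / den))  # guard against rounding drift
    theta = math.degrees(math.acos(ratio))
    return theta if b <= g else 360.0 - theta


def hue_histogram(pixels, n_bins=100):
    """Count pixel hues into n_bins equal-width bins over [0, 360] (assumed binning)."""
    counts = [0] * n_bins
    width = 360.0 / n_bins
    for r, g, b in pixels:
        # A hue of exactly 360 is clamped into the last bin.
        idx = min(int(hue_degrees(r, g, b) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Three synthetic pixels (pure red, green, blue) stand in for segmented fruit pixels.
histogram = hue_histogram([(255, 0, 0), (0, 255, 0), (0, 0, 255)])
print([i for i, c in enumerate(histogram) if c > 0])
```

In practice the `pixels` argument would hold only the fruit pixels kept by the segmentation step, so that spike pixels do not contribute to the histogram.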
Figure 1: A sample of the (a) original and (b) segmented images.

$$
h = \begin{cases}
\cos^{-1}\left\{ \dfrac{\frac{1}{2}\left[(r-g)+(r-b)\right]}{\left[(r-g)^{2}+(r-b)(g-b)\right]^{1/2}} \right\} & \text{if } b \le g \\[2ex]
360 - \cos^{-1}\left\{ \dfrac{\frac{1}{2}\left[(r-g)+(r-b)\right]}{\left[(r-g)^{2}+(r-b)(g-b)\right]^{1/2}} \right\} & \text{if } b > g
\end{cases}
\tag{1}
$$

From the 100 bins, only 57 hue values were included as the colors of the oil palm FFB fruit surface. These are hues 1 to 9 (red to orange) and hues 53 to 100 (blue to red). Since there is a discontinuity between the red hues, hues 1 to 9 were shifted to the back of the histogram. This resulted in the feature vector shown in equation 2.

$$
H = \left( h_{53} \;\; h_{54} \;\; \cdots \;\; h_{100} \;\; h_{1} \;\; \cdots \;\; h_{9} \right) \tag{2}
$$

The hue values $H$ were further reduced using principal component analysis (PCA) and stepwise discriminant analysis (SDA). In the PCA method, the hue values of the training data set were first normalized so that they had zero mean and unit variance. Then the normalized hue values, mean and variance were used to compute the principal components using the SVD method. This generated a transformation matrix, Transmat, and produced a transformed set of measurements, $N_{trans}$, which consisted of uncorrelated
components. The matrix Transmat was stored for use with other independent data sets (e.g., test data). The components of $N_{trans}$ were ordered according to the magnitude of their variances [14]. The PCA transformation reduced the number of hue values by retaining only those components that contribute more than a specified percentage of the total variation in the data set. For example, if a threshold of 10% of the total variation is specified, the components that contribute less than 10% of the total variation are eliminated. This leaves $p$ uncorrelated components.

In the SDA method, a hue subset containing the best features was obtained using the Wilks' $\Lambda$ selection criterion. First, the Wilks' $\Lambda$ statistic, $\Lambda(h_i)$, was calculated for each individual variable in $H$, and the variable with the minimum $\Lambda(h_i)$ was entered first. Then $\Lambda(h_i \mid h_1)$ was calculated for each variable not entered at the first step, where $h_1$ denotes the first variable entered. An F-statistic was applied to test the significance of the change from the first variable to the next. The step-by-step calculation is explained in [15]. At each stage in SDA, the hue variables whose F-statistics were smaller than the F-to-remove value were removed, while the hue variables whose F-statistics were greater than the F-to-enter value were retained. In this work, the F-to-remove and F-to-enter values chosen were 2.71 and 2.84, respectively. A subset of $H$ containing only the retained hue values was obtained from this method.

2.4 Classification

The multilayer perceptron (MLP) neural network is a very common ANN architecture, used to solve many classification problems. Thus, this work employed an MLP to classify the ripeness of oil palm FFB. The MLP consisted of three layers (input, hidden and output), each comprising a number of processing elements (PE). The structure of the MLP is shown in Figure 2.
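The three-layer structure described above can be illustrated with a forward pass through an untrained network. This is only a sketch under stated assumptions: the tanh hidden activation, the linear output activation, the random weight range and the layer sizes are illustrative choices, not the configuration reported in the paper, and no training is performed. The helper names `init_layer`, `layer_forward` and `mlp_predict` are hypothetical.

```python
import math
import random

CLASSES = ["unripe", "underripe", "ripe", "overripe"]


def init_layer(n_in, n_out, rng):
    """Random weights for n_out processing elements; the last weight in each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(weights, inputs, activation):
    """Weighted sum of the inputs plus bias for each processing element, then the activation."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + row[-1])
        for row in weights
    ]


def mlp_predict(features, hidden_w, output_w):
    """Input layer -> hidden layer (tanh, assumed) -> output layer (linear, assumed) -> class index."""
    hidden = layer_forward(hidden_w, features, math.tanh)
    scores = layer_forward(output_w, hidden, lambda s: s)
    return max(range(len(scores)), key=scores.__getitem__)


rng = random.Random(0)
n_features, n_hidden = 12, 15  # illustrative sizes only
hidden_w = init_layer(n_features, n_hidden, rng)
output_w = init_layer(n_hidden, len(CLASSES), rng)

# A random feature vector stands in for the reduced hue features of one image.
sample = [rng.random() for _ in range(n_features)]
print(CLASSES[mlp_predict(sample, hidden_w, output_w)])
```

Because the weights are random, the printed class is arbitrary; the sketch only shows how a feature vector flows through the input, hidden and output layers to one of the four ripeness categories.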
A total of 752 hue distribution data were randomly divided into 456 training data, 96 validation data and 200 test data. Separate MLP networks with various combinations of activation functions for the hidden and output neurons were trained using the Levenberg-Marquardt training algorithm. The number of hidden neurons was determined experimentally, and the optimum performance was attained with 15 hidden neurons. During training, the weights in the input and hidden layers of the MLP network were updated after every training cycle to improve the performance. The validation data were used to validate MLP performance by terminating the training process when there was no further improvement in the validation performance. The best-performing MLP model was selected based on the highest classification accuracy (HCA) on the test data set, obtained from the percentage of the m correct classifications in the set of 200 test data, calculated as in equation 3.

$$
\mathrm{HCA} = \max\left[ \left( \frac{m}{200} \right) \times 100\% \right] \tag{3}
$$

Figure 2: Structure of MLP neural network.

Three methods were tried in developing an optimum MLP classifier. They used three different types of input data: raw data, PCA data and SDA data. The methods are referred to as MLP, PCA+MLP and SDA+MLP, respectively. The best-performing MLPs were compared to determine which method gave the highest classification accuracy.

3 Results and Discussions

Figure 3 illustrates the hue distribution samples for the four ripeness categories of oil palm FFB. Each hue distribution represented the color of the fruit surface of the oil palm FFB, and the distributions agreed with visual observation. The unripe FFB showed its highest peak at hue 64, which indicated that the unripe FFB fruit surface was mostly blue. For the underripe FFB, two peaks formed at hues 65 and 99, which represent blue and red, respectively. The hue peaks for the ripe and overripe FFB were at hues 99 and 100, respectively.
Most of the pixels for ripe FFB were red in color, whereas most of the pixels for overripe FFB were red to orange color.
Figure 3: Hue distribution samples for four ripeness categories of oil palm FFB.

Table 1 shows the overall MLP accuracy for each method. Although the number of features was reduced from 57 to 10 by using PCA, the MLP classification accuracy remained the same. PCA managed to discard the correlated components and keep the important features while retaining MLP performance. Meanwhile, the SDA method for feature reduction found 12 features to be significant as MLP inputs. These features proved to be good parameters for ripeness classification using MLP, since the overall accuracy increased from 83.5% to 94%.

Table 1: Classification accuracies for three different methods.

Method     No. of features   Accuracy (%)
MLP        57                83.5
PCA+MLP    10                83.5
SDA+MLP    12                94

4 Conclusion

In this work, an algorithm for the automated ripeness classification of oil palm FFB was successfully developed. The performance of the MLP was investigated for classification using hue measurements of the oil palm FFB images. A total of 57 hue measurements were obtained for each image, and these values were used to characterize the ripeness of oil palm FFB using the MLP. Besides the full 57 color features, two other types of MLP inputs, obtained from PCA and SDA, were also investigated. The overall results showed that SDA successfully chose the 12 best features for the MLP classifier, which resulted in 94% correct classification accuracy.

Acknowledgements: The research was supported and funded by Universiti Sains Malaysia and Felda Agricultural Services Sdn Bhd. In the case of the first author, it was also supported by Universiti Malaysia Pahang.

References:
[1] V. G. Narenda and K. S. Hareesh, Quality inspection and grading of agricultural and food products by computer vision - a review, International Journal of Computer Applications, vol. 2, 2010, pp. 43-65.
[2] R. Chinchuluun, W. S. Lee, J. Bhorania, and P. M.
Pardalos, Clustering and classification algorithms in food and agricultural applications: a survey, In Advances in Modeling Agricultural Systems, Springer, vol. 25, 2009, pp. 433-454.
[3] M. Dadwal and V. K. Banga, Color image segmentation for fruit ripeness detection: a review, International Conference on Electrical, Electronics and Civil Engineering, ICEECE 2012, 2012, pp. 190-193.
[4] A. H. Hitam and A. M. Yusof, Mechanization in oil palm plantations, In Advances in Oil Palm Research, Malaysian Palm Oil Board, vol. 1, 2000, pp. 653-696.
[5] P. Junkwon, T. Takigawa, H. Okamoto, H. Hasegawa, M. Koike, K. Sakai, J. Siruntawineti, W. Chaeychomsri, N. Sanevas, P. Tittinuchanon, and B. Bahalayodhin, Potential application of color and hyperspectral images for estimation of weight and ripeness of oil palm (Elaeis guineensis Jacq. var. tenera), Agricultural Information Research, vol. 18, 2009, pp. 72-81.
[6] N. Jamil, A. Mohamed and S. Abdullah, Automated grading of palm oil fresh fruit bunches (FFB) using neuro-fuzzy technique, International Conference of Soft Computing and Pattern Recognition, SOCPAR 09, 2009, pp. 245-249.
[7] Z. May and M. H. Amaran, Automated oil palm fruit grading system using artificial intelligence, Int. J. Eng. Sci., vol. 11, 2011, pp. 30-35.
[8] R. Hudzari, W. W. Ishak and M. Norman, Parameter acceptance of software development for oil palm fruit maturity prediction, J. Softw. Eng., vol. 4, 2010, pp. 244-256.
[9] L. C. Guan, Stepwise discriminant analysis on oil palm fruit's hues for ripeness grading using machine vision system, Master's thesis, School of Electrical and Electronic Engineering, Universiti Sains Malaysia, 2005.
[10] Y. A. Tan, K. W. Low, C. K. Lee and K. S. Low, Imaging technique for quantification of oil palm fruit ripeness and oil content, In Eur. J. Lipid Sci. Tech., 2010, pp. 838-843.
[11] W. I. W. Ishak and M. H. Razali, Hue optical properties to model oil palm fresh fruit bunches maturity index, World Multi-Conference on Systemics, Cybernetics and Pattern Recognition, 2010.
[12] A. Jaffar, R. Jaafar, N. Jamil, C. Y. Low and B. Abdullah, Photogrammetric grading of oil palm fresh fruit bunches, International Journal of Mechanical & Mechatronics Engineering, vol. 9, 2009, pp. 18-24.
[13] M. Abdullah, L. Guan, A. Mohamed and M. Noor, Color vision system for ripeness inspection of oil palm Elaeis guineensis, J. Food Process Pres., vol. 26, 2002, pp. 213-235.
[14] H. B. Demuth and M. H. Beale, Neural Network Toolbox for Use with MATLAB: User Guide, Math Works Inc, 2004.
[15] A. C. Rencher, Methods of Multivariate Analysis, Wiley-Interscience, 2nd ed., 2002.