
A Region-based Color Image Segmentation Scheme

N. Ikonomakis (a), K. N. Plataniotis (b), and A. N. Venetsanopoulos (a)

(a) Dept. of Electrical and Computer Engineering, University of Toronto, Toronto, Canada
(b) School of Computer Science, Ryerson Polytechnic University, Toronto, Canada

ABSTRACT

A color image segmentation technique is presented for use in coding and/or compression of video-conferencing sequences. The proposed technique utilizes the perceptual HSI (hue, saturation, intensity) color space. The effectiveness of the scheme is improved by first splitting the pixels in the image into chromatic and achromatic regions using a classification method. A region growing scheme is then applied to each of the sets of chromatic and achromatic pixels to segment the image. For the achromatic pixels a simple intensity difference metric is used. For the chromatic pixels three distance metrics were compared. Results are shown for three video-conferencing type images.

Keywords: color image segmentation, HSI, region growing

1. INTRODUCTION

Image analysis usually refers to the processing of images by computers with the goal of finding what objects are present in the images. Image segmentation refers to partitioning an image into different regions that are homogeneous or "similar" in some image characteristic. It is an important first task of any automated image analysis process because all subsequent tasks, such as feature extraction and object recognition, rely heavily on the quality of the segmentation. For example, without a good segmentation algorithm an object may never be recognizable. In this way, the segmentation step determines the eventual success or failure of the analysis. For this reason, considerable care is taken to improve the probability of a successful segmentation. A large number of image segmentation techniques are available today. However, no general method has been found that performs adequately across a varied set of images.
The early attempts at gray-scale image segmentation can be categorized into three general groups: pixel-based [1-4], region-based [5,6], and edge-based techniques [7]. Even though these techniques were introduced three decades ago, they still receive considerable attention in color image segmentation research today [8-11]. Model-based segmentation techniques [12,13], in which the image regions are modeled as random fields and the problem is posed as a statistical optimization problem, have also become popular in the last decade.

Most attention on image segmentation has been focused on gray-scale images. A common problem in segmentation of a gray-scale image occurs when an image has a background of varying gray level, such as gradually changing shades, or when regions assume some broad range of gray levels. Segmentation results might be negatively affected by the presence of intensity changes due to shadows and surface curvature in the gray-scale images. These problems are inherent since intensity is the only information available from gray-scale images. It is known that the human eye can detect only in the neighborhood of one or two dozen intensity levels at any point in a complex image due to brightness adaptation, but can differentiate thousands of color shades and intensities.

A region-based color image segmentation scheme is proposed that employs the perceptual HSI (hue, saturation, intensity) color space to segment color images. More specifically, the proposed scheme is developed for implementation on video-conferencing sequences where segmentation is used in the coding and/or compression of the sequences. Besides video-conferencing, image segmentation has taken a central place in numerous applications, including, but not limited to, multimedia databases, color image and video transmission over the Internet, digital broadcasting, interactive TV, video-on-demand, computer-based training, distance education, and tele-medicine.

Further author information: (Send correspondence to N.I.)
N.I.: E-mail: minoas@dsp.toronto.edu
K.N.P.: E-mail: kplatani@acs.ryerson.ca
A.N.V.: E-mail: anv@dsp.toronto.edu

Figure 1. HSI color model. (Labels: white, intensity = 1; black, intensity = 0; gray-scale intensity axis; R, G, B; purely saturated colors; hue = Θ; saturation = S.)

The segmentation algorithm can be split into two general steps. First, the pixels in the image are classified as chromatic or achromatic by examining their HSI color values. Next, the region growing algorithm is employed on the chromatic and achromatic pixels separately, using different color similarity measures. For the segmentation of the chromatic pixels three distance measures are compared: the generalized Minkowski metric [14], the Canberra metric [15], and a metric presented in Ref. 16. Experimental results from the Claire, Carphone, and Mother Daughter images are shown to demonstrate the effectiveness of the algorithm.

2. COLOR IMAGE SEGMENTATION

The algorithm presented utilizes the HSI (hue, saturation, intensity) color space, and thus the color values of each pixel are first converted from the standard RGB (red, green, blue) color values to HSI color values using well-known transformation formulas [17].

2.1. Chromatic/Achromatic Separation

The HSI color model corresponds closely to the human perception of color. It can be represented with the cone model shown in Figure 1 [17]. The hue value is measured by the angle around the vertical axis, ranges between 0 and 360 degrees, and gives a measure of the spectral composition of the color. The saturation ranges from 0 (on the axis) outwards to a maximum value of 1 on the sides of the cone and refers to the proportion of pure light of the color. The intensity also ranges between 0 and 1 and is a measure of the brightness. The hue value of the pixel has the greatest discrimination power among the three values because it is independent of any intensity attribute.
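The RGB-to-HSI conversion is not spelled out in the text. As a minimal sketch, assuming the commonly used arccos-based form of the transformation (one of several variants in the literature, and not necessarily the exact one used by the authors), the conversion of a single pixel could look as follows. The function name `rgb_to_hsi` and the choice of hue 0 for gray pixels are illustrative assumptions.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to (hue, saturation, intensity).

    Hue is returned in degrees in [0, 360), saturation and intensity in [0, 1].
    Assumption: hue is set to 0 where it is undefined (gray or black pixels).
    """
    total = r + g + b
    intensity = total / 3.0
    if total == 0:
        # Pure black: saturation and hue are undefined, report zeros.
        return 0.0, 0.0, 0.0

    saturation = 1.0 - 3.0 * min(r, g, b) / total

    numerator = 0.5 * ((r - g) + (r - b))
    # The radicand is non-negative in exact arithmetic; clamp rounding noise.
    denominator = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if denominator == 0:
        hue = 0.0  # r == g == b: gray pixel, hue undefined
    else:
        ratio = max(-1.0, min(1.0, numerator / denominator))
        theta = math.degrees(math.acos(ratio))
        hue = theta if b <= g else 360.0 - theta
    return hue % 360.0, saturation, intensity


if __name__ == "__main__":
    # Pure red: hue near 0 degrees, saturation 1, intensity 1/3.
    print(rgb_to_hsi(1.0, 0.0, 0.0))
    # Mid gray: saturation 0, so the hue value carries no information.
    print(rgb_to_hsi(0.5, 0.5, 0.5))
```

The gray-pixel case illustrates why hue needs care: when the three components are equal, the angle is undefined and any returned value is arbitrary.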
Even though hue is the most useful attribute, there are two problems in using this value: hue is meaningless when the intensity is very low or very high, and hue is unstable when the saturation is very low [17]. Because of these properties, in the proposed scheme the image is first divided into chromatic and achromatic regions by defining effective ranges of hue, saturation, and intensity values in the image. Since the hue value of a pixel is meaningless when the intensity is very low or very high, the achromatic pixels in the image are defined as the pixels that have low or high intensity values. Pixels can also be categorized as achromatic if their saturation value is very low, since hue is unstable for low saturation values. From the concepts discussed above, the achromatic pixels in the HSI color space are defined as follows:

    achromatic pixels: (intensity > 90) or (intensity < 10) or (saturation < 10),    (1)

where the saturation and intensity values are normalized from 0 to 100. Only the intensity values of the achromatic pixels are considered when segmenting the achromatic pixels into regions. Pixels that do not fall into this category are categorized as chromatic pixels. For chromatic pixels all three color values are considered in the algorithm.

2.2. Region Growing

A region growing algorithm, which we have developed in the past [9], is employed to segment a color image into disjoint regions by segmenting the chromatic pixels separately from the achromatic ones. The approach starts with a set of seed pixels and from these grows regions by appending to each seed pixel those neighboring pixels that satisfy a certain homogeneity criterion, which will be described later.

The process starts by assigning the first pixel of the image under consideration as the first seed pixel. This seed pixel is then compared to its 8-connected neighbors, that is, the eight pixels surrounding it. Any of the neighboring pixels that satisfy the homogeneity criterion are assigned to the first region. This neighbor comparison step is repeated for every new pixel assigned to the first region until the region is completely bounded by the edge of the image or by pixels that do not satisfy the criterion. The color of each pixel in the first region is then changed to the average color of all the pixels in the region. The next seed pixel, for the second region, is determined by choosing the first pixel not yet assigned to a previously grown region while moving through the image in a right-to-left and top-to-bottom fashion. The above steps for growing a region are once again applied until the second region becomes complete.
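The procedure described so far can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the names `is_achromatic` and `grow_regions` are hypothetical, the homogeneity test is passed in as a function because it differs between pixel classes, and the rule that a region only absorbs pixels of the same class (chromatic or achromatic) as its seed is one assumed reading of "segmenting the chromatic pixels separately". The final color-averaging step is omitted; the sketch returns region labels only.

```python
from collections import deque


def is_achromatic(saturation, intensity):
    """Equation (1), with saturation and intensity normalized to 0..100."""
    return intensity > 90 or intensity < 10 or saturation < 10


def grow_regions(pixels, similar):
    """Label every pixel of a 2D grid of (h, s, i) tuples with a region index.

    similar(seed, candidate) -> bool is the caller-supplied homogeneity test.
    Seeds are picked row by row from the top, scanning each row right to left
    as in the text. Candidates are always compared against the seed pixel.
    """
    rows, cols = len(pixels), len(pixels[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in reversed(range(cols)):
            if labels[r][c] != -1:
                continue  # already part of an earlier region
            seed = pixels[r][c]
            seed_class = is_achromatic(seed[1], seed[2])
            labels[r][c] = next_label
            frontier = deque([(r, c)])
            while frontier:
                y, x = frontier.popleft()
                # Visit the 8-connected neighbors of the current pixel.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (dy == 0 and dx == 0) or not (0 <= ny < rows and 0 <= nx < cols):
                            continue
                        if labels[ny][nx] != -1:
                            continue
                        cand = pixels[ny][nx]
                        same_class = is_achromatic(cand[1], cand[2]) == seed_class
                        if same_class and similar(seed, cand):
                            labels[ny][nx] = next_label
                            frontier.append((ny, nx))
            next_label += 1
    return labels


if __name__ == "__main__":
    # Tiny achromatic example: bright pixels (i = 95) and dark pixels (i = 5).
    image = [
        [(0, 0, 95), (0, 0, 95), (0, 0, 5)],
        [(0, 0, 95), (0, 0, 5), (0, 0, 5)],
    ]
    # Illustrative intensity-difference test with an assumed threshold of 15.
    print(grow_regions(image, lambda s, p: abs(s[2] - p[2]) < 15))
```

Using a queue of frontier pixels avoids deep recursion on large regions; the result depends on the scan order only through which pixel becomes the seed of each region.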
This process is repeated until every pixel in the image belongs to a region. The algorithm is summarized in the following steps:

1. Choose the next seed pixel by choosing the first unassigned pixel while moving through the image in a right-to-left and top-to-bottom fashion. At the start, the seed pixel is the first pixel of the image.
2. Test whether the 8-connected neighboring pixels of the seed pixel belong to the region by comparing them against a homogeneity criterion.
3. If any of the neighboring pixels satisfy the condition, they are assigned to the region and step 2 is repeated, so that their 8-connected neighbors are in turn considered and tested for homogeneity.
4. When the region has grown to its maximum (i.e., no neighbors of the pixels on the edge of the region satisfy the criterion), assign each pixel in the region the average color of all the pixels in the region and go to step 1.

In the case of the achromatic pixels, the homogeneity criterion is that if the difference in intensity between an unassigned pixel and the seed pixel is less than a threshold value $T_{achrom}$, then the pixel is assigned to the seed pixel's region. That is, if

\[ |I_s - I_i| < T_{achrom}, \tag{2} \]

then the pixel i is assigned to the region of the seed pixel s.

For the chromatic pixels, if the value of the distance metric used to compare the unassigned pixel and the seed pixel is less than a threshold value $T_{chrom}$, then the pixel is assigned to the region. Varying the value of $T_{chrom}$ controls the degree of segmentation, with a low value resulting in an over-segmented image and a high value in an under-segmented image. Three different distance measures are compared: the generalized Minkowski metric, the Canberra metric, and a metric presented in Ref. 16, which will be referred to as the Cylindrical distance metric.

The most commonly used measure to quantify distance between two vectors is the generalized Minkowski metric ($L_p$ norm). It is defined as follows [14]:

\[ d_M(i, j) = \left( \sum_{k=1}^{p} \left| x_i^k - x_j^k \right|^p \right)^{1/p}, \tag{3} \]

where p is the dimension of the vector x and $x_i^k$ is the k-th element of the vector $x_i$. The metric is defined for the HSI color space as follows:

\[ d_M(s, i) = \left( |H_s - H_i|^a + |S_s - S_i|^b + |I_s - I_i|^c \right)^{1/d}, \tag{4} \]

where s and i refer to the seed pixel and the pixel being tested, respectively. A pixel is assigned to a region if the value of the metric $d_M$ is less than a threshold $T_{chrom}$. An advantage of using this metric is that a different emphasis can be placed on each of the three HSI color values by choosing different values for the powers a, b, and c. Values ranging from 1 to 5 are tested for each of the three parameters a, b, and c, against varying threshold values $T_{chrom}$. The parameter d is set to 3/(a + b + c) so that the results for different powers a, b, and c can be compared.

The Canberra metric applies only to non-negative multivariate data, which is the case when color vectors are considered. It is defined, for the HSI color space, as follows [15]:

\[ d_{can}(s, i) = \frac{|H_s - H_i|}{|H_s + H_i|} + \frac{|S_s - S_i|}{|S_s + S_i|} + \frac{|I_s - I_i|}{|I_s + I_i|}, \tag{5} \]

where, once again, s and i refer to the seed pixel and the pixel being tested, respectively. A pixel is assigned to a region if the value of the metric $d_{can}$ is less than a threshold $T_{chrom}$. A benefit of using this metric is that the ranges taken for the three color values H, S, and I do not affect the result of the segmentation, because each term is normalized by the magnitude of the values being compared.

The Cylindrical metric was presented in a paper by Tseng and Chang [16]. This metric computes the distance between the projections of the pixel points on a chromatic plane. It is defined as follows [16]:

\[ d_{cyl}(s, i) = \left( (d_I)^2 + (d_C)^2 \right)^{1/2}, \tag{6} \]

with

\[ d_I = |I_s - I_i| \tag{7} \]

and

\[ d_C = \left( (S_s)^2 + (S_i)^2 - 2 S_s S_i \cos\theta \right)^{1/2}, \tag{8} \]

where

\[ \theta = \begin{cases} |H_s - H_i| & \text{if } |H_s - H_i| < 180^\circ, \\ 360^\circ - |H_s - H_i| & \text{if } |H_s - H_i| > 180^\circ. \end{cases} \tag{9} \]

The value of $d_C$ is the distance between the 2-dimensional (hue and saturation) vectors, on the chromatic plane, of the seed pixel and the pixel under consideration, as shown in Figure 2. Hence, $d_C$ combines both the hue and saturation (chromatic) components of the color.
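As an illustration, the three chromatic distance measures of Equations (4) through (9) can be written as small Python functions. This is a sketch under stated assumptions, not the authors' code: pixels are taken to be (H, S, I) tuples with hue in degrees, the function names are hypothetical, the exponent parameter d of Equation (4) is passed explicitly by the caller, and a Canberra term whose denominator is zero is treated as contributing zero, a convention the text does not specify.

```python
import math


def hue_angle(h_s, h_i):
    """Equation (9): the smaller angle, in degrees, between two hue values."""
    diff = abs(h_s - h_i)
    return diff if diff < 180 else 360 - diff


def minkowski_hsi(seed, pixel, a, b, c, d):
    """Equation (4): powers a, b, c on the H, S, I differences, root 1/d."""
    total = (abs(seed[0] - pixel[0]) ** a
             + abs(seed[1] - pixel[1]) ** b
             + abs(seed[2] - pixel[2]) ** c)
    return total ** (1.0 / d)


def canberra_hsi(seed, pixel):
    """Equation (5): sum of per-component relative differences."""
    total = 0.0
    for u, v in zip(seed, pixel):
        denominator = abs(u + v)
        if denominator != 0:  # assumed convention: a 0/0 term counts as 0
            total += abs(u - v) / denominator
    return total


def cylindrical_hsi(seed, pixel):
    """Equations (6)-(8): intensity difference combined with chromatic distance."""
    d_i = abs(seed[2] - pixel[2])
    theta = math.radians(hue_angle(seed[0], pixel[0]))
    d_c_squared = (seed[1] ** 2 + pixel[1] ** 2
                   - 2 * seed[1] * pixel[1] * math.cos(theta))
    # Clamp tiny negative rounding errors before taking the root.
    return math.sqrt(d_i ** 2 + max(0.0, d_c_squared))


if __name__ == "__main__":
    seed, pixel = (350, 40, 50), (10, 40, 50)
    # Hues 350 and 10 degrees are only 20 degrees apart on the hue circle.
    print(hue_angle(seed[0], pixel[0]))
    print(cylindrical_hsi(seed, pixel))
    print(canberra_hsi(seed, pixel))
```

Note that only the Cylindrical metric accounts for the circular nature of hue: in Equations (4) and (5) the hues 350 and 10 degrees are treated as far apart, whereas Equation (9) treats them as neighbors.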
An examination of the metric in Equation (6) shows that it can be considered a form of the Minkowski metric in Equation (3). More specifically, it can be considered the Euclidean distance ($L_2$ norm) metric corresponding to p = 2 in Equation (3). A pixel is assigned to a region if the value of the metric $d_{cyl}$ is less than a threshold $T_{chrom}$.

Figure 2. The chromatic plane of the HSI color model. (Labels: R, G, B; seed pixel at saturation $S_s$; pixel under consideration at saturation $S_i$; angle θ between them.)

3. EXPERIMENTAL RESULTS

The performance of the region growing segmentation algorithm was tested with a number of different images using all three distance metrics. The original Claire, Carphone, and Mother Daughter images are displayed in Figures 3(a), 4(a), and 5(a).

The homogeneity criterion used for the achromatic pixels was the same for all three algorithms run. If the difference in intensity values between the seed pixel and the pixel under consideration is less than a threshold value $T_{achrom}$, then the pixel is assigned to the region belonging to the seed pixel. It was found that the best results were obtained with a threshold value of 15. This was the value used in the three algorithms for the different chromaticity distance metrics.

With the generalized Minkowski distance metric a variety of values were used for the parameters a, b, c, and $T_{chrom}$. Figure 3(b) shows the segmented image of Claire with a = b = c = 2 and $T_{chrom}$ = 15. As can be seen, the results with these parameter values are not very good: the lips of the person in the image are lost. It was found that if more emphasis was put on the saturation component of the pixels and less emphasis on the hue component, better results were obtained. That is, better results are obtained for b > c > a. Figure 3(c) shows the segmented image of Claire with a = 2, b = 4, c = 3, and $T_{chrom}$ = 15. The segmented image is better than the previous one. Through experimentation, it was found that the best results with this metric were obtained for parameter values of a = 1, b = 3, c = 2, and $T_{chrom}$ = 15. Figures 3(d), 4(b), and 5(b) show the segmented images for these parameters. These images show the effectiveness of the segmentation algorithm with the Minkowski distance metric.

The results obtained when using the Canberra distance metric were not very different from the ones obtained with the Minkowski metric. Figures 3(e), 4(c), and 5(c) show the segmented images of Claire, Carphone, and Mother Daughter, respectively, with $T_{chrom}$ = 0.25. The segmented Carphone image with the Canberra metric (Figure 4(c)) and the segmented image with the Minkowski metric (Figure 4(b)) are exactly the same. The other images show slightly better results with the Canberra metric.

The best results were obtained with the Cylindrical distance metric. Figures 3(f), 4(d), and 5(d) show the segmented images of Claire, Carphone, and Mother Daughter, respectively, with $T_{chrom}$ = 15. Each of the segmented images shows a slight improvement over the results obtained using the other metrics. Over a varied set of images the Cylindrical metric gave the best segmentation results.

4. CONCLUSIONS

A color image segmentation scheme using the HSI color space was used to segment video-conferencing type images. The technique was found to be robust and relatively computationally inexpensive. The effectiveness of the scheme was improved by classifying the pixels in the image as either chromatic or achromatic depending on their intensity and saturation values. Each group is then segmented via the region growing algorithm. Three distance metrics were compared for the segmentation of the chromatic pixels. Of the three, the Cylindrical metric showed the most promise. It gives the best results over a varied set of images. This may be due to the relationship of the hue and saturation components in the metric.
This scheme is being further enhanced to incorporate a better method of selecting the seed pixels and a merging algorithm that will merge (in color) disjoint regions that are similar in color. This merging algorithm will be used to further decrease the number of colors in the image.

Figure 3. Claire image. (a) Original. (b) Segmented using the Minkowski metric with a = b = c = 2 and $T_{chrom}$ = 15. (c) Segmented using the Minkowski metric with a = 2, b = 4, c = 3, and $T_{chrom}$ = 15. (d) Segmented using the Minkowski metric with a = 1, b = 3, c = 2, and $T_{chrom}$ = 15. (e) Segmented using the Canberra metric with $T_{chrom}$ = 0.25. (f) Segmented using the Cylindrical metric with $T_{chrom}$ = 15.

Figure 4. Carphone image. (a) Original. (b) Segmented using the Minkowski metric with a = 1, b = 3, c = 2, and $T_{chrom}$ = 15. (c) Segmented using the Canberra metric with $T_{chrom}$ = 0.25. (d) Segmented using the Cylindrical metric with $T_{chrom}$ = 15.

Figure 5. Mother Daughter image. (a) Original. (b) Segmented using the Minkowski metric with a = 1, b = 3, c = 2, and $T_{chrom}$ = 15. (c) Segmented using the Canberra metric with $T_{chrom}$ = 0.25. (d) Segmented using the Cylindrical metric with $T_{chrom}$ = 15.

REFERENCES

1. R. Ohlander, K. Price, and D. R. Reddy, "Picture segmentation using a recursive splitting method," Computer Graphics and Image Processing 8, pp. 313-333, 1978.
2. Y. Ohta, T. Kanade, and T. Sakai, "Color information for region segmentation," Computer Graphics and Image Processing 13, pp. 222-241, 1980.
3. K. Holla, "Opponent colors as a 2-dimensional feature within a model of the first stages of the human visual system," Proc. of the 6th Int. Conf. on Pattern Recognition, pp. 161-163, 1982.
4. S. Tominaga, "Color image segmentation using three perceptual attributes," Proc. CVPR'86, pp. 628-630, 1986.
5. S. L. Horowitz and T. Pavlidis, "Picture segmentation by a directed split-and-merge procedure," in Visual Communications and Image Processing, Proc. of the SPIE 1818, pp. 1168-1181, 1974.
6. A. Tremeau and N. Borel, "A region growing and merging algorithm to color segmentation," Pattern Recognition 30-7, pp. 1191-1203, 1997.
7. A. Koschan, "A comparative study on color edge detection," Proc. of the 2nd Asian Conference on Computer Vision III, pp. 574-578, 1995.
8. N. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition 26-9, pp. 1277-1294, 1993.
9. N. Ikonomakis, K. N. Plataniotis, and A. N. Venetsanopoulos, "Grey-scale and colour image segmentation via region growing and region merging," Canadian Journal of Electrical and Computer Engineering 28-1,2, pp. 43-47, 1998.
10. M. Celenk, "A color clustering technique for image segmentation," Computer Vision, Graphics, and Image Processing 52, pp. 145-170, 1990.
11. M. Chapron, "A new chromatic edge detector used for color image segmentation," Proc. of the 11th Int. Conf. on Pattern Recognition III-C, pp. 311-314, 1992.
12. S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images," IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-6 no. 2, pp. 721-741, 1984.
13. H. Derin and H. Elliott, "Modeling and segmentation of noisy and textured images using Gibbs random fields," IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-9 no. 1, pp. 39-55, 1987.
14. R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis, John Wiley and Sons, New York, 1973.
15. A. D. Gordon, Classification, Chapman & Hall, London, 1981.
16. D. C. Tseng and C. H. Chang, "Color segmentation using perceptual attributes," Proc. 11th Int. Conference on Pattern Recognition III:Conf C., pp. 228-231, 1992.
17. R. C. Gonzalez and R. E. Woods, Digital Image Processing, Addison-Wesley, Massachusetts, 1992.