Multiresolution Analysis of Connectivity

Atul Sajjanhar 1, Guojun Lu 2, Dengsheng Zhang 2, Tian Qi 3

1 School of Information Technology, Deakin University, 221 Burwood Highway, Burwood, VIC 3125, Australia
atuls@deakin.edu.au
2 Gippsland School of Computing & Information Technology, Monash University, Northways Road, Churchill, VIC 3842, Australia
{guojun.lu, dengsheng.zhang}@infotech.monash.edu.au
3 Media Division, Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613
tian@i2r.a-star.edu.sg

Abstract. Multiresolution histograms have been used for indexing and retrieval of images. Traditionally, multiresolution histograms are 2d-histograms which encode pixel intensities. Earlier we proposed a method for decomposing images by connectivity. In this paper, we propose to encode the centroidal distances of an image in multiresolution histograms; the image is decomposed beforehand by connectivity. The multiresolution histograms thus obtained are 3d-histograms which encode connectivity and centroidal distances. The statistical technique of Principal Component Analysis is applied to the multiresolution 3d-histograms and the resulting data is used to index images. The distance between two images is computed as the L2-difference of their principal components. Experiments are performed on Item S8 within the MPEG-7 image dataset. We also analyse the effect of pixel intensity thresholding on multiresolution images.

1. Introduction

A multiresolution histogram is a family of histograms obtained for multiple resolutions of an image. A multiresolution histogram overcomes the inability of a single histogram to
encode the spatial features of images [7]. Multiresolution histograms of image intensities have been used extensively for retrieval of images and video from visual databases [2][7]. Multiresolution histograms are robust to noise.

We use the concept of the multiresolution histogram to encode the centroidal distances of an image. The centroidal distance of a point is its distance from the centroid. A centroidal distance histogram is obtained by discretising the centroidal distance of each point in an image into a bucket. Before obtaining centroidal distance histograms, however, images need to be normalised for scale. The method is inherently invariant to rotation and translation. The multiresolution histogram based on centroidal distances alone is the baseline against which we compare the proposed approach.

Based on the previously proposed concept of connectivity [1], we show how multiresolution histograms which encode centroidal distances and connectivity can be used effectively for shape-based retrieval of images. We evaluate the proposed method against the traditional approach (described above), which does not use connectivity.

The proposed method is described in Section 2. Experimental results are presented in Section 3. Finally, the discussion and conclusion are presented in Sections 4 and 5 respectively.

2. Proposed Method

In this section, we describe the proposed method for image retrieval. The proposed method is based on connectivity [1]. First, we briefly explain connectivity. Connectivity is used to decompose images based on the state of the nearest 8-neighbour pixels. Consider the sample image shown in Fig. 1(a); we refer to the dark pixels as OFF. The state of the nearest 8-neighbours is computed for each OFF pixel. The connectivity of an OFF pixel is the number of OFF pixels amongst its nearest 8-neighbours. Fig. 1(b) provides additional information for the image in Fig. 1(a).
For each OFF pixel within the image, the connectivity can take values 0 through 8. A connectivity of 0 indicates that none of the nearest 8-neighbours is OFF. A connectivity of 8 indicates that all of the nearest 8-neighbours are OFF.

Fig. 1. Image decomposition by connectivity (panels (a) and (b))
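The connectivity computation described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: the function names `connectivity_map` and `connectivity_histogram` are hypothetical, the image is assumed to be given as a boolean array that is True for OFF pixels, and pixels outside the image border are assumed not to be OFF.

```python
import numpy as np


def connectivity_map(off):
    """Count the OFF pixels among the 8 nearest neighbours of every OFF pixel.

    `off` is a 2-D boolean array, True where a pixel is OFF.
    Pixels that are not OFF are marked with -1.
    Assumption: positions outside the image border are treated as not OFF.
    """
    off = np.asarray(off, dtype=bool)
    rows, cols = off.shape
    # Pad with one ring of non-OFF pixels so that border pixels have 8 neighbours.
    padded = np.pad(off, 1, mode="constant", constant_values=False).astype(int)
    counts = np.zeros((rows, cols), dtype=int)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # a pixel is not its own neighbour
            counts += padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
    return np.where(off, counts, -1)


def connectivity_histogram(off):
    """Number of OFF pixels at each connectivity value 0 through 8."""
    conn = connectivity_map(off)
    return np.bincount(conn[conn >= 0], minlength=9)


# A solid 3x3 block: the centre has connectivity 8, edges 5, corners 3.
block = np.ones((3, 3), dtype=bool)
print(connectivity_map(block))
print(connectivity_histogram(block))
```

Each OFF pixel is visited a constant number of times, so the cost is linear in the image size; grouping the OFF pixels by their connectivity value gives the decomposition of the image.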
Consider the image in Fig. 1(a); multiresolution images for this image are shown in Fig. 2. These are obtained by convolving the original image with a Gaussian filter. Centroidal distances are obtained for each resolution of an image, which is decomposed by connectivity. 3d-histograms are computed which encode centroidal distances and connectivity. The multiresolution histogram h is a family of 3d-histograms for multiple image resolutions. The feature vector for an image consists of this family of 3d-histograms. The dimensionality of the feature vector depends on the number of resolutions used for an image.

Fig. 2. Multiple resolutions of sample image (panels (a) and (b))

Principal Component Analysis (PCA) is a statistical approach for reducing the dimensionality of data [3][4][5]. We apply PCA to the feature vectors and the resulting low-dimensional data is used to index images. PCA transforms a number of (possibly) correlated variables into a (smaller) number of uncorrelated variables called principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible.

We briefly explain the theory behind PCA. Performing PCA is equivalent to performing Singular Value Decomposition (SVD) on the covariance matrix of the data. The singular value decomposition of the data matrix A is A = U S V', where the matrices U and V are orthogonal. The columns of U are called the left singular vectors and the columns of V (the rows of V') are called the right singular vectors. To obtain U and V, the eigenvectors and eigenvalues of A.A' and A'.A are calculated. Multiplying A by its transpose results in a square matrix. The columns of V are the eigenvectors of A'.A and the columns of U are the eigenvectors of A.A'.
The nonzero eigenvalues of A.A' and A'.A are the same, and their square roots form the diagonal entries of S. The diagonal entries of S are the singular values of the original matrix, A. Each eigenvector described above represents a principal component. PC1 (Principal Component 1) is defined as the eigenvector with the highest corresponding eigenvalue. Each eigenvalue is proportional to the variance captured by its principal component: the higher the eigenvalue, the more variance is captured.

The outcome of PCA on the multiresolution 3d-histograms for the image in Fig. 1 is shown in Fig. 3. The first eigenvector (P1) has the largest eigenvalue and points in the direction of the largest variance. The second (P2) and the third (P3) eigenvectors are orthogonal to the first one.
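The relation between SVD and PCA can be illustrated with a short sketch. It assumes that each row of the data matrix is one observation and each column one variable, and that the data is mean-centred before decomposition; the function name `pca_via_svd` and the random test data are hypothetical and are not taken from the experiments.

```python
import numpy as np


def pca_via_svd(features):
    """PCA of an (n_samples, n_variables) matrix using SVD.

    Assumption: rows are observations and columns are variables.
    Returns (eigenvalues, components, scores); the rows of `components`
    are the principal directions in order of decreasing eigenvalue.
    """
    data = np.asarray(features, dtype=float)
    centred = data - data.mean(axis=0)
    # centred = U S V'; the rows of vt are the right singular vectors.
    _, singular_values, vt = np.linalg.svd(centred, full_matrices=False)
    # Squared singular values, scaled by n - 1, are the eigenvalues of the
    # covariance matrix, i.e. the variance captured by each component.
    eigenvalues = singular_values ** 2 / (data.shape[0] - 1)
    scores = centred @ vt.T  # data projected onto the principal components
    return eigenvalues, vt, scores


rng = np.random.default_rng(0)
sample = rng.normal(size=(50, 3)) * np.array([3.0, 1.0, 0.2])
eigenvalues, components, scores = pca_via_svd(sample)
print(eigenvalues / eigenvalues.sum())  # fraction of variance per component
```

Keeping only the first few columns of `scores` gives the low-dimensional representation; how many components to keep is a design choice that depends on how quickly the eigenvalues decay.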
In the example shown in Fig. 3, the eigenvalue of the first eigenvector is λ1 = 81.65. The other eigenvalues are λ2 = 5.3 and λ3 = 1.54. Thus, the first eigenvector contains almost all the energy (81.65 of a total of 88.49, about 92%). The data could thus be approximated in one dimension.

Fig. 3. Principal components of multiresolution histograms

Consider the sample image shown in Fig. 4(a). This image has only two pixel intensities, 0 and 255. Centroidal distances are computed for pixels of intensity 255 only. The corresponding 3d-histogram is shown in Fig. 4(b).

Fig. 4. (a) Sample image (b) Distance histogram for sample image

The image in Fig. 5(a) is a Gaussian blurred image and has pixel intensities ranging from 0 to 255, as shown in Fig. 5(b). The distance histogram in Fig. 5(c) is obtained by computing the centroidal distances for all pixels with intensities in the range 1-255; only the black pixels (intensity = 0) are ignored. The pattern in Fig. 5(c) is significantly different from that in Fig. 4(b). This is because of the fineness of the shape [2]: rates of change of histogram densities are significant for images with fine regions. Hence, we consider pixel intensity thresholding. The 3d-histogram obtained with pixel intensity thresholding is shown in Fig. 5(d). In Fig. 5(d), pixels with intensity less than 55 are ignored, that is, their centroidal distances do not contribute to the histograms. The generic process of indexing is illustrated in Fig. 6.
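The effect of pixel intensity thresholding on a centroidal distance histogram can be sketched as follows. This sketch covers only the centroidal distance axis and omits connectivity. Several details are assumptions made for illustration: the centroid is taken as the unweighted mean position of the retained pixels, scale normalisation divides by the largest centroidal distance, and the function name `centroidal_distance_histogram` and the bin count are hypothetical.

```python
import numpy as np


def centroidal_distance_histogram(image, threshold=1, bins=10):
    """Histogram of centroidal distances of pixels with intensity >= threshold.

    Assumptions: the centroid is the unweighted mean position of the retained
    pixels, and distances are divided by the largest distance so that the
    histogram does not depend on scale.
    """
    image = np.asarray(image)
    rows, cols = np.nonzero(image >= threshold)
    if rows.size == 0:
        return np.zeros(bins, dtype=int)  # nothing survives the threshold
    distances = np.hypot(rows - rows.mean(), cols - cols.mean())
    largest = distances.max()
    if largest > 0:
        distances = distances / largest
    histogram, _ = np.histogram(distances, bins=bins, range=(0.0, 1.0))
    return histogram


# One bright pixel with a faint neighbour, as a blur might produce.
image = np.zeros((5, 5), dtype=np.uint8)
image[2, 2] = 255
image[2, 3] = 40
print(centroidal_distance_histogram(image, threshold=1))   # both pixels counted
print(centroidal_distance_histogram(image, threshold=55))  # faint pixel ignored
```

With `threshold=1` only black pixels are ignored, which corresponds to the setting used for Fig. 5(c); with `threshold=55` pixels of intensity less than 55 are ignored, as for Fig. 5(d).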
Fig. 5. (a) Image after Gaussian blur (b) Pixel intensity distribution (c) 3d-histogram for Gaussian blurred image (d) 3d-histogram after pixel intensity thresholding

Fig. 6. Image indexing

Multiple resolutions of each image are obtained using Gaussian filters. Each resolution of the image is decomposed by connectivity. A family of 3d-histograms is obtained for each image. Within the histogram family, each histogram encodes the connectivity and the centroidal distances for a particular resolution of the original image. The process of obtaining histograms is preceded by pixel intensity thresholding. Pixel intensity thresholding is especially useful for fine shapes, which would otherwise have significant rates of change in histogram densities. Principal components are obtained for the
histogram family and used as an index for each image. During querying, the distance between two images is computed as the L2-difference of their principal components.

3. Experimental Results

In this section, we evaluate the performance of the proposed method. We compare the performance of multiresolution histograms obtained by the traditional method and by the proposed method. Experiments are performed on Item S8 within the MPEG-7 Still Images Content Set [6]. This is a collection of trademark images originally provided by the Korean Industrial Property Office. Item S8 consists of 3621 still images. It is divided into Sets A1, A2, A3 and A4 to test the robustness of methods to geometric and perspective transformations.

Fig. 7 shows the results of retrieval experiments on Sets A1, A2, A3 and A4 of the dataset. Experiments are performed for the proposed method and the traditional method. The proposed method, which is based on connectivity, is prefixed with "3d mra" in the legend. The traditional approach, which is based on 2d-histograms, is prefixed with "2d mra" in the legend. When computing histograms for multiple resolutions of an image, pixel intensity thresholding may be required. Results for Set A1 are obtained with and without pixel intensity thresholding. Results for Sets A2, A3 and A4 are obtained after pixel intensity thresholding.
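Querying by L2-difference and scoring a ranked result list by precision and recall can be sketched as follows. This is an illustration under assumptions rather than the evaluation code behind the reported results: each database image is assumed to be indexed by a fixed-length vector of principal components, the set of relevant images for a query is assumed to be known and non-empty, and the function names `rank_by_l2` and `precision_recall` are hypothetical.

```python
import numpy as np


def rank_by_l2(query, index):
    """Database positions ordered by increasing L2 distance to the query.

    `index` is an (n_images, n_components) array with one row per image.
    """
    index = np.asarray(index, dtype=float)
    query = np.asarray(query, dtype=float)
    distances = np.linalg.norm(index - query, axis=1)
    return np.argsort(distances, kind="stable")


def precision_recall(ranking, relevant):
    """Precision and recall after each retrieved image in `ranking`."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("at least one relevant image is required")
    hits = np.cumsum([1 if int(i) in relevant else 0 for i in ranking])
    retrieved = np.arange(1, len(ranking) + 1)
    return hits / retrieved, hits / len(relevant)


# Four indexed images; images 0 and 1 are taken to be relevant to the query.
index = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [0.5, 0.0]])
ranking = rank_by_l2([0.0, 0.0], index)
precision, recall = precision_recall(ranking, relevant={0, 1})
print(ranking, precision, recall)
```

Averaging the precision values at fixed recall levels over all queries would give an average recall-precision curve for a method.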
Fig. 7. Average recall-precision plots

4. Discussion

The improvement of the proposed method over the traditional method is conclusive. We attribute the improvement to the additional information captured by connectivity; descriptors which encode connectivity are able to discriminate better between shapes [1]. We note that the dataset does not contain fine contours. In Fig. 1, we see that the pixel density is high for connectivity = 0 and connectivity = 8. We believe that the relative improvement in the effectiveness of the proposed method will be greater when pixel densities for intermediate values of connectivity are higher. In the future, we will perform experiments on different datasets to test this hypothesis.

The computational expense of the proposed method also needs to be addressed. The proposed method requires more processing than the traditional approach, because images must additionally be decomposed by connectivity. The computational complexity of obtaining the connectivity of an image is O(n), where n is the number of foreground pixels in the image. In applications where accuracy of retrieval is important, the improvement in effectiveness may outweigh the additional processing cost.
5. Conclusion

We have proposed a novel method for shape representation and retrieval based on a combination of connectivity and multiresolution histograms. We propose to use 3d-histograms which encode the connectivity and centroidal distances of images. The experiments performed show the effectiveness of the proposed method. We also show the sensitivity of retrieval accuracy to pixel intensity thresholding. The degree of sensitivity to pixel intensity thresholding will depend on the image database and the nature of the queries. A database with a large number of fine shapes will require careful selection of the pixel intensity threshold.

In this paper, multiple image resolutions are each decomposed by connectivity and then encoded using 3d-histograms. Given multiple image resolutions which are decomposed by connectivity, the feature space can be encoded using any conventional technique for image indexing.

References

[1] A. Sajjanhar, G. Lu, D. Zhang, Discriminating Shape Descriptors Based on Connectivity, IEEE International Conference on Multimedia and Expo, Taipei, Taiwan, June 2004.
[2] E. Hadjidemetriou, M. D. Grossberg and S. K. Nayar, Multiresolution Histograms and their Use for Texture Classification, International Workshop on Texture Analysis and Synthesis, Nice, France, October 2003.
[3] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Prentice Hall, 2002.
[4] I. T. Jolliffe, Principal Component Analysis, Springer Verlag, 1986.
[5] D. Vranic, 3D Model Retrieval, PhD Thesis, University of Leipzig, 2004.
[6] http://ipsi.fraunhofer.de/delite/projects/mpeg7/
[7] E. Hadjidemetriou, M. D. Grossberg and S. K. Nayar, Multiresolution Histograms and their Use for Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 26, No. 7, July 2004.