Content-based Color Image Retrieval Based on Statistical Methods using Multiresolution Features

A thesis submitted in partial fulfilment of the requirements for the award of the degree of
Doctor of Philosophy in Computer Science and Engineering

by
M. KAMARASAN, Roll No.: 0736050002

Under the guidance of
Dr. K. SEETHARAMAN, M.Sc., M.S., Ph.D.,
Associate Professor of Computer Science and Engineering

Department of Computer Science and Engineering
Faculty of Engineering and Technology
ANNAMALAI UNIVERSITY
Annamalainagar 608 002
Tamil Nadu, India
June, 2014
ANNAMALAI UNIVERSITY

CERTIFICATE

This is to certify that the thesis entitled "Content-based Color Image Retrieval Based on Statistical Methods using Multiresolution Features" is a bona fide record of the research work done by Mr. M. KAMARASAN, Research Scholar, Department of Computer Science and Engineering, under my guidance for the award of the degree of Doctor of Philosophy in Computer Science and Engineering during the period 2007-2014, and that this thesis has not previously formed the basis for the award of any degree, diploma, associateship, fellowship or other similar title to the candidate. This is also to certify that the thesis represents the independent work of the candidate.

Annamalainagar
Date:

Research Guide
Dr. K. SEETHARAMAN
Associate Professor
Department of Computer Science and Engineering
Annamalai University
DECLARATION

I hereby declare that the thesis entitled "Content-based Color Image Retrieval Based on Statistical Methods using Multiresolution Features", submitted by me for the award of the degree of Doctor of Philosophy in Computer Science and Engineering, Annamalai University, is the result of my original and independent work done during the period from 2007 to 2014 in the Department of Computer Science & Engineering, Annamalai University, Annamalainagar. This work has not formed the basis for the award of any degree, diploma, associateship, fellowship or similar title of this University or any other university on any previous occasion.

Annamalainagar
Date:

(M. KAMARASAN)
Acknowledgments

Writing a Ph.D. thesis is a daunting task. While personal perseverance is paramount, the task cannot be fulfilled without a kind and ardent mentor. I would like to take this opportunity to thank my guide, Dr. K. Seetharaman, Associate Professor of Computer Science and Engineering, for his patient and inspiring supervision throughout the thesis writing process. At many stages of the writing process, his profound expertise and professional knowledge provided crucial input to the technical solutions. The regular discussions with him have helped me learn to conduct independent research.

I am grateful to Dr. V. Ramalingam, Professor and Head, Department of Computer Science and Engineering, for his constant encouragement. I thank him for the excellent research environment he has created, in which I could learn and pursue my research work.

I sincerely thank my friends and research colleagues Mr. R. Ragupathy, Mr. S. Sathiamoorthy, Mrs. M. Thillaikarasi, and Mr. M. Jeyakarthic, Department of Computer Science and Engineering, for sparing their time to give me moral support and help. I would like to offer special thanks to Mr. G. Manikandan, Assistant Professor, School of Computing, SASTRA University, Thanjavur, for his valuable and timely help.

My deepest thanks go to all the members of the Department of Computer Science & Engineering, the Computer Science & Engineering Wing, DDE, and everyone at Annamalai University for their concern in the completion of my endeavor.

Great thanks to my wife, Dr. A. Shalini Maya, and my daughter, K. Theekshnakkriya, for their immeasurable support and understanding. I hope they will be rewarded for the joys they have sacrificed and for the rock-solid support they have given me during these seven years. I want them to know that this thesis is the work of the whole family, rather than mine individually.

I express my sincere thanks to my father-in-law, Prof. Dr. P Angamuthu, and Mrs. Shanthi Anagamuthu, who have supported me in every circumstance throughout my endeavor.

My deepest thanks to my mother, Mrs. M. Unnamalai Ammal, my brothers Er. M. Thirumal, Er. M. Ravichandran, and Dr. M. Elavarasu, and my sister Mrs. M Gowsalai for their love, care and support throughout my academic career.

My humble gratitude to the Almighty, without whose mercy and grace this endeavor would not have been possible.

M. KAMARASAN
Contents

Chapter No. Title

List of Tables
List of Figures
List of Abbreviations
Abstract

1 Introduction
1.1 Background
1.2 Structure of CBIR system
1.3 Applications of CBIR
1.4 Features for CBIR
1.5 Similarity method and performance measurement
1.6 Related work
1.7 Contribution of the thesis
1.7.1 Study of Haar wavelet based image feature
1.7.2 Study of Multiresolution with statistical features based method
1.7.3 Study of Wavelet based orthogonal polynomial
1.8 Outline of the thesis

2 Proposed Features Extraction Methods
2.1 Overall structure of the proposed retrieval method
2.2 Image decomposition method
2.2.1 Pyramidal structure
2.2.2 Construction of optimum level
2.3 Haar Wavelet Based Image Feature
2.3.1 Subband characterization with Gaussian distribution function
2.3.2 Feature extraction
2.3.3 Image retrieval method
2.4 Multiresolution with Statistical Features Based Method
2.4.1 Feature Extraction techniques
2.4.2 Similarity measure
2.5 Wavelet Based Orthogonal Polynomial Model
2.5.1 Multiresolution structure with orthogonal polynomial
2.5.2 Wavelet packet
2.5.3 Feature extraction
2.5.4 Similarity measure

3 Experimental Settings and Database Design
3.1 Nature and texture image database
3.2 Diabetic retinopathy database (DRD)
3.3 Digital database for screening mammography

4 Results and Discussion
4.1 Haar wavelet based image feature
4.2 Multiresolution with statistical features based method
4.3 Wavelet based orthogonal polynomial method

5 Conclusion and Future Work

Bibliography
List of Publications
List of Tables

Table No. Table Caption

2.3.1 The distance between the query image and image database using Bhattacharyya method
2.3.2 Radian and Degree of angle between query and target images
2.4.1 Computation of color histogram and color autocorrelation
2.4.2 The distance values between the query image and VisTex image database using Minkowski-form distance from pyramid image up to level 5
3.1 Class of VisTex database
3.2 Class of Corel Database
4.1.1 Distance values of top relevant images against query image using VisTex database
4.1.2 Feature vector dimension and its space of the retrieval methods
4.1.3 Target feature vector dimension of the retrieval methods
4.1.4 Performance measure of the proposed system with existing methods
4.2.1 Performance measure of the proposed system with other existing methods
4.2.2 Feature vector dimension, feature extraction, and searching time of the query image
4.3.1 Performance measure of the proposed system with other existing methods
4.3.2 Estimation of computation time of the proposed system
List of Figures

Figure No. Figure Caption

1.1 Two categories of image retrieval
1.2 Generic content-based image retrieval architecture
1.3 Three categories of image feature
2.1 Block diagram of proposed CBIR frameworks
2.2.1 Three-level pyramid decomposition structure
2.2.2 Decomposition schemes: (a) Wavelet transform of an image pyramid decomposition scheme (b) Pyramid decomposition
2.2.3 Pyramidal structure
2.2.4 Pyramidal image of five level decomposition
2.3.1 Matrix representation of the Discrete Haar Wavelet Transform
2.3.2 The goodness of fit of the generalized Gaussian representation of subbands of pyramid image
2.3.3 Graphical representation between query image and target image database
2.4.1 Daubechies-4 Wavelet Transform
2.4.2 Matrix representation of Daubechies-4 Wavelet Transform with eight element signals
2.4.3 Two levels wavelet decomposition of an image using Daubechies-4
2.4.4 Block diagram of the proposed system
2.4.5 3-level wavelet decomposition structure
2.4.6 Lena image and its Color Histogram
2.4.7 Images with the same number of gray levels, but the different spatial distributions
2.4.8 The process of color feature extraction
2.4.9 Process of texture feature extraction
2.4.10 The performance curve of the proposed progressive method using VisTex database
2.5.1 Wavelet packet decomposition
2.5.2 Wavelet packet transform of a diabetic retinopathy image decomposition tree level at 2
2.5.3 Block diagram of proposed system
2.5.4 Process of color feature extraction
2.5.5 Co-occurrence matrix generation for Ng=4 levels and four different offsets: P_H (0°), P_RD (45°), P_V (90°) and P_LD (135°)
2.5.6 Process of texture features extraction
2.5.7 Flowchart of the proposed feature vector
3.1 Images sampled from (a) VisTex database (b) Corel database (c) Brodatz database
3.2 Samples of retinopathy fundus images
3.3 Sample of mammography images
4.1.1 Examples of some retrieval results from both VisTex and Corel databases
4.1.2(a) Graphical representation of the performance comparison of the proposed method with existing CBIR techniques for Corel database
4.1.2(b) Graphical representation of the performance comparison of the proposed method with existing CBIR techniques for VisTex database
4.2.1(a) Performance comparison of the proposed method with existing CBIR methods for Corel database
4.2.1(b) Performance comparison of the proposed method with existing CBIR methods for VisTex database
4.2.2 The first column represents the query image and neighboring columns are relevant images
4.3.1 Diabetic Retinopathy Database (DRD)
4.3.2 Digital database for screening mammography (DDSM)
4.3.3(a) Performance comparison of the proposed method with existing CBIR methods for DRD database
4.3.3(b) Performance comparison of the proposed method with existing CBIR methods for DDSM database
List of Abbreviations

ACR : American College of Radiology
ANMRR : Average Normalized Modified Retrieval Rank
ASSET : Automatic Search and Selection Engine with Retrieval Tools
BDIP : Block Difference of Inverse Probabilities
BM : Bhattacharyya measure
BVLC : Block Variation of Local Correlation Coefficients
CBIR : Content Based Image Retrieval
CCV : Color Coherence Vector
CIE : International Commission on Illumination
CSS : Curvature Scale Space
CT : Computed Tomography
DB : Database
DBPSP : Difference Between Pixels of Scan Pattern
DCT : Discrete Cosine Transform
DDSM : Digital Database for Screening Mammography
DHWT : Discrete Haar Wavelet Transforms
DRD : Diabetic Retinopathy Database
DWT : Discrete Wavelet Transform
ER : Extended Rectangle
FE : Feature Extraction
GLCM : Gray-Level Co-occurrence Matrix
HMMD : Hue, Min, Max, Diff
HPF : High Pass Filter
HSV : Hue, Saturation, Value color space
HVS : Human Visual System
IRMA : Image Retrieval in Medical Applications
LBP : Local Binary Pattern
LCH : Local Color Histogram
LPF : Low Pass Filter
ML : Maximum Likelihood
MPEG : Moving Picture Experts Group
NHANES : The Second National Health And Nutrition Examination Survey
PWT : Pyramid-structured Wavelet Transform
QBIC : Query By Image Content
RCWF : Rotated Complex Wavelet Filters
RGB : Red, Green, Blue color space
ROI : Region of Interest
SM : Similarity Measurement
SPIRS : Spine Pathology & Image Retrieval System
VIR : Visual Information Retrieval
WALRUS : WaveLet-based Retrieval of User-specified Scenes
WBIIS : Wavelet Based Image Indexing and Searching
WINDSURF : Wavelet-Based Indexing of Images Using Region Fragmentation
WPT : Wavelet Packet Transform
WWW : World Wide Web
Abstract

The advent of large-scale digital image databases poses great challenges for content-based image analysis and retrieval. Researchers have developed a number of techniques to address this problem. Among them, content-based image retrieval (CBIR) has attracted considerable attention and plays a significant role in image retrieval. CBIR remains an active area of research and provides a strong backdrop for new methodologies and system implementations. Hence, many research contributions focus on techniques that achieve higher retrieval accuracy while keeping the computational complexity low.

This thesis proposes a new framework for content-based image retrieval in the multiresolution domain based on color and texture features. The framework comprises three methods: Haar wavelet based features; multiresolution with statistical features; and a wavelet based orthogonal polynomial model.

The Haar wavelet method extracts features such as the energy spectrum and the spatial relationship between pixels at the optimum level of a multiresolution pyramid image. Here, the wavelet transform is employed to derive the multiresolution pyramid image. The features extracted at the optimum level form the feature vector. The Bhattacharyya measure (BM) and the orthogonal cosine distance are applied to the feature vectors of the query image and of the target images in the image database to retrieve the same or similar images. The proposed method is effective and efficient where a fast response is required, since the optimum-level image contains only a few dominant wavelet coefficients, and these coefficients follow the Gaussian distribution model. The features extracted at the optimum level are optimal in the sense that they represent the images sufficiently with a minimum number of features. As a result, the storage space is reduced considerably.
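As an illustration of the two similarity measures named above, the following is a minimal sketch rather than the thesis implementation. It assumes each subband is summarized by the mean and variance of a fitted univariate Gaussian and uses the standard closed-form Bhattacharyya distance between two such Gaussians; the exact form used in the thesis is not given in this abstract. The function names `bhattacharyya_gaussian` and `cosine_angle` are hypothetical.

```python
import math


def bhattacharyya_gaussian(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two univariate Gaussians.

    Assumption: each subband is modelled by a Gaussian N(mu, var).
    """
    if var1 <= 0 or var2 <= 0:
        raise ValueError("variances must be positive")
    var_sum = var1 + var2
    # Term driven by the separation of the means.
    mean_term = 0.25 * (mu1 - mu2) ** 2 / var_sum
    # Term driven by the mismatch of the spreads.
    spread_term = 0.5 * math.log(var_sum / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + spread_term


def cosine_angle(u, v):
    """Angle in radians between two feature vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("zero vector has no direction")
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


if __name__ == "__main__":
    # Hypothetical subband statistics for a query and a target image.
    print(bhattacharyya_gaussian(0.2, 1.5, 0.4, 1.1))
    angle = cosine_angle([0.9, 0.1, 0.3], [0.8, 0.2, 0.4])
    print(angle, math.degrees(angle))
```

Smaller values of either measure indicate a closer match; identical Gaussians give a Bhattacharyya distance of zero, and orthogonal feature vectors give an angle of 90 degrees.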
The wavelet transform with statistical features method extracts multiresolution color and texture features at the optimum level in order to improve on the performance of the Haar wavelet based retrieval method. The color autocorrelogram features are extracted from the Hue (H) and Saturation (S) components of the HSV color space, and a set of texture features is extracted from the Value (V) component. Both sets of features are extracted from the optimum-level image, and together they form the feature vector. Here, the Daubechies-4 discrete wavelet transform (DWT) is employed to derive the multiresolution pyramid image. The Minkowski-form distance is used to measure the distance between the query image and each target image in the image database. The experimental results show that the proposed system achieves better retrieval accuracy at the optimum level; moreover, it is very fast, with a very low computational load. In this study, we found no difference between the coarse and fine levels of the pyramid image, so we extract the features at the fine level instead of the coarse level. Since the fine level contains fewer dominant wavelet coefficients, this reduces the time and space complexity.

In the wavelet based orthogonal polynomial method, low-order and high-order polynomial models are applied to the low-frequency and high-frequency subbands, respectively. Because the wavelet based orthogonal polynomial model spatially localizes the frequency information in the wavelet subbands, it increases the retrieval recognition rate. The wavelet packet transform (WPT) is adopted to construct the 2-level complete multiresolution binary tree. Color features are extracted from the low-frequency subband using the color autocorrelogram method, whereas a set of texture features is extracted from the high-frequency subband based on the co-occurrence matrix. These features form the feature vector, and the Manhattan distance is used to compute the distance between the query image and the target images in the image database. The proposed system has the advantage of increasing the retrieval accuracy and decreasing the retrieval time.

In order to evaluate the performance of the proposed method, we construct an image database from various sources, which includes texture and structure images as well as medical images.
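The Minkowski-form and Manhattan distances mentioned above are closely related: the Manhattan distance is the Minkowski-form distance of order 1. The following minimal sketch shows how a query feature vector might be ranked against a database under this family of distances. The function names, the dictionary layout of the database, and the feature values are illustrative assumptions, not taken from the thesis.

```python
def minkowski_distance(query, target, p=2):
    """Minkowski-form distance of order p between two feature vectors.

    p=1 gives the Manhattan distance, p=2 the Euclidean distance.
    """
    if len(query) != len(target):
        raise ValueError("feature vectors must have the same length")
    if p < 1:
        raise ValueError("order p must be at least 1")
    total = sum(abs(q - t) ** p for q, t in zip(query, target))
    return total ** (1.0 / p)


def rank_database(query, database, p=1):
    """Return image names ordered from most to least similar.

    `database` is assumed to map an image name to its feature vector.
    """
    scored = [
        (minkowski_distance(query, features, p), name)
        for name, features in database.items()
    ]
    # Sort by distance; ties fall back to the image name.
    return [name for _, name in sorted(scored)]


if __name__ == "__main__":
    # Hypothetical three-element feature vectors.
    database = {
        "bark_01": [0.10, 0.80, 0.30],
        "brick_04": [0.70, 0.20, 0.90],
        "bark_02": [0.12, 0.75, 0.33],
    }
    print(rank_database([0.11, 0.78, 0.31], database, p=1))
```

Because only the ordering of the distances matters for retrieval, the choice of p mainly affects how strongly a single large feature difference is penalized.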
The comparisons with existing methods are carried out in terms of precision and recall. The proposed system yields better results than the existing techniques.

Keywords: Optimum level; CBIR; Multiresolution; Wavelet transform; Wavelet packet; Autocorrelogram; GLCM; Orthogonal polynomial.
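For reference, the precision and recall measures used in the comparisons can be sketched as follows, using their standard definitions for a single query. The function name and the image identifiers are hypothetical, and the averaging over queries used in the experiments is not specified in this abstract.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall for one query.

    retrieved: list of image identifiers returned by the system.
    relevant:  collection of identifiers that are truly relevant.
    """
    relevant = set(relevant)
    hits = sum(1 for item in retrieved if item in relevant)
    # Precision: fraction of retrieved images that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant images that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    retrieved = ["img3", "img7", "img1", "img9"]
    relevant = ["img1", "img3", "img5"]
    print(precision_recall(retrieved, relevant))
```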