pp.182-186
http://dx.doi.org/10.14257/astl.2017.143.38

Improved Accuracy of Spot Search on HPV DNA Microarray Chip

Jae-Hong Min 1, Chan-Young Park 2,3, Yu-Seop Kim 2,3, Hye-Jeong Song 3, Jong-Dae Kim 2,3,*

1 Department of Computer Engineering, Hallym University, Korea
2 Department of Convergence Software, Hallym University, Korea
3 Bio-IT Research Center, Hallym University, Korea
{ woghd6811, hjsong, cypark, yskim01, kimjd }@hallym.ac.kr

Abstract. There are two methods to search for spots on a DNA microarray chip: the grid method and the circle template image method. Both have the drawback of being slow. To address this, a recent study applied the running algorithm with a box template image and reduced the time by 1/10. However, accuracy dropped compared with the grid and circle template matching methods. Because the box template image used in the previous study has the same size as the spot, the circular spot and the template image do not coincide. We therefore examine how accuracy changes with the size of the template image. In this paper, we measure the accuracy of the box template image at each size and find the size with the best performance. The result provides grounds for selecting the most appropriate box template size for a given spot size.

Keywords: HPV, DNA Microarray, Image Processing

1 Introduction

HPV (human papillomavirus) is a major cause of cervical cancer, a malignant tumor that affects many women worldwide [1]. The HPV DNA chip was designed to detect HPV infection, one of the major causes of this cancer. Its sensitivity and specificity are very high (95% to 99%), and the test is efficient in time and cost. The chip consists of probes that detect 22 HPV types and markers that always react. Each probe is spotted twice to increase the reliability of the diagnosis, so that there are 44 probe spots in total, and one more set including 4 markers is arranged [2-4].
HPV DNA chips are analyzed automatically by image processing in scanners so that results are obtained quickly and accurately. The spots on an HPV DNA chip cannot be guaranteed to appear at the same position in every image, and they are not easy to find when most of the spots on the chip have not reacted. To solve this problem, previous studies reported that spot information can be found stably by combining on-off type spots with a template matching method. The position is searched according to the intensity of the pixels included at the target point, using a circular mask under the assumption that the spot is circular and has a constant diameter [3-4].

ISSN: 2287-1233 ASTL Copyright 2017 SERSC
However, one study showed that using a box template image rather than a circle template image greatly improved the speed (the box was sized to circumscribe the spot). The box template image is used to obtain the spot correlation, and the search is performed with the running algorithm. When the spot correlation exceeds a certain value, the spot is judged to have reacted [3-4]. With this method, however, the spot image does not match the shape of the template image, which results in poor search accuracy. In this paper, we propose a method to obtain the highest accuracy when the template image is a box. We also compare it with the circle template image, report the difference in accuracy, and take that difference as the tolerable error rate.

2 Material and Method

2.1 Method overview

In this paper, we use a box template image to find the spot, and we use the circle template image as the reference point for accuracy. The experiment is carried out under the assumption that the spots are circular and all have the same diameter. Accuracy is measured while varying the size of the box from inside the spot until the box circumscribes the spot. To measure accuracy, each image is correlated with the box template image and the maximum correlation value is obtained. The maximum correlation values obtained for the images are then divided into two groups at a specific value, and the interclass distance is obtained. The higher the search accuracy, the clearer the distinction between the background and the spot, so the distributions of the two groups separate more clearly and the sum of the interclass distances becomes larger. Finally, we find the box template image size with the largest distance sum.

2.2 Spot search using template matching

In this paper, the spot size is 25, and data are collected while changing the size of the box template image from 17 to 28 in order to see the change in accuracy more clearly.
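The selection rule outlined in Section 2.1 can be sketched as follows. This is only an illustration under stated assumptions: the paper does not define the interclass distance, so the sketch takes it to be the gap between the mean of the "spot" group and the mean of the "background" group. The names `interclass_distance`, `best_box_size`, `scores_by_size`, and `threshold` are hypothetical.

```python
from statistics import mean


def interclass_distance(scores, threshold):
    """Gap between the two groups of maximum correlation values.

    Assumption: the distance is the difference of the group means.
    The paper does not state the exact definition.
    """
    spot = [s for s in scores if s >= threshold]
    background = [s for s in scores if s < threshold]
    if not spot or not background:
        return 0.0  # only one group present, so there is no separation to measure
    return mean(spot) - mean(background)


def best_box_size(scores_by_size, threshold):
    """Return the box size with the largest distance, plus all distances.

    scores_by_size maps a box size to the list of maximum correlation
    values measured with a template of that size.
    """
    totals = {
        size: interclass_distance(scores, threshold)
        for size, scores in scores_by_size.items()
    }
    return max(totals, key=totals.get), totals
```

A caller would fill `scores_by_size` with one list of maximum correlation values per candidate box size (17 to 28 in this paper) and read the first returned element as the selected size.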
To compute the correlation, we divided the template into a background area and a spot (object) area, which designate the outside and inside areas respectively.
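The inside/outside split can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes that pixels in the object region O count +1, that pixels in the surrounding border region b count -1, and that the result is normalized by the area of O. It also uses a summed-area table as a stand-in for the running-sum search, whose details the paper does not give. The names `integral_image`, `box_sum`, `box_score`, `best_position`, `inner`, and `outer` are hypothetical.

```python
import numpy as np


def integral_image(img):
    """Summed-area table with a leading row and column of zeros."""
    sat = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    sat[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return sat


def box_sum(sat, top, left, size):
    """Sum of img[top:top+size, left:left+size] in constant time."""
    bottom, right = top + size, left + size
    return sat[bottom, right] - sat[top, right] - sat[bottom, left] + sat[top, left]


def box_score(sat, cy, cx, inner, outer):
    """Score of a box template centered at (cy, cx).

    inner: side length of the object region O.
    outer: side length of O plus the border region b.
    Assumed weights: +1 inside O, -1 inside b, normalized by the area of O.
    """
    hi, ho = inner // 2, outer // 2
    s_inner = box_sum(sat, cy - hi, cx - hi, inner)
    s_outer = box_sum(sat, cy - ho, cx - ho, outer)
    s_border = s_outer - s_inner
    return (s_inner - s_border) / (inner * inner)


def best_position(img, inner, outer):
    """Scan every position where the outer box fits; return (score, (cy, cx))."""
    sat = integral_image(img)
    ho = outer // 2
    best_score, best_pos = -np.inf, None
    for cy in range(ho, img.shape[0] - ho):
        for cx in range(ho, img.shape[1] - ho):
            score = box_score(sat, cy, cx, inner, outer)
            if score > best_score:
                best_score, best_pos = score, (cy, cx)
    return best_score, best_pos
```

On a binary image that contains a single 3x3 block of ones centered at (10, 10), `best_position(img, inner=3, outer=5)` gives a score of 1.0 at that center, because O is fully covered and b contains no spot pixels. Because each box sum costs four table lookups regardless of box size, the cost per position does not grow with the template size, which is one plausible reason a box template is faster than a circular mask.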
Fig. 1. Box Template Image (labels in the figure: r1, r2, B, O, b)

When an image is searched, the intensity of an image pixel is taken to be 0 for background and 1 for spot. If the template image lies over a spot region, the correlation value grows as more of the spot is included in O, and the spot can be taken to exist at the position where the value reaches its maximum. Spot pixels that fall in b each contribute -1, so the value decreases. Background pixels, which have zero intensity, do not affect the result. Once the sums over the background area and the target area are obtained, their difference is taken and averaged. The resulting value is close to 1 when the template image contains a spot and close to 0 in an area without a spot. In this way, when the image is searched and a region with the maximum correlation value is found, a spot is judged to be present there.

3 Results

Figure 2 shows the average of the maximum values of the interclass distance for each box size. The experimental results show that the interclass distance is largest when the size of the box template image is 23. This means that the distinction between the background and the spot is clearest at this size and that the measured accuracy is highest. Figure 3 shows the average of the maximum values of the interclass distance when the circle template image is used. Because the spot is itself a circle of size 25, the circle template image gives the highest accuracy at size 25. The sum of the interclass distances increased from 3471, for a box of the same size as the spot (25), to 3634 for the smaller box of size 23. Compared with the sum of 3859 for the circle template of size 25, the remaining difference is 225 (down from 3859 - 3471 = 388), which indicates that the error rate was reduced by about 42% compared with the previous paper.
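The 42% figure can be checked with a few lines of arithmetic. The sketch below reads 3471 as the sum for the box of size 25 and 3634 as the sum for the box of size 23, which is an interpretation of the reported numbers. The variable names are illustrative only.

```python
# Interclass distance sums reported in Section 3.
circle_25 = 3859  # circle template, size 25 (reference)
box_25 = 3471     # box template, size 25 (same as the spot size; assumed reading)
box_23 = 3634     # box template, size 23 (best box size)

gap_before = circle_25 - box_25  # shortfall of the size-25 box against the circle
gap_after = circle_25 - box_23   # shortfall of the size-23 box against the circle
reduction = (gap_before - gap_after) / gap_before

print(f"gap before: {gap_before}, gap after: {gap_after}, reduction: {reduction:.1%}")
# prints: gap before: 388, gap after: 225, reduction: 42.0%
```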
Fig. 2. Interclass Distance by Box Size

Fig. 3. Interclass Distance by Circle Size

4 Conclusion

In this study, we proposed a box template for spot search on the HPV DNA chip. When a box template smaller than the circle template used in the previous study was used, the accuracy improved, and a speed improvement of about 6 times was confirmed.
Acknowledgments. This research was supported by The Leading Human Resource Training Program of Regional Neoindustry through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and future Planning (NRF-2016H1D5A1909654).

References

1. Schiffman, M., Wentzensen, N., Wacholder, S., Kinney, W., Gage, J. C., Castle, P. E.: Human papillomavirus testing in the prevention of cervical cancer. Journal of the National Cancer Institute, 103(5), 368--383 (2011)
2. An, H. J., Cho, N. H., Lee, S. Y., Kim, I. H., Lee, C., Kim, S. J., Jeong, J. K.: Correlation of cervical carcinoma and precancerous lesions with human papillomavirus (HPV) genotypes detected with the HPV DNA chip microarray method. Cancer, 97(7), 1672--1680 (2003)
3. Kim, J. D., Kim, S. K., Cho, J. S., Kim, J.: Knowledge-based image processing for on-off type DNA microarray. International Symposium on Biomedical Optics. International Society for Optics and Photonics (2002)
4. Ryu, M., Kim, J. D., Min, B. G., Pang, M. G., Kim, J.: Nonlinear matching measure for the analysis of on-off type DNA microarray images. Journal of Biomedical Optics, 9(3), 432--438 (2004)
5. Manjunath, S. S.: Microarray Image Analysis. Department of Studies in Computer Science, University of Mysore, Manasagangotri, Mysore, India (2012)
6. Brandle, N., Bischof, H., Lapp, H.: A generic and robust approach for the analysis of spot array images. Proc. SPIE, 4266, 1--12 (2001)