Restoration of Motion Blurred Document Images

Bolan Su 1,2, Shijian Lu 2 and Tan Chew Lim 1
1 Department of Computer Science, School of Computing, National University of Singapore, Computing 1, 13 Computing Drive, Singapore 117417
2 Department of Computer Vision and Image Understanding, Institute for Infocomm Research, 1 Fusionopolis Way, #21-01 Connexis, Singapore 138632
{subolan,tancl}@comp.nus.edu.sg, slu@i2r.a-star.edu.sg

ABSTRACT

Motion blur often degrades the quality of document images and makes the text information within them unreadable by optical character recognition (OCR) or by a person. This paper presents a blur correction technique that aims to correct motion blur within document images. Given a blurred document image, an alpha channel map is first constructed based on specific image characteristics that are associated with text documents. Several blur parameters, including blur direction and blur extent, are then estimated from the constructed alpha channel map. Finally, the blurred document image is restored by using the Richardson-Lucy deconvolution technique based on the estimated blur parameters. Experiments on a number of document images with motion blur show that the proposed technique significantly improves the document visual quality as well as the OCR performance.

Categories and Subject Descriptors
I.4.3 [Image Processing and Computer Vision]: Enhancement - Sharpening and deblurring; I.7.5 [Document and Text Processing]: Document Capture - Document analysis

General Terms
Algorithms

Keywords
document image; blur identification; motion blur; alpha channel

1. INTRODUCTION

One of the most common artifacts in digital photography is image blur. There are two main types of blur: motion blur, which is caused by the relative motion between the camera and the object during image capture, and defocus blur, which is due to an incorrect focal length setting when taking photos.
Image blur degrades visual quality, especially for document images, where the text information is easily lost. Figure 1(a) shows a blurred document image example for which the optical character recognition (OCR) performance is greatly affected by the blur. In fact, even humans cannot read the text characters in such badly degraded documents, which renders the text information within the document unreachable. The binarization result of the example document image in Figure 1(a) loses most of the text information, as illustrated in Figure 1(c).

Figure 1: (a) a blurry image example caused by motion blur; (b) the restored image of (a); (c) the binarization result of (a); (d) the binarization result of (b).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SAC'12, March 25-29, 2012, Riva del Garda, Italy. Copyright 2011 ACM 978-1-4503-0857-1/12/03...$10.00.

The restoration of blurry images is difficult because it is an ill-posed problem: it attempts to reverse an irreversible random spreading process. The linear blur model is defined as follows:

$$B = H \otimes I + N \qquad (1)$$

where $B$ denotes the blurry image, $I$ denotes the original image, $H$ denotes the point spread function (PSF), which spreads each single bright pixel in $I$ over its neighboring pixels, $\otimes$ denotes the convolution operator, and $N$ denotes the additive noise. Many techniques have been proposed to address this problem. They can be broadly classified into two categories, namely blind deconvolution and non-blind deconvolution.
In non-blind deconvolution, the PSF is assumed to be known, so only the unblurred original image $I$ needs to be estimated. Wiener filtering [9] and Richardson-Lucy (RL) deconvolution [5] are two of the most widely used classical non-blind restoration methods because of their simplicity and efficiency. However, the PSF is usually unknown in practice. Many techniques [2, 3, 10] therefore estimate the PSF before applying the deconvolution procedure, while other approaches [11, 12] incorporate more than one image in the deconvolution process to obtain better performance.
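To make the blur model of Equation 1 and the non-blind setting concrete, the following is a minimal one-dimensional sketch of Richardson-Lucy deconvolution with a known PSF. It is an illustration under stated assumptions, not the implementation used in this paper: the function names, the zero-padded border handling, the flat initial estimate and the toy "scanline" are all hypothetical choices, and noise is omitted ($N = 0$).

```python
def convolve_same(signal, kernel):
    """Same-size convolution with zero padding; the kernel length must be odd."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + half - k  # convolution flips the kernel
            if 0 <= j < len(signal):
                acc += weight * signal[j]
        out.append(acc)
    return out


def richardson_lucy(blurred, psf, iterations=100, eps=1e-12):
    """Estimate the sharp signal I from B, given a known PSF that sums to 1."""
    n = len(blurred)
    psf_flipped = psf[::-1]
    # Column sums of the blur operator; they differ from 1 only near the borders.
    norm = convolve_same([1.0] * n, psf_flipped)
    estimate = [max(sum(blurred) / n, eps)] * n  # flat, positive starting guess
    for _ in range(iterations):
        reblurred = convolve_same(estimate, psf)
        ratio = [b / max(r, eps) for b, r in zip(blurred, reblurred)]
        correction = convolve_same(ratio, psf_flipped)
        estimate = [e * c / max(w, eps)
                    for e, c, w in zip(estimate, correction, norm)]
    return estimate


def squared_error(a, b):
    """Sum of squared differences between two equally long signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Toy scanline (an assumption): dark strokes (0.1) on a bright page (1.0).
sharp = [1.0] * 10 + [0.1] * 8 + [1.0] * 6 + [0.1] * 8 + [1.0] * 10
psf = [0.2] * 5                      # 5-pixel uniform motion blur
blurred = convolve_same(sharp, psf)  # Equation 1 with N = 0
restored = richardson_lucy(blurred, psf)
print(squared_error(blurred, sharp), squared_error(restored, sharp))
```

The multiplicative update keeps the estimate non-negative, which suits intensity images. In the blind setting addressed by this paper, the PSF passed to such a routine would first have to be estimated.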

Figure 2: The procedure of constructing the alpha channel map. (a) is the input blurry document, which is divided into blocks illustrated by red solid lines, with the overlapped borders illustrated by blue dotted lines; (b) shows one block taken from (a); (c) and (d) illustrate the histogram distribution and the corresponding Gaussian mixture distribution of (b), respectively.

To the best of our knowledge, little work has been reported on the restoration of blurred camera images of documents, where the target is to extract the text information from the blurred document images. As described in Chen et al.'s paper [1], the heavy-tailed gradient distribution prior of natural-scene images may not hold for document images, so natural-scene image deblurring methods based on the gradient distribution cannot be directly applied. There are strong edges between the background and the text in document images, which may cause strong ringing artifacts after deblurring, so the PSF needs to be estimated very accurately. Qi et al. [6] use a cepstrum analysis technique for motion blur parameter estimation, but it can only deal with motion blur with a constant acceleration.

In this paper, we focus on restoring blurred images caused by motion. As the motion is usually linear in practice, we model the motion blur as a linear spatially invariant system. A novel document image deblurring technique is proposed to automatically enhance the document visual quality and restore the lost text information. The proposed technique first builds an alpha channel map for the input blurred document. Then the blur parameters are calculated using the constructed alpha channel map. The α-motion blur constraint [2] is applied to obtain the blur direction and extent for linear motion blur. Finally, we use the RL method to recover the blurred documents. For the blurred example document image in Figure 1(a), Figure 1(b) shows the document image restored by our proposed method, and Figure 1(d) shows the binarization result of the restored document image in Figure 1(b), obtained with an established binarization method [8].

The rest of this paper is organized as follows. Section 2 describes the construction of the alpha channel map. Section 3 presents the parameter estimation for linear motion blurred images. Experimental results and discussions are then reported in Section 4, and some concluding remarks are finally drawn in Section 5.

2. ALPHA CHANNEL MAP

A digital image can be described by a two-layer image composition model [4], in which an image $I$ is viewed as a combination of an image foreground $F$ and an image background $B$ as follows:

$$I = \alpha F + (1 - \alpha) B \qquad (2)$$

where $\alpha$ is between 0 and 1. Most of the values of $\alpha$ are either 0 or 1 in an unblurred image, because there are sharp boundaries between foreground and background. In a blurred image, foreground and background are mixed together at the boundary areas, so the values of $\alpha$ there usually lie strictly between 0 and 1. Spectral matting [4] can be used for automatic extraction of the alpha channel, but it is very time consuming. Hence we propose a much faster and simpler way to extract the alpha channel for document images. In our experiments, our method obtains the alpha channel map 5 to 10 times faster than the spectral matting method.

We notice that, compared with other digital images, document images already have a clear foreground-background distribution. It is useful to employ this characteristic of document images for alpha channel extraction. In a document image, the text pixels are viewed as foreground and the other pixels as background, where the text pixels are usually assumed to be darker than the background pixels. The intensities of text and background may vary in different regions of a document image. As shown in Figure 1(a), the text pixels at the bottom are much darker than those at the top.
So it is better to analyze the intensity histogram in a small region of the document instead of the whole image. The overall extraction procedure is shown in Figure 2. The input blurry document is first divided into blocks with overlapped borders. The block size is 50 × 50 and the border size is 10. The histogram distribution within a block is then much clearer to analyze. If the block contains mostly background or mostly text, the histogram should have only one peak. Otherwise, more than one peak will appear within the histogram distribution, with one denoting the text stroke intensity and the other denoting the background intensity. Figure 3 shows examples of these kinds of blocks. The first column shows an image block that consists of background only, and its corresponding histogram has only one peak. The second column shows an image block in which most of the area is text, and the corresponding histogram has one high peak for the text and one small peak for the background. The third column gives an image block with both text and background areas, and the corresponding histogram has two significant peaks.

From Equation 2, since the pixel intensity $I$ is known, the alpha value of each pixel can be easily derived given the foreground and background boundary intensities $F$ and $B$. This is shown in Equation 3, whose middle case follows from solving Equation 2 for $\alpha$ with $F = I_{fp}$ and $B = I_{bp}$:

$$\alpha_i = \begin{cases} 1, & I_i \le I_{fp} \\ 0, & I_i \ge I_{bp} \\ \dfrac{I_{bp} - I_i}{I_{bp} - I_{fp}}, & \text{otherwise} \end{cases} \qquad (3)$$

where $\alpha_i$ denotes the alpha value of a given pixel, $I_i$ denotes the intensity of the given pixel, and $I_{fp}$, $I_{bp}$ denote the boundary intensities of foreground and background, respectively.
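The block division and the per-pixel alpha computation can be sketched as follows. This is a hedged illustration only: `alpha_value` solves Equation 2 for $\alpha$ with the boundary intensities as $F$ and $B$ and clamps the result to [0, 1], and `block_spans` encodes one possible reading of "block size 50 with border size 10", namely a 50-pixel non-overlapping core extended by 10 pixels on each side. Both function names and that reading are assumptions.

```python
def alpha_value(intensity, i_fp, i_bp):
    """Map one pixel intensity to an alpha value in [0, 1].

    i_fp and i_bp are the foreground and background boundary intensities;
    dark text on a lighter background is assumed, so i_fp < i_bp.
    """
    if i_bp <= i_fp:
        raise ValueError("expected i_fp < i_bp")
    if intensity <= i_fp:
        return 1.0  # pure foreground (text)
    if intensity >= i_bp:
        return 0.0  # pure background
    return (i_bp - intensity) / (i_bp - i_fp)


def block_spans(length, core=50, border=10):
    """Yield (start, stop, core_start, core_stop) along one image axis.

    [core_start, core_stop) tiles the axis without overlap, while
    [start, stop) extends each tile by `border` pixels on both sides,
    clipped to the image. Apply it once per axis to get 2-D blocks.
    """
    for core_start in range(0, length, core):
        core_stop = min(core_start + core, length)
        yield (max(core_start - border, 0),
               min(core_stop + border, length),
               core_start,
               core_stop)


print(alpha_value(120, 40, 200))
print(list(block_spans(120)))
```

Under this reading, the histogram of each block is analyzed over the extended span, and only the alpha values of the core span are written into the final map.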

Figure 3: Image block examples taken from Figure 1(a). The first row shows the image blocks, and the second row shows the corresponding gray-scale image histograms.

The remaining issue is to determine the foreground and background boundary intensities. In ideal cases, the foreground and background boundary intensities $I_{fp}$ and $I_{bp}$ in Equation 3 can be directly set to the intensities of the two peaks in a document image block. In practice, however, there is intensity variation within the text region and the background region, and the background and foreground intensities may shift due to the blur. The pixel intensities spread to both sides of the peak intensity, as shown in Figure 2(c). So the foreground boundary intensity should be smaller than the foreground peak value, while the background boundary intensity should be larger than the background peak value. We therefore use a Gaussian mixture model to fit the histogram distribution, so that each peak is aligned to one Gaussian distribution, as represented in Figure 2(d). If only one Gaussian distribution is derived from the histogram, the region is regarded as pure foreground or pure background, and the alpha value of such a block is set to 0 or 1 according to the binarization result of the testing document. If two Gaussian distributions are obtained, one is aligned to the background peak and the other to the foreground peak.
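As an illustration of fitting a Gaussian mixture to the intensities of one block, the sketch below runs a plain expectation-maximization (EM) loop for two one-dimensional Gaussians. It is an assumption-laden example: the function name, the initialization, the iteration count and the minimum standard deviation are hypothetical, and the criterion for deciding whether a block needs one Gaussian or two is not covered here.

```python
import math


def fit_two_gaussians(values, iterations=100, min_std=1.0):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns [(weight, mean, std), (weight, mean, std)] sorted by mean, so
    for dark text on a light page the first entry is the foreground peak.
    """
    lo, hi = min(values), max(values)
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    stds = [max((hi - lo) / 4.0, min_std)] * 2
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each intensity.
        # The 1/sqrt(2*pi) factor cancels in the ratio and is omitted.
        resp = []
        for v in values:
            dens = [w * math.exp(-0.5 * ((v - m) / s) ** 2) / s
                    for w, m, s in zip(weights, means, stds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and spread of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue  # component received no mass; leave it unchanged
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2
                      for r, v in zip(resp, values)) / nk
            stds[k] = max(math.sqrt(var), min_std)
    return sorted(zip(weights, means, stds), key=lambda c: c[1])
```

Sorting by mean gives a simple way to tell the two components apart: the darker one corresponds to the text strokes and the brighter one to the page background.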
The Gaussian model is then used to control how many image pixels lie between the foreground and background boundaries, which are determined as follows:

$$I_{fp} = \{\, I_i \mid P(I < I_i) = \mu \,\}, \qquad I_{bp} = \{\, I_j \mid P(I > I_j) = \mu \,\} \qquad (4)$$

where $I_{fp}$, $I_{bp}$ denote the foreground and background boundary intensities, respectively, $P$ denotes the probability under the Gaussian component corresponding to the foreground or background peak, $P(I < I_i)$ is the probability of a pixel intensity being smaller than a given intensity $I_i$, $P(I > I_j)$ is the probability of a pixel intensity being larger than a given intensity $I_j$, and $\mu$ is a threshold between 0 and 1, which controls the number of image pixels between the foreground and background boundaries and is set between 0.005 and 0.1 empirically.

The overall flowchart of the construction of the alpha map within a block is shown in Figure 4. The histogram of the testing image block is first extracted. Then the Gaussian mixture model is applied to fit the histogram distribution. If only one Gaussian distribution is obtained, the alpha value of this image block is set to 0 or 1 based on its corresponding binary image. If two Gaussian distributions are generated, the boundary intensities are determined using Equation 4 and the alpha values of this image block are calculated using Equation 3. After the alpha values of all blocks are calculated, the alpha channel map is created by combining the non-overlapped parts of the blocks, as shown in Figure 5.

Figure 4: The flowchart of alpha channel construction for one image block.

3. RESTORATION OF MOTION BLUR IMAGE

The linear motion blur kernel $h$ can be represented by its direction $\theta$ and its motion length $l$ in pixels. So we can parametrize $h$ as a vector $b = (u, v)^T$, where $u = l\cos\theta$ and $v = l\sin\theta$.
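The parametrization of the kernel by $b = (u, v)^T$ can be sketched as below. The helper names and the nearest-pixel rasterization of the line-shaped PSF are illustrative assumptions, not the kernel construction used by the authors. Reading a blur vector of $[0, 20]$ as $u = 0$, $v = 20$ corresponds to $l = 20$ and $\theta = 90°$ in this convention.

```python
import math


def blur_vector(length, theta_deg):
    """Return b = (u, v) = (l cos(theta), l sin(theta))."""
    theta = math.radians(theta_deg)
    return (length * math.cos(theta), length * math.sin(theta))


def blur_params(u, v):
    """Inverse mapping: recover (l, theta in degrees) from b = (u, v)."""
    return (math.hypot(u, v), math.degrees(math.atan2(v, u)))


def line_kernel(length, theta_deg, samples_per_pixel=8):
    """Rasterize a normalized line-shaped PSF centred in a square grid."""
    u, v = blur_vector(length, theta_deg)
    radius = int(math.ceil(length / 2.0))
    size = 2 * radius + 1
    kernel = [[0.0] * size for _ in range(size)]
    n = max(int(length * samples_per_pixel), 1)
    for s in range(n + 1):
        t = s / n - 0.5  # runs from -0.5 to 0.5 along the motion path
        col = int(round(radius + t * u))  # x offset
        row = int(round(radius + t * v))  # y offset
        kernel[row][col] += 1.0
    total = sum(map(sum, kernel))
    return [[w / total for w in row] for row in kernel]


print(blur_vector(20, 90))
print(blur_params(0.6798, 20.4030))
```

A kernel built this way sums to 1, so convolving with it preserves the overall image brightness.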
Dai and Wu [2] proved that the following α-motion blur constraint holds at those pixels where $\nabla\alpha \neq 0$:

$$\nabla\alpha \cdot b = \pm 1 \qquad (5)$$

where $\nabla\alpha = (\alpha_x, \alpha_y)$ denotes the gradient of the alpha channel of the input blurry document, $\alpha_x$ and $\alpha_y$ are the gradient values in the x and y directions, respectively, and $b$ is a 2 × 1 blur vector denoting the blur extent in the x and y directions, as described before. Equation 5 can be viewed as a representation of two lines that are symmetric with respect to the origin of a 2D plane, where $b$ gives the coefficients of the two lines. So, as described in Dai and Wu's paper [2], we can project the $\nabla\alpha$ values onto the 2D $(\alpha_x, \alpha_y)$ coordinate plane, where they form two parallel straight lines, as shown in Figure 6(a). The $\nabla\alpha$ values can be further projected to Hough space, as shown in Figure 6(b). Since there are two possible blur directions, the two salient points correspond to the blur parameters $\pm b$. To obtain the blur parameter $b$ from Equation 5, we need to optimize the following objective function:

$$b_{est} = \arg\min_{b} \sum_{p} \min_{z = \pm 1} G(\nabla\alpha_p \cdot b - z) \qquad (6)$$

where $b_{est}$ denotes the estimated blur parameters, $p$ denotes one image pixel, $z$ is either $+1$ or $-1$, and $G(\cdot)$ is a penalty function that is proportional to the estimation error. We adapted Dai and Wu's method [2] to obtain the blur parameters.

Figure 5: The alpha channel map constructed from Figure 1(a).

For the blurry image in Figure 1(a), we estimated the blur parameter as [±0.6798, 20.4030], whereas the blur parameter is estimated as [±0.0300, 17.3603] using the alpha channel created by spectral matting [4]. Compared with Levin et al.'s method [4], our proposed estimation is closer to the true blur parameters, which are [0, 20] (a Euclidean distance of about 0.79 versus about 2.64). Then we use Shan et al.'s non-blind deconvolution method [7] to obtain the restored image.

4. EXPERIMENT AND DISCUSSION

We collected 30 document images from different sources, including name cards, book covers, posters, sign boards and so on. These images are motion blurred with different directions and extents. Half of them were captured naturally with a digital camera and the others were blurred synthetically. First, we compare the blur identification accuracy of our proposed method with that of the cepstrum analysis method [6] using the synthetically blurred images. The average least square error of our proposed method is 0.1990, which is more precise than the 2.7080 given by Qi et al.'s method [6]. Figure 7 shows two restored image examples obtained by these two methods. Compared with Qi et al.'s method, our proposed technique produces much better results, which restore most details of the original images. We also compare our method with Xu and Jia's method [10]. The results are shown in Figure 8. As Figure 8 shows, the visual quality of the results of our proposed technique is much better than that of the other two methods.

We then use the free Google OCR engine (see footnote 1) to recognize the testing examples in order to verify our proposed method. The overall recognition recall and precision on the testing images are around 10% and 15% before restoration, and increase to around 60% and 70% after restoration, respectively. Some ringing artifacts appear near the strong edges in the recovered images, which decrease the visual quality of the document images.
In the future, we will work on this issue and try to reduce the ringing artifacts in our recovered images.

Footnote 1: http://code.google.com/p/tesseract-ocr/

Figure 6: The $\nabla\alpha$ distribution on the 2D $(\alpha_x, \alpha_y)$ coordinate plane and in the Hough domain; the origin is in the center.

5. CONCLUSION

In this paper, we propose a document image deblurring technique that automatically recovers blurry documents caused by motion. We adopt the two-layer image composition model and construct the alpha channel map based on block-wise histogram analysis. Then the α-motion blur constraint is used to identify the extent and direction of the motion blur. Finally, the image is restored by Shan et al.'s method [7] using the calculated parameters. Experimental results on linear motion blurred images show that our method can improve the document quality and the accessibility of the text information. It can be used as a pre-processing stage for many document analysis applications. In the future, we will try to extend our method to deal with different kinds of blur and to reduce the ringing artifacts in our recovered images.

6. REFERENCES

[1] X. Chen, X. He, J. Yang, and Q. Wu. An effective document image deblurring algorithm. IEEE Conference on Computer Vision and Pattern Recognition, 2011.
[2] S. Dai and Y. Wu. Motion from blur. IEEE Conference on Computer Vision and Pattern Recognition, 2008.
[3] R. Fergus, B. Singh, A. Hertzmann, S. T. Roweis, and W. T. Freeman. Removing camera shake from a single photograph. ACM Transactions on Graphics, 2006.
[4] A. Levin, A. Rav-Acha, and D. Lischinski. Spectral matting. IEEE Conference on Computer Vision and Pattern Recognition, 2007.
[5] L. Lucy. An iterative technique for the rectification of observed distributions. Astronomical Journal, 79, 1974.
[6] X. Y. Qi, L. Zhang, and C. L. Tan. Motion deblurring for optical character recognition. International Conference on Document Analysis and Recognition, 1:389-393, August 2005.
[7] Q. Shan, J. Jia, and A. Agarwala. High-quality motion deblurring from a single image. ACM Transactions on Graphics, 27(3):73, 2008.
[8] B. Su, S. Lu, and C. L. Tan. Binarization of historical document images using the local maximum and minimum. International Workshop on Document Analysis Systems, pages 159-166, 2010.
[9] N. Wiener. Extrapolation, interpolation, and smoothing of stationary time series. MIT Press, 1964.
[10] L. Xu and J. Jia. Two-phase kernel estimation for robust motion deblurring. European Conference on Computer Vision, pages 157-170, 2010.
[11] L. Yuan, J. Sun, L. Quan, and H.-Y. Shum. Image deblurring with blurred/noisy image pairs. ACM Transactions on Graphics, 26(3):1, 2007.
[12] S. Zhuo, D. Guo, and T. Sim. Robust flash deblurring. IEEE Conference on Computer Vision and Pattern Recognition, 2010.

Figure 7: The first column shows the blurred images, the second column the corresponding images recovered by the cepstrum method, the third column the corresponding images recovered by the proposed method, and the last column the original clear images.

Figure 8: Four motion blurred document image examples in the first column, and the corresponding images recovered by our proposed method in the second column, by Shan et al.'s method [7] in the third column and by Qi et al.'s method [6] in the fourth column, respectively.