On the Recovery of Depth from a Single Defocused Image


Shaojie Zhuo and Terence Sim
School of Computing, National University of Singapore, Singapore 117417

Abstract. In this paper we address the challenging problem of recovering the depth of a scene from a single image using the defocus cue. To achieve this, we first present a novel approach to estimating the amount of spatially varying defocus blur at edge locations. We re-blur the input image and show that the gradient magnitude ratio between the input and re-blurred images depends only on the amount of defocus blur, so the blur amount can be obtained from the ratio. A layered depth map is then extracted by propagating the blur amount at edge locations to the entire image. Experimental results on synthetic and real images demonstrate the effectiveness of our method in providing a reliable estimate of the depth of a scene.

Keywords: image processing, depth recovery, defocus blur, Gaussian gradient, Markov random field.

1 Introduction

Depth recovery plays an important role in computer vision and computer graphics, with applications in robotics, 3D reconstruction and image refocusing. In principle, depth can be recovered either from monocular cues (shading, shape, texture, motion, etc.) or from binocular cues (stereo correspondences). Conventional methods for estimating the depth of a scene have relied on multiple images. Stereo vision [1,2] measures disparities between a pair of images of the same scene taken from two different viewpoints and uses the disparities to recover the depth. Structure from motion (SFM) [3,4] computes the correspondences between images to obtain the 2D motion field, which is then used to recover the 3D motion and the depth. Depth from focus (DFF) [5,6] captures a set of images using multiple focus settings and measures the sharpness of the image at each pixel location; the sharpest pixels are selected to form an all-in-focus image, and the depth of each pixel depends on which image it is selected from. Depth from defocus (DFD) [7,8] requires a pair of images of the same scene with different focus settings; it estimates the degree of defocus blur, from which the depth of the scene can be recovered provided the camera settings are known. These methods either suffer from the occlusion problem or cannot be applied to dynamic scenes.

X. Jiang and N. Petkov (Eds.): CAIP 2009, LNCS 5702, pp. 889-897, 2009. © Springer-Verlag Berlin Heidelberg 2009

Fig. 1. The depth recovery result of the book image. (a) The input defocused image. (b) The recovered layered depth map. In all the depth maps presented in this paper, larger intensity means a larger blur amount and a greater depth.

Recently, approaches have been proposed to recover depth from a single image in very specific settings. Several methods [9,10] use active illumination to aid depth recovery by projecting structured patterns onto the scene; the depth is measured from the attenuation of the projected light or the deformation of the projected pattern. The coded aperture method [11] changes the shape of the defocus blur kernel by inserting a customized mask into the camera lens, which makes the blur kernel more sensitive to depth variation; the depth is determined after a deconvolution process using a set of calibrated blur kernels. Saxena et al. [12] collect a training set of monocular images and their corresponding ground-truth depth maps and apply supervised learning to predict the depth map as a function of the input image.

In this paper we focus on the more challenging problem of recovering the depth layers from a single defocused image captured by an uncalibrated conventional camera. The most closely related work is the inverse diffusion method [13], which models the defocus blur as a diffusion process, uses the inhomogeneous reverse heat equation to estimate the blur at edge locations, and then applies a graph-cut based method to infer the depth of the scene. In contrast, we model the defocus blur as a 2D Gaussian blur. The input image is re-blurred using a known Gaussian function, and the gradient magnitude ratio between the input and re-blurred images is calculated. The blur amount at edge locations can then be derived from this ratio. We further construct an MRF to propagate the blur estimates from the edge locations to the entire image and finally obtain a layered depth map of the scene.

Our work has three main contributions. Firstly, we propose an efficient blur estimation method based on the gradient magnitude ratio, and we show that it is robust to noise, inaccurate edge locations and interference from nearby edges. Secondly, without any modification to the camera or additional illumination, our blur estimation method combined with MRF optimization can recover the depth map of a scene from only a single defocused image captured by a conventional camera; as shown in Fig. 1, our method extracts a layered depth map of the scene with fairly good accuracy. Finally, we discuss two kinds of ambiguities in recovering depth from a single image using the defocus cue, one of which is usually overlooked by previous methods.

2 Defocus Model

As the amount of defocus blur is estimated at edge locations, we must model the edge first. We adopt the ideal step edge model

f(x) = A u(x) + B,    (1)

where u(x) is the step function, and A and B are the amplitude and offset of the edge respectively. Note that the edge is located at x = 0.

When an object is placed at the focus distance d_f, all the rays from a point of the object converge to a single sensor point and the image appears sharp. Rays from a point of another object at distance d reach multiple sensor points and result in a blurred image. The blurred pattern depends on the shape of the aperture and is often called the circle of confusion (CoC) [14]. The diameter of the CoC characterizes the amount of defocus and can be written as

c = (|d − d_f| / d) · f_0² / (N (d_f − f_0)),    (2)

where f_0 and N are the focal length and the f-stop number of the camera respectively. Fig. 2 shows a thin lens model and how the diameter of the CoC changes with d and N for fixed f_0 and d_f. As we can see, the diameter c of the CoC is a non-linear, monotonically increasing function of the object distance d.

Fig. 2. (a) A thin lens model. (b) The diameter of the CoC c as a function of the object distance d and the f-stop number N, given d_f = 500 mm and f_0 = 80 mm.

The defocus blur can be modeled as the convolution of a sharp image with the point spread function (PSF). The PSF can be approximated by a Gaussian function g(x, σ), where the standard deviation σ = kc is proportional to the diameter c of the CoC. We use σ as a measure of the depth of the scene. A blurred edge i(x) can then be represented as

i(x) = f(x) ⊗ g(x, σ).    (3)
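As a sanity check on Eq. (2), the short Python sketch below evaluates the CoC diameter over a range of object distances, using the d_f = 500 mm and f_0 = 80 mm of Fig. 2(b). The f-stop values N = 2, 4, 8 are an assumption chosen to mirror the three curves in that plot.

```python
import numpy as np

def coc_diameter(d, d_f=500.0, f0=80.0, N=4.0):
    """Diameter of the circle of confusion, Eq. (2); all lengths in mm."""
    return np.abs(d - d_f) / d * f0 ** 2 / (N * (d_f - f0))

# c grows non-linearly and monotonically as the object moves away from the
# focal plane, and halves each time the f-stop number N doubles.
distances = np.linspace(600.0, 2500.0, 5)
for n in (2.0, 4.0, 8.0):  # assumed f-stop values, mirroring Fig. 2(b)
    print(f"N = {n}:", np.round(coc_diameter(distances, N=n), 3))
```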

3 Blur Estimation

Fig. 3 shows an overview of our local blur estimation method. A step edge is re-blurred using a Gaussian function with a known standard deviation, and the ratio between the gradient magnitudes of the step edge and its re-blurred version is calculated. The ratio is maximal at the edge location, and from this maximum value we can compute the amount of defocus blur of the edge. For convenience, we describe our blur estimation algorithm for the 1D case first and then extend it to 2D images.

Fig. 3. Our blur estimation approach: ⊗ and ∇ denote the convolution and gradient operators respectively. The black dashed line denotes the edge location.

The gradient of the re-blurred edge is

∇i_1(x) = ∇(i(x) ⊗ g(x, σ_0))
        = ∇((A u(x) + B) ⊗ g(x, σ) ⊗ g(x, σ_0))
        = (A / √(2π(σ² + σ_0²))) exp(−x² / (2(σ² + σ_0²))),    (4)

where σ_0 is the standard deviation of the re-blur Gaussian function; we call it the re-blur scale. The gradient magnitude ratio between the original and re-blurred edges is

|∇i(x)| / |∇i_1(x)| = √((σ² + σ_0²) / σ²) · exp(−(x² / (2σ²) − x² / (2(σ² + σ_0²)))).    (5)

It can be proved that the ratio is maximal at the edge location (x = 0), where it takes the value

R = |∇i(0)| / |∇i_1(0)| = √((σ² + σ_0²) / σ²).    (6)

Examining (4) and (6), we notice that the edge gradient depends on both the edge amplitude A and the blur amount σ, whereas the maximum gradient magnitude ratio R eliminates the effect of the edge amplitude A and depends only on σ and σ_0. Thus, given the maximum value R, we can calculate the unknown blur amount σ as

σ = σ_0 / √(R² − 1).    (7)

For blur estimation in 2D images, we use a 2D isotropic Gaussian function to perform the re-blurring. As any direction of a 2D isotropic Gaussian function is a 1D Gaussian, the blur estimation is similar to that in the 1D case. In a 2D image, the gradient magnitude is computed as

‖∇i(x, y)‖ = √(i_x² + i_y²),    (8)

where i_x and i_y are the gradients along the x and y directions respectively.
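To make the estimator concrete, here is a minimal Python sketch of the gradient-ratio computation of Eqs. (6)-(8). It is a sketch rather than the authors' implementation: the re-blur scale sigma0 = 1 follows Section 5, scikit-image's Canny detector stands in for the tuned edge detector mentioned there, and the small eps guard against division by zero is our own addition.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.feature import canny

def gradient_magnitude(img):
    """Eq. (8): ||grad i|| = sqrt(i_x^2 + i_y^2)."""
    gy, gx = np.gradient(img)
    return np.hypot(gx, gy)

def sparse_blur_map(image, sigma0=1.0, eps=1e-8):
    """Estimate the defocus blur sigma at edge pixels from the gradient
    magnitude ratio (Eqs. 6 and 7). `image` is a 2D float array."""
    reblurred = gaussian_filter(image, sigma0)   # isotropic 2D re-blur
    R = gradient_magnitude(image) / (gradient_magnitude(reblurred) + eps)

    edges = canny(image)                         # stand-in edge detector
    sigma = np.zeros_like(image)
    valid = edges & (R > 1.0)                    # R <= 1 admits no solution
    sigma[valid] = sigma0 / np.sqrt(R[valid] ** 2 - 1.0)   # Eq. (7)
    return sigma, edges
```

The returned `sigma` is non-zero only at edge pixels, i.e. it plays the role of the sparse blur maps in Fig. 5(b); the propagation step of Section 4 densifies it.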

4 Layered Depth Map Extraction

After obtaining the depth estimates at edge locations, we need to propagate them from the edge locations to the regions that contain no edges. We seek a regularized depth labeling σ̂ that is smooth and close to the estimates of Eq. (7), and we prefer the depth discontinuities to be aligned with the image edges. We therefore formulate this as an energy minimization over a discrete Markov random field (MRF) whose energy is

E(σ̂) = Σ_i V_i(σ̂_i) + λ Σ_i Σ_{j∈N(i)} V_ij(σ̂_i, σ̂_j),    (9)

where each pixel in the image is a node of the MRF and λ balances the single-node potential V_i(σ̂_i) and the pairwise potential V_ij(σ̂_i, σ̂_j), defined as

V_i(σ̂_i) = M(i) (σ_i − σ̂_i)²,    (10)
V_ij(σ̂_i, σ̂_j) = w_ij (σ̂_i − σ̂_j)²,    (11)

where M(·) is a binary mask that is non-zero only at edge locations, and the weight w_ij = exp{−(I(i) − I(j))²} encodes the difference between the neighboring colors I(i) and I(j). An 8-neighborhood system N(i) is adopted in our definition. We use FastPD [15] to minimize the MRF energy defined in Eq. (9); FastPD guarantees an approximately optimal solution and is much faster than previous MRF optimization methods such as conventional graph cut techniques.
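The sketch below shows one way to realize this propagation under stated assumptions: it builds the data and pairwise costs of Eqs. (9)-(11) over a small discrete label set, but minimizes the energy with plain ICM (iterated conditional modes) instead of FastPD, so it illustrates the model rather than the paper's solver. The label range, λ value and iteration count are illustrative placeholders.

```python
import numpy as np

def propagate_depth(image, sigma_sparse, edge_mask,
                    labels=np.linspace(0.0, 3.0, 16), lam=0.5, n_iters=20):
    """Propagate sparse blur estimates into a dense layered depth map by
    minimizing Eq. (9) with ICM, a simple stand-in for FastPD [15]."""
    h, w = image.shape
    # Data cost, Eq. (10): only edge pixels constrain their own label.
    data = edge_mask[..., None] * (sigma_sparse[..., None] - labels) ** 2
    lab = np.argmin(data, axis=-1)   # initialize with the cheapest data label

    # 8-neighborhood N(i), as in the paper.
    offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
               if (dy, dx) != (0, 0)]
    for _ in range(n_iters):
        for y in range(h):
            for x in range(w):
                cost = data[y, x].copy()
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        # Pairwise cost, Eq. (11), with the color weight w_ij.
                        w_ij = np.exp(-(image[y, x] - image[ny, nx]) ** 2)
                        cost += lam * w_ij * (labels - labels[lab[ny, nx]]) ** 2
                lab[y, x] = np.argmin(cost)
    return labels[lab]
```

A usage pass would chain the two sketches: `sigma, edges = sparse_blur_map(img)` followed by `depth = propagate_depth(img, sigma, edges)`. Note that ICM only reaches a local minimum; FastPD's primal-dual scheme provides the near-optimality guarantee cited above.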

5 Experiments

There are two parameters in our method: the re-blur scale σ_0 and the MRF weight λ. We set σ_0 = 1 and use a single fixed λ, which gives good results in all our examples. We use the Canny edge detector [16] and tune its parameters to obtain the desired edge detection output. The depth maps are in fact the estimated σ values at each pixel.

We first test the performance of our method on the synthetic bar image shown in Fig. 4(a), in which the blur amount of the edges increases linearly from 0 to 5. We first add noise to the bar image; although the estimates for edges with larger blur amounts are more affected by the noise, our method still achieves reliable estimation results (see Fig. 4(b)). We then create more bar images with different edge distances. Fig. 4(c) shows that interference from neighboring edges increases the estimation error when the blur amount is large, but the errors remain at a relatively low level. Furthermore, we shift the detected edges to simulate inaccurate edge locations and test our method; the result is shown in Fig. 4(d). When an edge is sharp, a shift of the edge location causes quite large estimation errors. In practice, however, sharp edges can usually be located very accurately, which greatly reduces the estimation error.

Fig. 4. Performance of our blur estimation method. (a) The synthetic image with blurred edges. (b) Estimation errors under Gaussian noise of two different variances, together with the noise-free case. (c) Estimation errors for three different edge distances. (d) Estimation errors for edge shifts of 1, 2 and 3 pixels. The x and y axes are the blur amount and the corresponding estimation error.

Fig. 5. The depth recovery results of the flower and building images. (a) The input defocused images. (b) The sparse blur maps. (c) The final layered depth maps.

Fig. 6. Comparison of our method and the inverse diffusion method. (a) The input image. (b) The result of the inverse diffusion method. (c) Our result. The image is from [13].

Fig. 7. The depth recovery result of the photo frame image. (a) The input defocused image. (b) The recovered layered depth map.

As shown in Fig. 5, we also test our method on real images. In the flower image, the depth of the scene changes continuously from the bottom to the top of the image; the sparse blur map gives a reasonable measure of the blur amount at edge locations, and the depth map reflects the continuous change of the depth. In the building image, there are mainly three depth layers in the scene: the wall in the nearest layer, the buildings in the middle layer, and the sky in the farthest layer. Our method extracts these three layers quite accurately and produces the depth map shown in Fig. 5(c). Both of these results are obtained using 20 depth labels with blur amounts from 0 to 2. One more example is the book image shown in Fig. 1, whose result is obtained using 16 depth labels with blur amounts from 0 to 3. As we can see from the recovered depth maps, our method is able to obtain a good estimate of the depth of a scene from a single image.

In Fig. 6, we compare our method with the inverse diffusion method [13]. Both methods generate reasonable layered depth maps. However, our method has higher accuracy in the local estimation, and our depth map therefore captures more details of the depth. As shown in the figure, the difference in depth between the left and right arms can be perceived in our result; in contrast, the inverse diffusion method does not recover this depth difference.

6 Ambiguities in Depth Recovery

There are two kinds of ambiguities in depth recovery from a single image using the defocus cue. The first one is the focal plane ambiguity: when an object appears blurred in the image, it can lie on either side of the focal plane. To remove this ambiguity, most depth from defocus methods, including ours, assume that all objects of interest are located on one side of the focal plane. When taking images, we simply put the focus point on the nearest or farthest point in the scene.

The second ambiguity is the blur/sharp edge ambiguity: the defocus measure we obtain may be due to a sharp edge that is out of focus or to an edge that is blurry in the scene itself but in focus. This ambiguity is often overlooked by previous work and may cause artifacts in our results. One example is shown in Fig. 7. The region indicated by the white rectangle is actually blurry texture within the photo in the frame, but our method treats it as sharp edges subjected to defocus blur, which results in an erroneous depth estimate for that region.

7 Conclusion

In this paper, we show that the depth of a scene can be recovered from a single defocused image. A new method is presented to estimate the blur amount at edge locations based on the gradient magnitude ratio, and a layered depth map is then extracted using MRF optimization. We show that our method is robust to noise, inaccurate edge locations and interference from neighboring edges, and that it can generate more accurate scene depth maps than existing methods. We also discuss the ambiguities arising in recovering depth from a single image using the defocus cue. In the future, we would like to apply our blur estimation method to images with motion blur in order to estimate the blur kernels.

Acknowledgement. The authors would like to thank the anonymous reviewers for their helpful suggestions. The work is supported by NUS Research Grant #R-5--8-.

References

1. Barnard, S., Fischler, M.: Computational stereo. ACM Comput. Surv. 14(4), 553-572 (1982)
2. Dhond, U., Aggarwal, J.: Structure from stereo: A review. IEEE Trans. Syst. Man Cybern. 19(6), 1489-1510 (1989)
3. Dellaert, F., Seitz, S.M., Thorpe, C.E., Thrun, S.: Structure from motion without correspondence. In: Proc. CVPR, pp. 557-564 (2000)
4. Tomasi, C., Kanade, T.: Shape and motion from image streams under orthography: A factorization method. Int. J. Comput. Vision 9, 137-154 (1992)
5. Asada, N., Fujiwara, H., Matsuyama, T.: Edge and depth from focus. Int. J. Comput. Vision 26(2), 153-163 (1998)
6. Nayar, S., Nakagawa, Y.: Shape from focus. IEEE Trans. Pattern Anal. Mach. Intell. 16(8), 824-831 (1994)

7. Favaro, P., Soatto, S.: A geometric approach to shape from defocus. IEEE Trans. Pattern Anal. Mach. Intell. 27(3), 406-417 (2005)
8. Pentland, A.P.: A new sense for depth of field. IEEE Trans. Pattern Anal. Mach. Intell. 9(4), 523-531 (1987)
9. Moreno-Noguer, F., Belhumeur, P.N., Nayar, S.K.: Active refocusing of images and videos. ACM Trans. Graphics 26(3), 67 (2007)
10. Nayar, S.K., Watanabe, M., Noguchi, M.: Real-time focus range sensor. IEEE Trans. Pattern Anal. Mach. Intell. 18(12), 1186-1198 (1996)
11. Levin, A., Fergus, R., Durand, F., Freeman, W.T.: Image and depth from a conventional camera with a coded aperture. ACM Trans. Graphics 26(3), 70 (2007)
12. Saxena, A., Sun, M., Ng, A.: Make3D: Learning 3D scene structure from a single still image. IEEE Trans. Pattern Anal. Mach. Intell. (2008)
13. Namboodiri, V.P., Chaudhuri, S.: Recovery of relative depth from a single observation using an uncalibrated (real-aperture) camera. In: Proc. CVPR (2008)
14. Hecht, E.: Optics, 4th edn. Addison-Wesley, Reading (2002)
15. Komodakis, N., Tziritas, G., Paragios, N.: Performance vs computational efficiency for optimizing single and dynamic MRFs: Setting the state of the art with primal-dual strategies. Comput. Vis. Image Underst. 112(1), 14-29 (2008)
16. Canny, J.: A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 8(6), 679-698 (1986)