arxiv: v2 [cs.cv] 31 Jul 2017

Hybrid Light Field Imaging for Improved Spatial Resolution and Depth Range

M. Zeshan Alam, Bahadir K. Gunturk

arXiv:1611.05008v2 [cs.CV] 31 Jul 2017

Abstract Light field imaging involves capturing both the angular and spatial distribution of light; it enables new capabilities, such as post-capture digital refocusing, camera aperture adjustment, perspective shift, and depth estimation. Micro-lens array (MLA) based light field cameras provide a cost-effective approach to light field imaging. There are two main limitations of MLA-based light field cameras: low spatial resolution and narrow baseline. While low spatial resolution limits the general purpose use and applicability of light field cameras, narrow baseline limits the depth estimation range and accuracy. In this paper, we present a hybrid stereo imaging system that includes a light field camera and a regular camera. The hybrid system addresses both the spatial resolution and narrow baseline issues of MLA-based light field cameras while preserving light field imaging capabilities.

Keywords Light field imaging, hybrid stereo imaging

This work is supported by TUBITAK Grant 114E095. The authors are with Istanbul Medipol University, Istanbul, Turkey. E-mail: mzalam@st.medipol.edu.tr, bkgunturk@medipol.edu.tr

1 Introduction

A light field can be defined as the collection of all light rays in 3D space [14, 20]. One of the earliest implementations of a light field camera was presented in [21], where a micro-lens array is placed in front of a film to capture the amount of incident light from different directions. While a light field, in general, can be parameterized in terms of 3D coordinates of ray positions, 2D ray directions, and physical properties of light (such as wavelength and polarization), the independent parameters can be reduced to a four-dimensional space assuming there is no energy loss during light propagation and when only the intensity of light is considered; such a four-dimensional representation of the light field is used in many practical applications [3, 20, 15].

Unlike regular cameras, light field cameras capture directional light information, which enables new capabilities, including post-capture adjustment of camera parameters (such as focal length and aperture size), post-capture change of camera viewpoint, and depth estimation. As a result, light field imaging is increasingly used in a variety of application areas, including digital photography, microscopy, robotics, and machine vision.

Light field imaging systems can be implemented in various ways, including camera arrays [37, 20, 39], micro-lens arrays [26, 23], coded masks [33], objective lens arrays [13], and gantry-based camera systems [32]. Among these different implementations, micro-lens array (MLA) based light field cameras offer a cost-effective approach and are widely adopted in academic research as well as in commercial light field cameras [1, 2].

MLA-based light field cameras have two limiting issues. The first one is low spatial resolution. Because the image sensor is shared to capture both spatial and angular information, MLA-based light field cameras suffer from a fundamental trade-off between spatial and angular resolution. For example, the first-generation Lytro camera has a sensor of around 11 megapixels, producing 11x11 angular resolution and less than 0.15 megapixel spatial resolution [10].
The second-generation Lytro camera has a sensor of 40 megapixels; however, this large resolution capacity translates to only four megapixels of spatial resolution (with the manufacturer's decoding software) due to the angular-spatial resolution trade-off.
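The trade-off quoted above can be checked with rough arithmetic. The sketch below is only an idealized estimate: it assumes the sensor pixels are divided evenly among the angular samples, which ignores lenslet packing, vignetting, and any resampling done by a decoder. The function name and this even-split model are assumptions made for illustration, not something stated in the paper.

```python
def spatial_megapixels(sensor_megapixels, angular_u, angular_v):
    """Idealized spatial resolution if sensor pixels are split evenly
    among angular_u x angular_v angular samples (an assumed model)."""
    return sensor_megapixels / (angular_u * angular_v)


# First-generation Lytro figures quoted in the text: ~11 MP sensor, 11x11 views.
first_gen_mp = spatial_megapixels(11.0, 11, 11)
print(f"about {first_gen_mp:.3f} megapixels per view")
```

Under this simple model the estimate stays below the 0.15 megapixel figure quoted in the text.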

The second issue associated with MLA-based light field cameras is narrow baseline. The distance between sub-aperture images decoded from a light field capture is very small, significantly limiting the depth estimation range and accuracy. For instance, the maximum baseline (between the leftmost and rightmost sub-aperture images) of a first-generation Lytro camera is less than a centimeter, which typically results in sub-pixel feature disparities. There are methods in the literature specifically designed to estimate disparities and depth maps for MLA-based light field cameras [41, 17, 30].

To address both the resolution and baseline issues, we propose a hybrid stereo imaging system that consists of a light field camera and a regular camera (footnote 1). The proposed imaging system is shown in Figure 1; it has two main advantages over a single light field camera. First, the high spatial resolution image captured by the regular camera is fused with the low spatial resolution sub-aperture images of the light field camera to enhance the spatial resolution of each sub-aperture image; that is, a high spatial resolution light field is obtained while preserving the angular resolution. Second, the distance between the light field camera and the regular camera produces a larger baseline than the maximum baseline of the light field camera; as a result, the hybrid system has better depth estimation range and accuracy.

Hybrid light field imaging systems have been presented previously [6, 38, 34]. Unlike these previous works, where spatial resolution enhancement is the only goal, we achieve a wider baseline by using a calibrated stereo system. With a wide baseline, both the range and the accuracy of depth estimation are improved, in addition to spatial resolution enhancement. Fixing the cameras as a stereo system also enables offline calibration and rectification, reducing the computational cost.
In Section 2, related work addressing the spatial resolution and narrow baseline issues of MLA-based light field cameras is reviewed. The proposed hybrid imaging system and its resolution enhancement algorithm are explained in Section 3. Experimental results on resolution enhancement and depth estimation are presented in Section 4. Concluding remarks are given in Section 5.

Footnote 1: A preliminary version of this work was presented as a conference paper [4]. In this paper, we provide additional experiments, detailed algorithm explanation and analysis.

2 Related Work

On low spatial resolution: There are various methods proposed to address the low spatial resolution issue in MLA-based light field cameras. One main approach is to apply super-resolution image restoration to light field sub-aperture images. Super-resolution in a Bayesian framework is commonly used, for example, in [5] with Lambertian and textural priors, in [25] with a Gaussian mixture model, and in [36] with a variational formulation. Learning-based methods are adopted as well, including dictionary-based learning [9] and deep convolutional neural networks [40, 19]. In addition to spatial domain super-resolution restoration, Fourier-domain techniques [27, 28] and wave optics based 3D deconvolution methods [29, 7, 31, 18] have also been utilized.

As an alternative to the standard MLA-based light field camera design [26], where the MLA is placed at the image plane of the main lens and the sensor is placed at the focal length of the lenslets, there is another design approach where the MLA is placed to relay the image from the intermediate image plane of the main lens to the sensor [23]. This design is known as the focused plenoptic camera. As in the case of the standard light field camera approach, super-resolution restoration for focused plenoptic cameras is also possible [12].
All single-sensor light field imaging systems are fundamentally limited by the spatial-angular resolution trade-off, and the above-mentioned restoration methods have performance limitations in addition to their computational costs. Another approach for improving spatial resolution is to use a hybrid two-camera system, consisting of a light field camera and a high-resolution camera, and to merge the images to improve spatial resolution [6, 38, 34]. Dictionary-learning based techniques are adopted in this problem as well [6, 38]: high-resolution image patches from the regular camera are extracted and stored as a high-resolution patch dictionary. These high-resolution patches are downsampled, and low-resolution features are extracted from the downsampled patches to form a low-resolution patch dictionary. During super-resolution reconstruction, a low-resolution image patch is enhanced by finding (based on feature matching) and using the corresponding high-resolution patches in the dictionary. In [34], the high-resolution image is decomposed with complex steerable pyramid filters; the depth map from the light field is upsampled using joint bilateral upsampling; perspective shift amounts are estimated from the upsampled depth map, and these shift amounts are used to modify the phase of the decomposed high-resolution image; with the modified phases, pyramid reconstruction is applied to obtain a high-resolution light field.

On narrow baseline: One of the most important features of light field cameras is the ability to estimate depth. However, it is known that depth accuracy and range are limited in MLA-based light field cameras due to the narrow baseline. The relation between baseline and

depth estimation accuracy in a stereo system has been studied in [11]. In a stereo system with focal length $f$ and baseline $b$, the depth $z$ of a point with disparity $d$ is obtained through triangulation as $z = fb/d$. With a disparity estimation error of $\epsilon_d$, the depth estimation error $\epsilon_z$ becomes [11]:

$$\epsilon_z = \frac{fb}{d} - \frac{fb}{d + \epsilon_d} = \frac{z^2 \epsilon_d}{fb + z\,\epsilon_d} \approx \frac{z^2}{fb}\,\epsilon_d, \qquad (1)$$

where the middle expression follows from substituting $d = fb/z$. Equation (1) indicates that the depth estimation error is inversely proportional to the baseline and increases quadratically with depth. The disparity error $\epsilon_d$ is typically set to 1 pixel, and the depth estimation error $\epsilon_z$ as a function of depth can then be calculated. It is also possible to set an error bound on $\epsilon_z$ and derive the maximum depth range from the above equation.

For an MLA-based light field camera, the maximum baseline is less than the size of the main lens aperture, making depth estimation very challenging. There are methods specifically proposed for depth estimation in MLA-based light field cameras. For example, in [35], the problem is formulated as a constrained labeling problem on epipolar plane images in a variational framework. In [41], the ray geometry of 3D line segments is imposed as a constraint on light field triangulation and stereo matching. In [30], defocus and shading cues are used to improve the disparity estimation accuracy.

Fig. 1 Hybrid imaging system including a regular and a light field camera. The maximum baseline of the light field camera is limited by the camera main lens aperture, and is much less (about an order of magnitude) than the baseline (about 4cm) between the light field and the regular camera.

3 Hybrid Stereo Imaging

The hybrid stereo imaging system consists of a regular camera and a light field camera, as shown in Figure 1.
The system has two advantages over a single light field camera: (i) The high-resolution image produced by the regular camera is used to improve the spatial resolution of each sub-aperture image extracted from the light field camera. That is, we obtain a light field with enhanced spatial resolution. (ii) The large baseline between the regular camera and the light field camera results in a wider range and more accurate depth estimation capability, compared to a single light field camera.

Fig. 2 (a) Raw light field. (b) Decoded sub-aperture images.

Fig. 3 (a) Regular camera image. (b) Regular camera image after photometric registration. (c) One of the bicubically resized Lytro sub-aperture images.

3.1 Prototype System and Initial Light Field Data Processing

The prototype system includes a first-generation Lytro camera and a regular camera (AVT Mako G095C). The

light field is decoded using [10] to obtain 11x11 sub-aperture images, each of size 380x380. The regular camera has a spatial resolution of 1200x780 pixels. The imaging system is first calibrated: the regular camera image and the light field middle sub-aperture image are used for stereo calibration (utilizing the Matlab Stereo Calibration Toolbox) to determine the overlapping regions between the images and to rectify the regular camera image. The regular image is then photometrically mapped to the color space of the light field sub-aperture images using the histogram-based intensity matching function technique [16]. Raw light field data and the extracted sub-aperture images are shown in Figure 2. In Figure 3, the rectified regular camera image is shown along with a light field sub-aperture image.

3.2 Improving Spatial Resolution

An illustration of the resolution enhancement process is given in Figure 4. Each low-resolution (LR) light field sub-aperture image is bicubically interpolated to match the size of the high-resolution (HR) regular camera image. Two sets of optical flow are estimated: between the HR image and the light field middle sub-aperture image, and between the light field middle sub-aperture image and every other sub-aperture image. (We use the optical flow estimation algorithm presented in [22] in our experiments.) Combining these optical flow estimates, motion vectors between the HR image and each of the light field sub-aperture images are obtained. The HR image is warped onto each light field sub-aperture image and fused with it to produce a high-resolution version of each sub-aperture image. As a result, a high-resolution light field is obtained.

Fig. 4 Illustration of the resolution enhancement process.

Image fusion for resolution enhancement is a well-studied topic; the application areas include satellite imaging for pan-sharpening, digital camera pipelines for demosaicking, and computational photography for focus stacking [24].
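Returning to the registration step described above, the combination of the two optical flow estimates (HR image to middle sub-aperture image, then middle to another sub-aperture image) can be sketched as a flow composition. This is a minimal illustration rather than the authors' implementation: the paper does not state how the flows are combined, and the function name compose_flows, the list-of-rows (dx, dy) representation, and the nearest-neighbor sampling with border clamping are all assumptions. Both flows are assumed to live on the same pixel grid, which matches the text in that the LR images are first resized to the HR image size.

```python
def compose_flows(flow_ab, flow_bc):
    """Compose dense flows A->B and B->C into a flow A->C.

    Each flow is a list of rows; each entry is a (dx, dy) displacement in
    pixels. flow_bc is read at the pixel nearest to the displaced position,
    clamped to the image border (an illustrative simplification).
    """
    height, width = len(flow_ab), len(flow_ab[0])
    composed = []
    for y in range(height):
        row = []
        for x in range(width):
            dx1, dy1 = flow_ab[y][x]
            # Location in image B that pixel (x, y) of image A maps to.
            xb = min(max(int(round(x + dx1)), 0), width - 1)
            yb = min(max(int(round(y + dy1)), 0), height - 1)
            dx2, dy2 = flow_bc[yb][xb]
            row.append((dx1 + dx2, dy1 + dy2))
        composed.append(row)
    return composed


# Toy example: a uniform 1-pixel shift followed by a uniform half-pixel shift.
hr_to_middle = [[(1.0, 0.0)] * 4 for _ in range(3)]
middle_to_view = [[(0.5, -0.25)] * 4 for _ in range(3)]
hr_to_view = compose_flows(hr_to_middle, middle_to_view)
```

A production implementation would more likely sample the second flow with bilinear interpolation; nearest-neighbor sampling is used here only to keep the sketch short.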
In our experiments, we tested two basic methods for image fusion: (i) a wavelet-based approach [42], available in Matlab as the function wfusimg, which essentially replaces the detail subbands of the low-resolution image with the detail subbands of the high-resolution image, and (ii) alpha blending, also available in Matlab through the function imfuse, which simply takes the weighted average of the input images.

We further speed up the registration process by using the fact that light field sub-aperture images are captured on a regular grid. Instead of estimating the optical flow between the middle sub-aperture image and each of the remaining sub-aperture images, we estimate the optical flow between the middle and the leftmost, rightmost, topmost, and bottommost sub-aperture images, as shown in Figure 5. The estimated motion vectors are interpolated for the rest of the sub-aperture images according to their relative positions within the light field. As a result, we conduct four within-light-field-camera optical flow estimations (instead of 120) and one between-cameras optical flow estimation. In

Figure 6, we show the difference between the regular camera image and the light field sub-aperture images before and after registration. The optical flow within the light field is estimated as described above. The after-registration result shows that the registration process works well. (Note that the residuals for the sub-aperture images in the aperture corners are large because the original sub-aperture images in the corners are too dark due to vignetting.)

Fig. 5 Speeding up the optical flow estimation process.

Fig. 6 (Top) Residual between the regular camera image and light field sub-aperture images before warping. Two sub-aperture images are highlighted. (Bottom) Residual between the regular camera image and light field sub-aperture images after warping.

4 Experimental Results

In this section, we present our experimental results on resolution enhancement and depth estimation. All implementations are in Matlab, running on an Intel i5 PC with 12GB RAM. For the resolution enhancement process given in Figure 4, the processing time of an entire Lytro light field is about 70 seconds, of which optical flow estimation takes about 11 seconds per image pair.

In Figure 7, we compare a light field sub-aperture image with its resolution-enhanced version. Both the alpha blending and wavelet-based image fusion processes produce good results in terms of resolution enhancement. Alpha blending suppresses the low-spatial-frequency color noise better than the wavelet-based approach; this is expected because the wavelet-based approach preserves the low-frequency content of the light field images, which have more noise than the image obtained from the regular camera. Alpha blending, on the other hand, simply averages the two images, reducing the overall noise in all parts of the final image.
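The weighted averaging performed by alpha blending can be written in a few lines. The sketch below is an assumption-laden illustration, not the Matlab routine used in the paper: images are plain nested lists of floats in the range 0 to 1, the function name alpha_blend is hypothetical, and both inputs are assumed to be registered and of equal size. The default weight of 0.55 for the HR image follows the value reported for the experiments.

```python
def alpha_blend(hr_warped, lf_upsampled, hr_weight=0.55):
    """Weighted average of two registered, equally sized images.

    Images are lists of rows of floats; the light field image receives
    the complementary weight 1 - hr_weight.
    """
    lf_weight = 1.0 - hr_weight
    return [
        [hr_weight * h + lf_weight * l for h, l in zip(hr_row, lf_row)]
        for hr_row, lf_row in zip(hr_warped, lf_upsampled)
    ]


# Toy 1x2 example: each output pixel mixes the two inputs 0.55 / 0.45.
fused = alpha_blend([[1.0, 0.0]], [[0.0, 1.0]])
```

Because every output pixel is an average of two measurements, uncorrelated noise in the two inputs is attenuated, which is consistent with the noise behavior noted above.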
(In all our experiments, the weights of the HR image and light field sub-aperture images are 0.55 and 0.45, respectively, giving slightly more weight to the HR image in alpha blending.)

In Figure 8, we compare the proposed method with the hybrid imaging method of Boominathan et al. [6], the learning-based method of Cho et al. [9], and the baseline decoder of Dansereau et al. [10]. Zoomed-in regions from these results are given in Figure 9. Comparing these results, we see that the single-sensor methods do not perform as well as the hybrid methods. Among the hybrid methods, the proposed method produces sharper images than the method given in [6]. In addition to producing sharper images, the proposed method has much lower computational complexity than the one in [6], which takes about an hour on a similar PC configuration. Finally, we provide an epipolar-plane image (EPI) comparison in Figure 10. Again, the proposed method appears to preserve fine features best.

Refocusing: One of the key features of light field imaging is post-capture digital refocusing through a simple shift-and-sum procedure [20]. In Figure 11, we show refocusing at different distances with the Lytro light field images and the resolution-enhanced light field sub-aperture images. The resolution-enhanced light field yields visibly sharper refocused images than the original Lytro images. In Figure 12, we provide refocusing examples from another data set captured by our imaging system. Again, the resolution-enhanced light field results in higher resolution refocused images compared to the Lytro light field.
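The shift-and-sum procedure mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions: views are grayscale nested lists keyed by their angular offset (u, v) from the central view, shifts are rounded to whole pixels, and samples outside the image are clamped to the border. The function name shift_and_sum and the slope parameter (pixels of shift per unit of angular offset) are hypothetical names introduced for this example.

```python
def shift_and_sum(views, slope):
    """Refocus by shifting each view in proportion to its angular offset
    and averaging the shifted views.

    views maps an angular offset (u, v) to an image (list of rows).
    slope selects the focal plane: scene points whose disparity between
    neighboring views equals slope are brought into alignment.
    """
    any_view = next(iter(views.values()))
    height, width = len(any_view), len(any_view[0])
    refocused = [[0.0] * width for _ in range(height)]
    for (u, v), image in views.items():
        sx, sy = int(round(slope * u)), int(round(slope * v))
        for y in range(height):
            src_y = min(max(y + sy, 0), height - 1)
            for x in range(width):
                src_x = min(max(x + sx, 0), width - 1)
                refocused[y][x] += image[src_y][src_x]
    count = len(views)
    return [[value / count for value in row] for row in refocused]


# A single bright point that moves one pixel per view along u.
views = {
    (-1, 0): [[0.0, 1.0, 0.0, 0.0, 0.0]],
    (0, 0): [[0.0, 0.0, 1.0, 0.0, 0.0]],
    (1, 0): [[0.0, 0.0, 0.0, 1.0, 0.0]],
}
in_focus = shift_and_sum(views, slope=1)   # point aligned across views
defocused = shift_and_sum(views, slope=0)  # point smeared over three pixels
```

With slope=1 the three copies of the point line up and the result keeps a single sharp peak; with slope=0 the same energy is spread over three pixels, which is the blur of an out-of-focus plane.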

Fig. 7 Resolution enhancement of light field sub-aperture images. (a) One of the bicubically resized Lytro sub-aperture images. (b) Resolution-enhanced sub-aperture image using alpha blending. (c) Resolution-enhanced sub-aperture image using wavelet-based fusion.

Fig. 8 Comparison of various light field decoding and resolution enhancement methods. Panels: Dansereau et al. [10]; Cho et al. [9]; Boominathan et al. [6]; proposed method.

Improving depth range and accuracy: To demonstrate the increased depth range and improved depth estimation accuracy of our hybrid imaging system, we devised an experimental setup where target objects (Lego blocks) are placed in the scene starting from 40cm away from the imaging system. In Figure 13, we show the leftmost and rightmost light field sub-aperture images as well as the regular camera image, in addition to the disparity maps estimated in different ways. Figure 13(g) is the disparity map of the proposed hybrid system, computed between the regular camera image and the middle sub-aperture image using [22]. Figure 13(f) is the disparity map of the light field camera, computed between the leftmost and rightmost sub-aperture images using [22]. Figure 13(e) is the disparity map estimated by [8], which uses all the sub-aperture images to estimate the disparity map and is specifically designed for micro-lens array based light field cameras. Finally, Figure 13(d) is the disparity map produced by the Lytro manufacturer's proprietary software. Among these different approaches, the proposed system produces the best disparity maps in terms of distinguishing objects at different depths. In Figure 13(h), the disparities of the target object positions are plotted. For the light field camera, the disparity difference from one depth to another becomes too small beyond 100cm, making it difficult to distinguish between different depths, and the disparities eventually become sub-pixel beyond 200cm.
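The trend reported above follows from the triangulation relation z = fb/d and the depth error expression of Eq. (1) in Section 2. The sketch below evaluates both for a narrow and a wide baseline. The numbers are illustrative assumptions only: the focal length of 500 pixels is hypothetical and not taken from the prototype, the 4 cm baseline is the approximate value given for the hybrid system, and 0.4 cm stands in for a light field baseline roughly an order of magnitude smaller. The function names are likewise made up for this example.

```python
def disparity_px(focal_px, baseline_cm, depth_cm):
    """Triangulation: d = f * b / z, with f in pixels and b, z in cm."""
    return focal_px * baseline_cm / depth_cm


def depth_error_cm(focal_px, baseline_cm, depth_cm, disparity_error_px=1.0):
    """Exact middle form of Eq. (1): z^2 * eps_d / (f * b + z * eps_d)."""
    fb = focal_px * baseline_cm
    return depth_cm ** 2 * disparity_error_px / (fb + depth_cm * disparity_error_px)


FOCAL_PX = 500.0  # hypothetical focal length in pixels, for illustration only
for label, baseline_cm in (("narrow (0.4 cm)", 0.4), ("wide (4 cm)", 4.0)):
    for depth_cm in (40.0, 100.0, 200.0):
        d = disparity_px(FOCAL_PX, baseline_cm, depth_cm)
        err = depth_error_cm(FOCAL_PX, baseline_cm, depth_cm)
        print(f"{label:16s} z={depth_cm:5.0f} cm  d={d:6.2f} px  error={err:6.1f} cm")
```

For a fixed focal length, a tenfold larger baseline gives tenfold larger disparities at every depth, so depth differences that fall below one pixel with the narrow baseline remain measurable with the wide one.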
On the other hand, for the hybrid system, the disparities are large and distinguishable in the same range.

Addressing occlusion: The proposed method, which registers the high-resolution image and the light field sub-aperture images using optical flow estimation, can handle occluded regions to a large extent. This is demonstrated

in Figure 14. In Figure 14(a), the high-resolution regular camera image is shown. A low-resolution Lytro sub-aperture image is given in Figure 14(b). In Figure 14(c), the resolution-enhanced version of the sub-aperture image is provided. As seen in the zoomed-in regions, the proposed method handles occlusion well in most parts, as the optical flow based registration aims to minimize the brightness difference. However, there are still some regions where the difference between the resolution-enhanced result and the low-resolution input is large. One method to detect these occluded regions is to threshold the absolute difference: when the difference in any color channel is larger than a pre-determined threshold, the corresponding pixel is marked as occluded. Figure 14(d) shows the occlusion mask when the threshold is set to 0.175 (chosen by trial and error), with the pixel values in the range 0 to 1. The occluded regions can then be filled with the pixel values from the original light field image, as shown in Figure 14(e). In this approach, the value of the threshold is critical. Because a pixel is marked as occluded only when the difference exceeds the threshold, choosing the threshold too large may cause occluded regions to be missed; on the other hand, a threshold that is too small marks too many pixels as occluded and may lead to transferring noisy and low-resolution data into the final image.

Fig. 9 Zoomed-in regions from Figure 8. Panels: Dansereau et al. [10]; Cho et al. [9]; Boominathan et al. [6]; proposed method.

Fig. 10 Comparison of EPI images. The corresponding EPI lines are marked in Figure 8. Panels (for each of the two EPI lines): Dansereau et al. [10]; Boominathan et al. [6]; proposed method.

Fig. 11 Post-capture digital refocusing to close, middle and far depth using the shift-and-sum technique. (a) Single-sensor (Lytro) light field refocusing. (b) Resolution-enhanced (using alpha blending) light field refocusing. (c) Resolution-enhanced (using wavelet-based fusion) light field refocusing. Rows: close focus, mid focus, far focus. The bottom row shows zoomed-in regions from middle-depth focusing.

Fig. 12 Post-capture digital refocusing of single-sensor (Lytro) light field data and resolution-enhanced (using alpha blending) light field data using the shift-and-sum technique. (a) Close-depth focus. (b) Middle-depth focus. (c) Far-depth focus. Each panel pairs the Lytro result with the resolution-enhanced result. The bottom row shows zoomed-in regions.

Fig. 13 Disparity map comparison. (a) Leftmost Lytro sub-aperture image. (b) Rightmost Lytro sub-aperture image. (c) Regular camera image (before photometric registration). (d) Disparity map of the Lytro software. (e) Disparity map of [8]. (f) Disparity map between the leftmost and rightmost Lytro sub-aperture images. (g) Disparity map between the middle Lytro sub-aperture image and the regular camera image. (h) Disparities of the target object centers.

5 Conclusions

In this paper, we presented a hybrid imaging system that includes a light field camera and a regular camera. The system, while keeping the capabilities of light field imaging, improves both spatial resolution and depth estimation range/accuracy due to the increased baseline. Because the fixed stereo system allows pre-calibration, and because light field sub-aperture images are captured on a regular grid, the registration of low-resolution light field sub-aperture images and high-resolution regular camera images is simplified. With proper image registration, even a simple image fusion method, such as alpha blending, produces good results. The method compares favorably against several methods in the literature, including two single-image light field decoders and a hybrid-camera light field resolution enhancer.

References

1. Lytro, Inc. https://www.lytro.com/
2. Raytrix, GmbH. https://www.raytrix.de/
3. Adelson, E.H., Bergen, J.R.: The plenoptic function and the elements of early vision. In: Computational Models of Visual Processing, pp. 3-20. MIT Press (1991)
4. Alam, M.Z., Gunturk, B.K.: Hybrid stereo imaging including a light field and a regular camera. In: IEEE Signal Processing and Communication Applications Conf., pp. 1293-1296 (2016)
5. Bishop, T.E., Zanetti, S., Favaro, P.: Light field superresolution. In: IEEE Int. Conf. on Computational Photography, pp. 1-9 (2009)
6. Boominathan, V., Mitra, K., Veeraraghavan, A.: Improving resolution and depth-of-field of light field cameras using a hybrid imaging system. In: IEEE Int. Conf. on Computational Photography, pp. 1-10 (2014)
7. Broxton, M., Grosenick, L., Yang, S., Cohen, N., Andalman, A., Deisseroth, K., Levoy, M.: Wave optics theory and 3D deconvolution for the light field microscope. Optics Express 21, 25418-25439 (2013)
8. Calderon, F.C., Parra, C.A., Niño, C.L.: Depth map estimation in light fields using an stereo-like taxonomy. In: IEEE Symp. on Image, Signal Processing and Artificial Vision, pp. 1-5 (2014)
9. Cho, D., Lee, M., Kim, S., Tai, Y.W.: Modeling the calibration pipeline of the Lytro camera for high quality light-field image reconstruction. In: IEEE Int. Conf. on Computer Vision, pp. 3280-3287 (2013)
10. Dansereau, D.G., Pizarro, O., Williams, S.B.: Decoding, calibration and rectification for lenselet-based plenoptic cameras. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1027-1034 (2013)
11. Gallup, D., Frahm, J.M., Mordohai, P., Pollefeys, M.: Variable baseline/resolution stereo. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1-8 (2008)
12. Georgiev, T.: New results on the plenoptic 2.0 camera. In: Asilomar Conf. on Signals, Systems and Computers, pp. 1243-1247 (2009)
13. Georgiev, T., Zheng, K.C., Curless, B., Salesin, D., Nayar, S., Intwala, C.: Spatio-angular resolution tradeoffs in integral photography. In: Eurographics Conf. on Rendering Techniques, pp. 263-272 (2006)
14. Gershun, A.: The light field. Journal of Mathematics and Physics 18(1), 51-151 (1939)
15. Gortler, S.J., Grzeszczuk, R., Szeliski, R., Cohen, M.F.: The lumigraph. In: ACM Conf. on Computer Graphics and Interactive Techniques, pp. 43-54 (1996)
16. Grossberg, M.D., Nayar, S.K.: Determining the camera response from images: what is knowable? IEEE Trans. on Pattern Analysis and Machine Intelligence 25(11), 1455-1467 (2003)
17. Jeon, H.G., Park, J., Choe, G., Park, J.: Accurate depth map estimation from a lenslet light field camera. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1547-1555 (2015)
18. Junker, A., Stenau, T., Brenner, K.H.: Scalar wave-optical reconstruction of plenoptic camera images. Applied Optics 53(25), 5784-5790 (2014)
19. Kalantari, N.K., Wang, T.C., Ramamoorthi, R.: Learning-based view synthesis for light field cameras. ACM Trans. on Graphics 35(6), 193:1-10 (2016)
20. Levoy, M., Hanrahan, P.: Light field rendering. In: ACM Int. Conf. on Computer Graphics and Interactive Techniques, pp. 31-42 (1996)
21. Lippmann, G.: Epreuves reversibles donnant la sensation du relief. J. Phys. Theor. Appl. 7(1), 821-825 (1908)
22. Liu, C.: Beyond pixels: exploring new representations and applications for motion analysis. In: MIT (2009)
23. Lumsdaine, A., Georgiev, T.: The focused plenoptic camera. In: IEEE Int. Conf. on Computational Photography, pp. 1-8 (2009)
24. Mitchell, H.B.: Image fusion: theories, techniques and applications. Springer (2010)
25. Mitra, K., Veeraraghavan, A.: Light field denoising, light field superresolution and stereo camera based refocussing using a GMM light field patch prior. In: IEEE Conf. on Computer Vision and Pattern Recognition Workshops, pp. 22-28 (2012)
26. Ng, R.: Fourier slice photography. ACM Trans. on Graphics 24(3), 735-744 (2005)
27. Perez, F., Perez, A., Rodriguez, M., Magdaleno, E.: Fourier slice super-resolution in plenoptic cameras. In: IEEE Int. Conf. on Computational Photography, pp. 1-11 (2012)
28. Shi, L., Hassanieh, H., Davis, A., Katabi, D., Durand, F.: Light field reconstruction using sparsity in the continuous Fourier domain. ACM Trans. on Graphics 34(1), 12:1-13 (2014)
29.
Shroff, S.A., Berkner, K.: Image formation analysis and high resolution image reconstruction for plenoptic imaging systems. Applied Optics 52(10), 22 31 (2013) 30. Tao, M., Hadap, S., Malik, J., Ramamoorthi, R.: Depth from combining defocus and correspondence using lightfield cameras. In: IEEE Int. Conf. on Computer Vision, pp. 673 680 (2013)

Hybrid Light Field Imaging for Improved Spatial Resolution and Depth Range (a) (b) (c) 11 (d) (e) Fig. 14 Occlusion handling. (a) High-resolution regular camera image. (b) Lytro light field sub-aperture image. (c) Resolution enhanced light field sub-aperture image. (d) Occlusion mask. (e) Occluded regions are filled from Lytro sub-aperture image. 31. Trujillo-Sevilla, J.M., Rodriguez-Ramos, L.F., Montilla, I., Rodriguez-Ramos, J.M.: High resolution imaging and wavefront aberration correction in plenoptic systems. Optics Letters 39(17), 5030 5033 (2014) 32. Unger, J., Wenger, A., Hawkins, T., Gardner, A., Debevec, P.: Capturing and rendering with incident light fields. In: Eurographics Workshop on Rendering, pp. 141 149 (2003) 33. Veeraraghavan, A., Raskar, R., Agrawal, A., Mohan, A., Tumblin, J.: Dappled photography: mask enhanced cameras for heterodyned light fields and coded aperture refocusing. ACM Trans. on Graphics 26(3), 69:1 12 (2007) 34. Wang, X., Li, L., Houi, G.: High-resolution light field reconstruction using a hybrid imaging system. Applied Optics 55(10), 2580 2593 (2016) 35. Wanner, S., Goldluecke, B.: Globally consistent depth labeling of 4D light fields. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 41 48 (2012) 36. Wanner, S., Goldluecke, B.: Spatial and angular variational super-resolution of 4D light fields. In: IEEE Conf. on Computer Vision and Pattern Recognition, pp. 901 908 (2012) 37. Wilburn, B., Joshi, N., Vaish, V., Talvala, E.V., Antunez, E., Barth, A., Adams, A., Horowitz, M., Levoy, M.: High performance imaging using large camera arrays. ACM Trans. on Graphics 24(3), 765 776 (2005) 38. Wu, J., Wang, H., Wang, X., Zhang, Y.: A novel light field super-resolution framework based on hybrid imaging system. In: Visual Communications and Image Processing, pp. 1 4 (2015) 39. Yang, J.C., Everett, M., Buehler, C., McMillan, L.: A real-time distributed light field camera. In: Eurographics Workshop on Rendering, pp. 
77 86 (2002) 40. Yoon, Y., Jeon, H.G., Yoo, D., Lee, J.Y., Kweon, I.S.: Learning a deep convolutional network for light-field image super-resolution. In: IEEE Int. Conf. Computer Vision Workshop, pp. 24 32 (2015) 41. Yu, Z., Guo, X., Ling, H., Lumsdaine, A., Yu, J.: Line assisted light field triangulation and stereo matching. In: IEEE Int. Conf. on Computer Vision, pp. 2792 2799 (2013) 42. Zeeuw, P.M.: Wavelet and image fusion. In: CWI (1998)