FOCAL LENGTH CHANGE COMPENSATION FOR MONOCULAR SLAM


Takafumi Taketomi, Nara Institute of Science and Technology, Japan
Janne Heikkilä, University of Oulu, Finland

ABSTRACT

In this paper, we propose a method for handling focal length changes in SLAM algorithms. Our method is designed as a pre-processing step that first estimates the change of the camera focal length and then compensates for the zooming effect before the actual SLAM algorithm runs. With our method, camera zooming can be used in existing SLAM algorithms with only minor modifications. In the experiments, the effectiveness of the proposed method was quantitatively evaluated. The results indicate that the method can successfully deal with abrupt changes of the camera focal length.

Index Terms: SLAM, Camera Zoom, Augmented Reality

1. INTRODUCTION

In augmented reality (AR), camera pose estimation is necessary for achieving geometric registration between the real and virtual worlds. Many camera pose estimation methods have been proposed in the AR and computer vision research fields, and SLAM-based camera pose estimation in particular is an active research topic. SLAM-based methods estimate the camera pose and the 3D structure of the target environment simultaneously. A SLAM algorithm is composed of a tracking process and a mapping process: natural features in the input images are tracked across successive frames, and the 3D positions of those features are estimated in the mapping process. In general, the intrinsic camera parameters are calibrated in advance and kept fixed during SLAM-based pose estimation. This assumption means that SLAM algorithms do not allow camera zooming, because zooming changes the camera focal length.

In computer vision research, many camera parameter estimation methods have been proposed. These methods can be divided into two groups: camera parameter estimation with known and with unknown 3D references. The latter is often referred to as auto-calibration or self-calibration. Camera parameter estimation from 2D-3D correspondences is known as the Perspective-n-Point (PnP) problem, and many methods have been proposed for solving it when the intrinsic camera parameters are unknown [1, 2, 3, 4, 5]. These methods can estimate the focal length and the extrinsic camera parameters, but they cannot be used in unknown environments because they all need several 3D reference points. Camera parameter estimation methods based on 2D-2D correspondences have also been proposed [6, 7, 8]. They are usually used in offline 3D reconstruction, such as structure-from-motion [9]. Although camera parameter estimation from 2D-2D correspondences is possible in unknown environments, these methods are not suitable for SLAM algorithms: the method of [6] needs a projective reconstruction in advance, and the methods of [7, 8] consider only two-view constraints. Pre-calibration-based methods have also been proposed [10, 11]. These methods can estimate the focal length and the extrinsic camera parameters accurately by exploiting the dependencies among the intrinsic camera parameters. To build a lookup table of these dependencies, the intrinsic camera parameters are calibrated in advance at each zoom magnification. Although the pre-calibration information imposes a strong constraint on the online camera parameter estimation process, the pre-calibration step decreases the usability of the application.
In this research, we focus on SLAM-based camera pose estimation and propose a method for handling the focal length change caused by camera zooming. The proposed method is designed as a pre-processing step of the SLAM algorithm. The camera zooming effect in the current image is compensated for using the estimated focal length change, as shown in Fig. 1. With this pre-processing, existing SLAM algorithms can handle camera zooming.

2. REMOVING THE CAMERA ZOOMING EFFECT

The method is composed of four parts, as shown in Fig. 2. In our method, we assume that the principal point is located at the center of the image, the aspect ratio is unity, the skew is zero, and lens distortion can be ignored. In addition, we assume fixed intrinsic camera parameters during the initialization process of the SLAM algorithm. These assumptions are reasonable for current consumer camera devices and SLAM algorithms.
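Under these assumptions, the intrinsic matrix is fully determined by the focal length. A minimal sketch of building it (our own illustration, not code from the paper; f is assumed to be expressed in pixels):

```python
import numpy as np

def intrinsic_matrix(f, width, height):
    """Intrinsics under the paper's assumptions: principal point at the
    image center, unit aspect ratio, zero skew, no lens distortion."""
    return np.array([[f,   0.0, width / 2.0],
                     [0.0, f,   height / 2.0],
                     [0.0, 0.0, 1.0]])
```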

[Fig. 1. Image compensation for removing the camera zooming effect. Left: an input image; right: the image compensated using the estimated focal length change.]

[Fig. 2. Flow diagram of the proposed method. Tracking process: 1) initialization of the SLAM map, 2) projection matrix estimation for the current frame, 3) focal length estimation, 4) filtering of the estimated focal length, 5) image compensation, 6) map tracking. Mapping process: 1) bundle adjustment considering the varying focal length, 2) update of the focal length information of the keyframes.]

2.1. Focal Length Change Estimation

The focal length change estimation process is based on the method described in [12], in which the focal length of each image is estimated from the projection matrices of the cameras. That method was originally designed for offline metric reconstruction, because a projective reconstruction is needed before focal length estimation. We extend it to achieve sequential focal length estimation: the projection matrix of the current frame is estimated from tracked natural features, and the focal length change is determined from this matrix and the projection matrices of the keyframes.

Projection Matrix Estimation: To estimate the projection matrix of the current frame, the natural features used for estimating the camera parameters of the previous frame are tracked with the Lucas-Kanade tracker [13]. From these tracked features, the projection matrix M of the current frame is estimated by minimizing the following cost function [14]:

E_p = \sum_{i \in S} \| \mathbf{x}_i - \mathrm{proj}(\mathbf{X}_i) \|^2    (1)

where S is the set of natural features tracked in the current frame, x_i is the image position of tracked feature i, and proj() projects the 3D point X_i onto the image using M. The initial estimate of M is obtained with a linear algorithm, and the cost function is then minimized with the Levenberg-Marquardt algorithm.

Focal Length Estimation: The focal length of the current frame is estimated from the projection matrices of the current frame and the keyframes. At least three viewpoints are needed to estimate the focal length [12]. In the map initialization process, two keyframes are used for estimating the initial 3D points by stereo measurement [15]. Because two keyframes are therefore already available after initialization, focal length estimation can be done in real time during tracking. First, the keyframes that were used for determining the 3D positions of the tracked natural features are selected from the map. In addition, the first keyframe, which was used for initialization, is always selected to provide the reference focal length. The relationship between the intrinsic camera parameters and the projection matrices of the selected keyframes and the current frame is

K_i K_i^T = M_i \Omega M_i^T    (2)

where \Omega is the absolute quadric, a 4x4 matrix. The intrinsic parameter matrices K_i and the absolute quadric \Omega can be computed using the rank-3 constraint [12].
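The following sketch illustrates both steps. It is our own illustration, not the authors' code: the helper names are ours, SciPy's Levenberg-Marquardt solver is one possible choice for the non-linear refinement, and we assume image coordinates shifted so the principal point is at the origin, which makes K K^T diagonal.

```python
import numpy as np
from scipy.optimize import least_squares

def project(M, X):
    """Project 3D points X (N x 3) with a 3x4 projection matrix M."""
    Xh = np.hstack([X, np.ones((len(X), 1))])   # homogeneous coordinates
    xh = (M @ Xh.T).T
    return xh[:, :2] / xh[:, 2:3]               # perspective division

def refine_projection_matrix(M0, x, X):
    """Minimize Eq. (1), E_p = sum_i ||x_i - proj(X_i)||^2, starting from a
    linear (DLT-style) estimate M0 of the current frame's projection matrix."""
    def residuals(m):
        return (x - project(m.reshape(3, 4), X)).ravel()
    res = least_squares(residuals, M0.ravel(), method="lm")  # Levenberg-Marquardt
    return res.x.reshape(3, 4)

def focal_from_projection(M, Omega):
    """Recover a focal length via Eq. (2), K K^T ~ M Omega M^T. With a
    centered principal point, unit aspect ratio, and zero skew,
    K K^T = diag(f^2, f^2, 1) up to a projective scale."""
    w = M @ Omega @ M.T
    w = w / w[2, 2]          # fix the projective scale
    if w[0, 0] <= 0:         # negative f^2: estimate rejected (see Sec. 2.2)
        return None
    return float(np.sqrt(w[0, 0]))
```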
The magnification of the camera zooming can be estimated from the ratio f_{1,t} between the focal length f_1 of the first keyframe and the focal length f_t of the current frame:

f_{1,t} = f_1 / f_t    (3)

Note that the focal length ratio f_{1,t} can be regarded as an absolute focal length value, because SLAM-based reconstruction has a scale ambiguity: if the initial focal length is set to 1, the focal length ratio equals the focal length in the successive frames.
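As a concrete reading of Eq. (3) under the convention f_1 = 1 (the values below are illustrative, not from the paper):

```python
f_1 = 1.0          # reference focal length of the first keyframe, fixed to 1
f_t = 0.8          # focal length recovered for the current frame via Eq. (2)
f_1t = f_1 / f_t   # Eq. (3): zoom magnification relative to the first keyframe
```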

2.2. Robust Filtering

The focal length estimation process is sensitive to estimation errors in the projection matrices. To obtain a stable focal length ratio, we employ two filtering processes: median filtering for robust estimation and temporal filtering for smooth estimation.

Median Filtering for Robust Focal Length Estimation: To achieve a stable estimate, we apply a median filter to the focal length ratios obtained as in Sec. 2.1. In the focal length estimation process, the focal length ratio f_{1,t} between the first keyframe and the current frame is estimated, and the ratios f_{2,t}, f_{3,t}, ..., f_{n,t} between the other keyframes and the current frame are estimated as well, as shown in Fig. 3 (n is the number of selected keyframes). In addition, the focal length ratios f_{1,2}, f_{1,3}, ..., f_{1,n} between the first keyframe and the other keyframes have already been estimated before the focal length estimation of the current frame. From these values we obtain the following candidates for the focal length ratio between the first keyframe and the current frame:

f_{1,t}, \; f_{1,2} f_{2,t}, \; f_{1,3} f_{3,t}, \; \ldots, \; f_{1,n} f_{n,t}    (4)

The median of these candidates is selected as the focal length ratio f_{1,t} between the first keyframe and the current frame.

[Fig. 3. Focal length ratio estimation by median filtering: the ratios f_{1,2}, f_{1,3}, ... between the first keyframe and the other keyframes are chained with the ratios f_{2,t}, f_{3,t}, ... between those keyframes and the current frame t.]

Temporal Filtering for Smooth Estimation: After median filtering, the focal length ratio still contains some noise that would cause annoying jitter between frames. To reduce this noise, we apply temporal filtering to smooth the estimate:

\hat{f}_{1,t} = \alpha f_{1,t} + (1 - \alpha) \hat{f}_{1,t-1}    (5)

where \hat{f} is the filtered focal length ratio and \alpha is a smoothing coefficient. Because the actual focal length ratio can change over successive frames, we define the following criteria to tolerate smooth changes:

|f_{1,t} - \hat{f}_{1,t-1}| < \varepsilon_1: the estimated focal length ratio of the current frame is similar to the filtered value of the previous frame.

|f_{1,t} - f_{1,t-1}| < \varepsilon_2: similar focal length ratios are estimated in the current and previous frames.

|\nabla f_{1,t} - \nabla f_{1,t-1}| < \varepsilon_3: the gradients of the estimated focal lengths are similar, where \nabla f_{1,t} = f_{1,t} - f_{1,t-1} and \nabla f_{1,t-1} = f_{1,t-1} - f_{1,t-2}.

The second and third conditions detect a focal length change. If the estimated focal length ratio f_{1,t} satisfies one or more of the conditions, it is accepted and used in the filtering of Eq. (5). If all conditions are false, the filtered focal length ratio of the previous frame is used as the input instead, i.e., f_{1,t} = \hat{f}_{1,t-1}. In addition, the focal length ratio sometimes cannot be obtained by the estimation method of Sec. 2.1; this happens when the solution for f_i^2 in Eq. (2) is negative. In that case, too, the filtered ratio of the previous frame is used in Eq. (5). Finally, the input image is scaled using the filtered focal length ratio \hat{f}_{1,t}.
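A compact sketch of the two-stage filtering and the final image scaling. This is our own illustration: the threshold values, the state handling, and the scaling convention in compensate_image (multiplying the image size by the filtered ratio) are assumptions; cv2 denotes OpenCV.

```python
import numpy as np
import cv2

def median_ratio(f_1t, f_kt, f_1k):
    """Eq. (4): median over the direct ratio f_{1,t} and the chained
    candidates f_{1,k} * f_{k,t} for the other selected keyframes k."""
    candidates = [f_1t] + [a * b for a, b in zip(f_1k, f_kt)]
    return float(np.median(candidates))

def temporal_filter(f_1t, state, alpha=0.3, eps=(0.05, 0.05, 0.05)):
    """Eq. (5) gated by the three acceptance criteria; `state` holds the
    previous filtered value and the two previous raw estimates."""
    f_hat_prev, f_prev, f_prev2 = state
    grad, grad_prev = f_1t - f_prev, f_prev - f_prev2
    accepted = (abs(f_1t - f_hat_prev) < eps[0]
                or abs(f_1t - f_prev) < eps[1]
                or abs(grad - grad_prev) < eps[2])
    if not accepted:                       # all criteria false: reuse previous
        f_1t = f_hat_prev
    f_hat = alpha * f_1t + (1.0 - alpha) * f_hat_prev
    return f_hat, (f_hat, f_1t, f_prev)

def compensate_image(image, f_hat):
    """Scale the input image by the filtered ratio to cancel the zoom;
    cropping or padding back to the original size is omitted for brevity."""
    h, w = image.shape[:2]
    return cv2.resize(image, (int(round(w * f_hat)), int(round(h * f_hat))))
```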
2.3. Bundle Adjustment

In bundle adjustment, which is part of the mapping process shown in Fig. 2, changes of the focal length must also be compensated for. In the proposed method, we modify the cost function to include a scale factor s_i that represents the error of the focal length ratio estimated in the online process:

E = \sum_{i \in F} \sum_{j \in P} \| \mathbf{x}_{ij} - \mathrm{proj}_i(\mathbf{X}_j) \|^2    (6)

where F is the set of keyframes and P is the set of reconstructed 3D points, and proj_i() projects the 3D point X_j onto keyframe i using the extrinsic and intrinsic camera parameters:

\mathbf{x}_{ij} \sim K(s_i) \, [R_i \mid \mathbf{t}_i] \, \mathbf{X}_j    (7)

where R_i and t_i are the rotation and translation of keyframe i, s_i is its scale factor, K(s_i) is the intrinsic matrix with the focal length corrected by s_i, and x_{ij} is the projected position of X_j in image coordinates. Solutions for R_i, t_i, s_i, and X_j are obtained by minimizing the cost function E with a non-linear optimization method such as the Levenberg-Marquardt algorithm. After the optimization, the focal length ratio of each keyframe is updated:

f_{i,new} = f_{i,old} / s_i    (8)
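A sketch of the scale-aware projection of Eq. (7) and the update of Eq. (8). This is our own illustration; in particular, parameterizing K(s_i) with the focal length divided by s_i is an assumption chosen to be consistent with Eq. (8), and centered image coordinates are assumed as before.

```python
import numpy as np

def project_keyframe(f, s, R, t, X):
    """Project 3D points X (N x 3) into a keyframe with the focal length
    corrected by the scale factor s (Eq. (7)); centered principal point,
    unit aspect ratio, and zero skew are assumed."""
    K = np.diag([f / s, f / s, 1.0])   # corrected intrinsics, cf. Eq. (8)
    xc = (R @ X.T).T + t               # world -> camera coordinates
    xh = (K @ xc.T).T
    return xh[:, :2] / xh[:, 2:3]

def update_keyframe_focal(f_old, s):
    """Eq. (8): update the keyframe focal length after bundle adjustment."""
    return f_old / s
```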

3. EXPERIMENT

To demonstrate the effectiveness of the proposed method, the accuracy of the focal length estimation was quantitatively evaluated. We used PTAM [15] as the existing SLAM algorithm. In all experiments, the hardware was a desktop PC (CPU: Intel Core i5-3570, 3.4 GHz; memory: 8 GB) and a Sony NEX-VG900 video camera recording 640×480 pixel images through an optical zoom lens (Sony SEL1018, f = 10-18 mm). The accuracy of the estimated focal length ratio was evaluated on two sequences: a non-zoomed sequence and a zoomed sequence. In both experiments, the first 300 frames were used for initialization, with the focal length fixed at 1.0.

[Fig. 4. The estimation result of the focal length ratio in the non-zoomed sequence.]

[Fig. 5. The estimation result of the focal length ratio in the zoomed sequence, with reference values from offline reconstruction.]

Non-zoomed Sequence: In this case, the camera moved freely in the real environment, including translation and rotation. The maximum distance between the camera and the target scene was about 2 meters. Fig. 4 shows the result of the focal length estimation; the estimated focal length ratios should lie at 1. The average focal length estimation error was 0.012 with a standard deviation of 0.019. This result confirms that the focal length of the input image was estimated accurately, and it indicates that the proposed method has little effect on the accuracy of the conventional SLAM algorithm.

Zoomed Sequence: In this case, the camera moved freely in the real environment, including translation, rotation, and camera zooming. To evaluate the accuracy of the focal length estimation, reference focal length values were obtained by an offline reconstruction method [16, 17] at every 30th frame. Figs. 5 and 6 show the estimated focal length and its estimation error in each frame, respectively; in Fig. 5, the triangles are the reference focal length ratios obtained from the offline reconstruction. The average focal length estimation error was 0.113 with a standard deviation of 0.109. The result confirms that the proposed method can estimate the focal length change with reasonable accuracy. However, the estimated focal length ratio involves a small delay, which is caused by the temporal filtering process.

[Fig. 6. Focal length estimation error in each frame of the zoomed sequence.]

In addition, a large spike can be observed around the 4000th frame. At this point, the camera moved along the optical axis while zooming simultaneously. In general, zooming and translation along the optical axis cause an ambiguity that is difficult to resolve, especially if the scene structure is relatively flat. For SLAM this is probably a rare case, and it could be avoided by adding more heuristics to the algorithm.

The execution time of our pre-processing algorithm is shown in Table 1. Half of the time for estimating the projection matrix was spent in the Lucas-Kanade tracker (5.51 ms). The result confirms that the proposed method can still work in real time.

Table 1. Average computational time for each process.

Process                        Time (ms)
Projection matrix estimation   11.78
Focal length estimation        0.08
Robust filtering               0.51
Image compensation             0.27
Map tracking                   13.58
Total                          26.22

4. CONCLUSION

In this paper, we proposed a focal length change compensation method for dealing with camera zooming in SLAM algorithms. The main benefit of the method is that the camera zooming effect in the input image is compensated for before the tracking process of the SLAM algorithm, which enables existing SLAM algorithms to be used together with our method. To estimate the focal length change, we developed an online focal length estimation framework in which the estimated focal length is filtered in two stages to achieve a more stable result. The effectiveness of the proposed method was demonstrated in the experiments.

5. REFERENCES

[1] M. A. Abidi and T. Chandra, A new efficient and direct solution for pose estimation using quadrangular targets: Algorithm and evaluation, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 17, no. 5, pp. 534-538, 1995.

[2] B. Triggs, Camera pose and calibration from 4 or 5 known 3D points, Proc. Int. Conf. on Computer Vision, pp. 278-284, 1999.

[3] M. Bujnak, Z. Kukelova, and T. Pajdla, A general solution to the P4P problem for camera with unknown focal length, Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1-8, 2008.

[4] M. Bujnak, Z. Kukelova, and T. Pajdla, New efficient solution to the absolute pose problem for camera with unknown focal length and radial distortion, Proc. Asian Conf. on Computer Vision, pp. 11-24, 2010.

[5] Z. Kukelova, M. Bujnak, and T. Pajdla, Real-time solution to the absolute pose problem with unknown radial distortion and focal length, Proc. Int. Conf. on Computer Vision, pp. 2816-2823, 2013.

[6] M. Pollefeys, R. Koch, and L. Van Gool, Self-calibration and metric reconstruction in spite of varying and unknown internal camera parameters, Int. J. of Computer Vision, pp. 7-25, 1999.

[7] H. Stewenius, D. Nister, F. Kahl, and F. Schaffalitzky, A minimal solution for relative pose with unknown focal length, Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pp. 789-794, 2005.

[8] H. Li, A simple solution to the six-point two-view focal-length problem, Proc. European Conf. on Computer Vision, vol. 4, pp. 200-213, 2006.

[9] N. Snavely, S. M. Seitz, and R. Szeliski, Photo tourism: Exploring photo collections in 3D, ACM Trans. on Graphics, pp. 835-846, 2006.

[10] P. Sturm, Self-calibration of a moving zoom-lens camera by pre-calibration, Int. J. of Image and Vision Computing, vol. 15, pp. 583-589, 1997.

[11] T. Taketomi, K. Okada, G. Yamamoto, J. Miyazaki, and H. Kato, Camera pose estimation under dynamic intrinsic parameter change for augmented reality, Computers and Graphics, vol. 44, pp. 11-19, 2014.

[12] M. Pollefeys, R. Koch, and L. Van Gool, Self-calibration and metric reconstruction in spite of varying and unknown internal camera parameters, Int. J. of Computer Vision, pp. 7-25, 1999.

[13] B. Lucas and T. Kanade, An iterative image registration technique with an application to stereo vision, Proc. Int. Joint Conf. on Artificial Intelligence, pp. 674-679, 1981.

[14] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, second edition, 2004.

[15] G. Klein and D. Murray, Parallel tracking and mapping for small AR workspaces, Proc. Int. Symp. on Mixed and Augmented Reality, pp. 225-234, 2007.

[16] C. Wu, Towards linear-time incremental structure from motion, Proc. Int. Conf. on 3D Vision, pp. 127-134, 2013.

[17] C. Wu, S. Agarwal, B. Curless, and S. M. Seitz, Multicore bundle adjustment, Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pp. 3057-3064, 2011.