Photo-Consistent Motion Blur Modeling for Realistic Image Synthesis

Huei-Yung Lin and Chia-Hong Chang
Department of Electrical Engineering, National Chung Cheng University,
168 University Rd., Min-Hsiung, Chia-Yi 621, Taiwan, R.O.C.
lin@ee.ccu.edu.tw, g934156@ccu.edu.tw

Abstract. Motion blur is an important visual cue for the illusion of object motion. It has many applications in computer animation, virtual reality and augmented reality. In this work, we present a nonlinear imaging model for synthetic motion blur generation. It is shown that the response of the image sensor is determined by the optical parameters of the camera and can be derived by a simple photometric calibration process. Based on the nonlinear behavior of the image response, photo-realistic motion blur can be obtained and combined with real scenes with the least visual inconsistency. Experiments have shown that the proposed method generates more photo-consistent results than the conventional motion blur model.

1 Introduction

In the past few years, we have witnessed the convergence of computer vision and computer graphics [1]. Although traditionally regarded as inverse problems of each other, image-based rendering and modeling share the common ground of these two research fields. One of their major topics is how to synthesize computer images for graphics representations based on the knowledge of a given vision model. This problem commonly arises in the application domains of computer animation, virtual reality and augmented reality [2]. For computer animation and virtual reality, synthetic images are generated from existing graphical models for rendering purposes. Augmented reality, on the other hand, requires the image composition of virtual objects and real scenes in a natural way. The ultimate goal of these applications is usually to make the synthetic images look as realistic as those actually filmed by cameras.
Most of the previous research on rendering synthetic objects into real scenes deals with static image composition, even when used for generating synthetic video sequences. When modeling a scene containing an object that moves quickly during the finite camera exposure time, it is not possible to insert the object directly into the scene by simple image overlay. In addition to the geometric and photometric consistency imposed on the object for the given viewpoint, motion blur or temporal aliasing due to the relative motion between the camera and the scene usually has to be taken into account.

L.-W. Chang, W.-N. Lie, and R. Chiang (Eds.): PSIVT 2006, LNCS 4319, pp. 1273–1282, 2006.
© Springer-Verlag Berlin Heidelberg 2006

It is a very important visual cue to human
perception for the illusion of object motion, and is commonly used in photography to illustrate the dynamic features of a scene. For computer-generated or stop motion animations with a limited temporal sampling rate, unpleasant effects such as a jerky or strobing appearance may be present in the image sequence if motion blur is not modeled appropriately.

Early research on the simulation of motion blur suggested a method of convolving the original image with the linear optical system transfer function derived from the motion path [3,4]. The uniform point spread function (PSF) was demonstrated in their work, but high-degree resampling filters were later adopted to further improve the results of temporal anti-aliasing [5]. More recently, Sung et al. introduced visibility and shading functions in the spatial-temporal domain for motion blur image generation [6]. Brostow and Essa proposed a frame-to-frame motion tracking approach to simulate motion blur for stop motion animation [7]. Apart from the generation of realistic motion blur, some researchers have focused on real-time rendering using hardware acceleration for interactive graphics applications [8,9]. Although the results are smooth and visually consistent, they are only approximations due to the oversimplified image formation model.

It is commonly believed that the image acquisition process can be approximated by a linear system, and motion blur can thus be obtained from the convolution with a given PSF. However, the nonlinear behavior of image sensors becomes prominent when the light source changes rapidly during the exposure time [10]. In this case, the conventional method using a simple box filter cannot create a photo-realistic or photo-consistent motion blur phenomenon. This might not be a problem in purely computer-generated animation, but the inconsistency will certainly be noticeable in the image when combining virtual objects with real scenes.
Thus, in this work we propose a nonlinear imaging model for synthetic motion blur generation. The image formation is modified and incorporated with a nonlinear response function. More photo-consistent simulation results are then obtained by using the calibrated parameters of the given camera settings.

2 Image Formation Model

The process of image formation can be determined by the optical parameters of the lens, the geometric parameters of the camera projection model, and the photometric parameters associated with the environment and the CCD image sensor. To synthesize an image from the same viewpoint as the real scene image, however, only the photometric aspect of image formation has to be considered. From basic radiometry, the relationship between the scene radiance L and the image irradiance E is given by

E = L (π/4) (d/f)^2 cos^4(α)    (1)

where d, f and α are the aperture diameter, focal length and the angle between the optical axis and the line of sight, respectively [11]. Since the image
intensity is commonly used to represent the image irradiance, it is in turn assumed proportional to the scene radiance for a given set of camera parameters. Thus, most existing algorithms adopt a simple pinhole camera model for synthetic image generation of real scenes. Linear motion blur is generated by convolving the original image with a box filter or a uniform PSF [3]. Although image synthesis or composition is relatively easy to implement based on the above image formation, the results are usually not satisfactory when compared to the real images captured by a camera. Consequently, photo-realistic scene modeling cannot be accomplished by this simplified imaging model.

One major issue not explicitly considered in the previous approach is the nonlinear behavior of the image sensors. It is commonly assumed that the image intensity increases linearly with the camera exposure time for any given scene point. However, nonlinear sensors are generally designed to have the output voltage proportional to the log of the light energy for high dynamic range imaging [12,13]. Furthermore, from our observation, the response function of the image sensors is also affected by the F-number of the camera. To illustrate this phenomenon, an image printout with white, gray and black stripes is used as a test pattern. Image intensity values under different camera exposures are calibrated for various F-number settings. The plots of intensity value versus exposure time for both the black and gray image stripes (see footnote 1) are shown in Figure 1. The figures demonstrate that, prior to saturation, the intensity values increase nonlinearly with the exposure times. Although the nonlinear behaviors are not severe for large F-numbers (i.e., small aperture diameters), they are conspicuous for smaller F-numbers.
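As a small numeric illustration of the radiometric relation in Eq. (1), the following sketch evaluates the image irradiance for a given scene radiance. The function name and the sample values are ours, chosen only for demonstration; they are not from the paper.

```python
import math

def image_irradiance(L, d, f, alpha):
    """Image irradiance from scene radiance via Eq. (1):
    E = L * (pi/4) * (d/f)**2 * cos(alpha)**4."""
    return L * (math.pi / 4.0) * (d / f) ** 2 * math.cos(alpha) ** 4

# On the optical axis (alpha = 0), irradiance scales with (d/f)^2,
# i.e. with the inverse square of the F-number f/d; off-axis points
# receive less light due to the cos^4 falloff.
E_axis = image_irradiance(100.0, 0.025, 0.05, 0.0)
```

Note how halving the aperture diameter d (one stop smaller in area terms) reduces E by a factor of four, which is why the calibration below must be repeated per F-number.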
Another important observation is that, even with different scene radiance, the response curves for the black and gray patterns are very similar if the time axis is scaled by a constant. Figure 2 shows the response curves for several F-numbers normalized with respect to the gray and black image patterns. The results suggest that the intensity values of a scene point under different exposure times are governed by the F-number.

To establish a more realistic image formation model from the above observations, a monotonically increasing function with nonlinear behavior determined by additional parameters should be adopted. Since the response curves shown in Figure 1 cannot be easily fitted by gamma or log functions with various F-numbers, we model the intensity accumulation versus exposure using an operation similar to the capacitor charging process. The intensity value of an image pixel, I(t), is modeled as an inverse exponential function of the integration time t, given by

I(t) = I_max (1 - e^(-k d^2 ρ t))   for t ≤ T    (2)

where T, I_max, d, k and ρ are the exposure time, maximum intensity, aperture diameter, a camera constant, and a parameter related to the object surface's reflectance property, respectively. If all the parameters in Eq. (2) are known,

Footnote 1: The response of the white image pattern is not shown because it saturates within a small exposure range.
Fig. 1. Nonlinear behavior of the intensity response curves with different F-numbers (F = 2.4, 5, 7.1, 9, 11). (a) Intensity versus exposure time for the black (left) and gray (right) patterns. (b) A small exposure range clearly shows the nonlinear behavior of the intensity.

then it is possible to determine the intensity value of the image pixel for any exposure time less than T. For a general 8-bit greyscale image, the maximum intensity I_max is 255 and I(t) is always less than I_max. The aperture diameter d is the focal length divided by the F-number, and can be obtained from the camera settings. The parameters k and ρ are constants for any fixed scene point in the image. Thus, Eq. (2) can be rewritten as

I(t) = I_max (1 - e^(-k t))    (3)

for a given set of camera parameters. The single parameter k can then be determined by an appropriate calibration procedure with different camera settings.

To verify Eq. (3), we first observe that I(0) = 0, as expected for any camera settings. The intensity value saturates as t → ∞, and the larger the parameter k, the faster the saturation occurs. This is consistent with the physical model: k contains the reflectance of the scene point and thus represents the irradiance of the image point. Figures 1 and 2 illustrate that the nonlinear responses are not noticeable for small apertures, but they are evident for large aperture sizes. In either case, the response function can be modeled by Eq. (3)
with some constant k.

Fig. 2. Normalized response curves for different F-numbers: (a) F-2.4, k = .14; (b) F-5, k = .15; (c) F-7.1, k = .166; (d) F-11, k = .19.

Thus, the most important aspect of the equation is to characterize the image intensity accumulation versus integration time based on the fixed camera parameters. For a given intensity value, it is not possible to determine the exposure time, since the image irradiance also depends on the object's reflectance property. However, it is possible to calculate the image intensity of a scene point under any exposure if an intensity-exposure pair is given and the normalized response curve is known for the specific camera parameter settings. This is one of the requirements for generating space-variant motion blur, as described in the following section.

To obtain the normalized response function up to an unknown scale factor in the time domain, images of the calibration patterns are captured with various exposures, followed by least-squares fitting to find the parameter k for different F-numbers. As shown in Figure 2, the resulting fitted curves (black dashed lines) for any given F-number provide a good approximation to the actual measurements for both the black and gray patterns. This curve fitting and parameter estimation process can be referred to as photometric calibration of the response function. It should be noted that only the shape of the response curve is significant; the resulting function is normalized in the time axis by an arbitrary scale factor. Given the intensity value of an
image pixel with known camera exposure, the intensity of the corresponding scene point under a different amount of exposure can be calculated by Eq. (3).

3 Synthetic Motion Blur Image Generation

Motion blur arises when the relative motion between the scene and the camera is fast during the exposure time of the imaging process. The most commonly used model for motion blur is given by

g(x, y) = ∫_0^T f(x - x_0(t), y - y_0(t)) dt    (4)

where g(x, y) and f(x, y) are the blurred and ideal images, respectively, T is the duration of the exposure, and x_0(t) and y_0(t) are the time-varying components of motion in the x and y directions, respectively [3]. If only uniform linear motion in the x-direction is considered, the motion-blurred image can be generated by taking the average of the line integral along the motion direction. That is,

g(x, y) = (1/R) ∫_0^R f(x - ρ, y) dρ    (5)

where R is the extent of the motion blur. Eq. (5) essentially describes the blurred image as the convolution of the original (ideal) image with the uniform PSF

h(x, y) = 1/R if |x| ≤ R/2, and 0 otherwise    (6)

This model is de facto the most widely adopted method for generating motion blur images. Its discrete counterpart used for computation is given by

g[m, n] = (1/K) Σ_{i=0}^{K-1} f[m - i, n]    (7)

where K is the number of blurred pixels.

As an example of using the above image degradation model, the motion blur of an ideal step edge can be obtained by performing a spatial domain convolution with the PSF given by Eq. (6). The synthetic result is a ramp edge whose width equals the motion blur extent R. If this motion blur model is applied to a real edge image, however, the result is generally different from the recorded motion blur image. Figures 3(a), 3(b) and 3(c) illustrate the images and profiles of an ideal step edge, a motion blur edge created using Eq. (7), and a real motion blur edge captured by a camera, respectively.
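The discrete model of Eq. (7) can be sketched in a few lines of Python. The function name is ours, and the wrap-around treatment of border pixels via np.roll is a simplification for brevity, not part of the paper's model:

```python
import numpy as np

def uniform_motion_blur(f, K):
    """Conventional motion blur of Eq. (7):
    g[m, n] = (1/K) * sum_{i=0..K-1} f[m - i, n].
    Border rows wrap around via np.roll (our simplification)."""
    f = np.asarray(f, dtype=float)
    g = np.zeros_like(f)
    for i in range(K):
        g += np.roll(f, i, axis=0)  # f shifted by i pixels along the motion axis
    return g / K

# A step edge becomes a ramp of width K (away from the wrapped border).
step = np.zeros((8, 1))
step[4:] = 255.0
g = uniform_motion_blur(step, 2)
```

Applied to the step edge above, the pixel at the edge averages one dark and one bright neighbor to 127.5, reproducing the symmetric ramp profile of Figure 3(b).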
As shown in Figure 3(c), the profile indicates that there exists non-uniform weighting on the pixel intensities of real motion blur. Since the curve is not symmetric with respect to its midpoint, this nonlinear response is clearly not due to the optical defocus of the camera and cannot be described by a Gaussian process.
Fig. 3. Motion blur synthesis of an ideal edge image. (a) Ideal step edge and the corresponding intensity profile. (b) Motion blur edge generated using Eq. (7). (c) Real motion blur edge captured by a camera. (d) Motion blur edge synthesized by the proposed method.
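Before turning to the blur model itself, the response function of Eq. (3) and its photometric calibration can be sketched as follows. This is a minimal illustration under our own assumptions: the fit log-linearizes Eq. (3) and assumes noiseless, unsaturated samples; a robust fit over real measurements would need more care.

```python
import numpy as np

I_MAX = 255.0  # maximum intensity of an 8-bit greyscale image

def response(t, k):
    """Inverse-exponential response of Eq. (3): I(t) = I_max * (1 - exp(-k t))."""
    return I_MAX * (1.0 - np.exp(-k * np.asarray(t, dtype=float)))

def fit_k(times, intensities):
    """Photometric calibration sketch: estimate k from intensity-exposure
    pairs by log-linearizing Eq. (3), ln(1 - I/I_max) = -k t, and taking
    the least-squares slope through the origin."""
    t = np.asarray(times, dtype=float)
    y = np.log(1.0 - np.asarray(intensities, dtype=float) / I_MAX)
    return -np.sum(t * y) / np.sum(t * t)

# Synthetic check: samples generated with k = 0.15 should fit back to ~0.15.
t = np.linspace(1.0, 30.0, 10)
k_hat = fit_k(t, response(t, 0.15))
```

Only the shape of the curve matters, as noted in Section 2: rescaling the time axis simply rescales k, so the fitted value is meaningful up to the arbitrary normalization of the exposure axis.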
In this work, motion blur is modeled using the nonlinear response of the image sensor, as discussed in the previous section. For an image position under uniform motion blur, the intensity value is given by the integration of the image irradiance associated with different scene points during the exposure time. Every scene point in the motion path thus contributes intensity for a smaller yet equal exposure time. Although these partial intensity values can be derived from the static image with full exposure by linear interpolation in the time domain, the nonlinear behavior of the intensity response should also be taken into account. Suppose the monotonic response function is I(t); then the motion blur image g(x, y) is given by

g(x, y) = I( (1/R) ∫_0^R I^(-1)(f(x - ρ, y)) dρ )    (8)

where R is the motion blur extent and I^(-1)(·) is the inverse function of I(t). The discrete counterpart of Eq. (8) is given by

g[m, n] = I( (1/K) Σ_{i=0}^{K-1} I^(-1)(f[m - i, n]) )    (9)

where K is the number of blurred pixels. If we consider the special case that the response function I(t) is linear, then Eqs. (8) and (9) simplify to Eqs. (5) and (7), respectively.

Figure 3(d) shows the synthetic motion blur edge of Figure 3(a) and the corresponding profile of the image scanlines generated using Eq. (9). The response function I(t) is given by Eq. (3) with F-5 and k = .15. By comparing the generated images and profiles with those given by the real motion blur, the proposed model clearly gives more photo-consistent results than the one synthesized using a uniform PSF. The fact that brighter scene points contribute more to the image pixel intensities, as shown in Figure 3(c), is successfully modeled by the nonlinear response curve.

4 Results

Figure 4 shows the experimental results of a real scene. The camera is placed about 1 meter in front of the object (a tennis ball). The static image shown in Figure 4(a) is taken at F-5 with an exposure time of 1/8 second. Figure 4(b) shows the motion blur image taken under 3 mm/sec
lateral motion of the camera using the same set of camera parameters. The blur extent in the image is 18 pixels, which is used for synthetic motion blur image generation. Figure 4(c) illustrates the motion blur synthesized using the widely adopted uniform PSF for image convolution. The result generated using the proposed nonlinear response function is shown in Figure 4(d). For color images, the red, green and blue channels are processed separately. Motion blur images are first created for each channel using the same response curve, and then combined to form the final result.
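The nonlinear blur of Eq. (9) can be sketched as below. The function names, the fixed constant K_RESP (set to the paper's fitted F-5 value of .15), the clipping of saturated intensities before inversion, and the wrap-around border handling are all our own choices for a compact illustration:

```python
import numpy as np

I_MAX = 255.0
K_RESP = 0.15  # response constant of Eq. (3); the paper's fitted value for F-5

def I_resp(t):
    """Response function of Eq. (3): I(t) = I_max * (1 - exp(-k t))."""
    return I_MAX * (1.0 - np.exp(-K_RESP * t))

def I_inv(v):
    """Inverse response: map intensity back to (relative) exposure time.
    Saturated values are clipped just below I_max to keep the log finite."""
    v = np.clip(np.asarray(v, dtype=float), 0.0, I_MAX - 1e-6)
    return -np.log(1.0 - v / I_MAX) / K_RESP

def nonlinear_motion_blur(f, K):
    """Proposed blur of Eq. (9): average the linearized exposures of the
    K pixels along the motion path, then map back through the response.
    Border rows wrap around via np.roll (our simplification)."""
    f = np.asarray(f, dtype=float)
    acc = np.zeros_like(f)
    for i in range(K):
        acc += I_inv(np.roll(f, i, axis=0))
    return I_resp(acc / K)

# A constant image is left unchanged, while a mix of dark and bright
# pixels is biased toward the brighter value, as observed in Fig. 3(c).
flat = nonlinear_motion_blur(np.full((4, 1), 100.0), 3)
edge = np.zeros((8, 1))
edge[4:] = 200.0
g = nonlinear_motion_blur(edge, 2)
```

For color images, the per-channel processing described above amounts to calling such a function on each of the R, G and B channels with the same response curve and recombining the results.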
Fig. 4. Experimental results of a real scene: (a) static image; (b) real motion blur image; (c) motion blur generated using Eq. (7); (d) motion blur by the proposed method.

Fig. 5. Motion blur generated with a small F-number (large aperture size).

With careful examination of Figure 4, it is not difficult to find that the image shown in Figure 4(d) is slightly better than Figure 4(c). The image scanline profiles of Figure 4(d) are very close to those exhibited in the real motion blur image. Figure 5 (left) shows another example taken at F-2.4 with an exposure time of 1/8 second. The middle and right figures are the results using Eq. (7) and the proposed method, respectively. It is clear that the nonlinear behavior becomes prominent and has to be considered for more realistic motion blur synthesis.
5 Conclusion

Image synthesis or composition with the motion blur phenomenon has many applications in computer graphics and visualization. Most existing works generate motion blur by convolving the image with a uniform PSF. The results are usually not photo-consistent due to the nonlinear behavior of the image sensors. In this work, we have presented a nonlinear imaging model for synthetic motion blur generation. More photo-realistic motion blur can be obtained and combined with real scenes with the least visual inconsistency. Thus, our approach can be used to illustrate dynamic motion in still images, or to render fast object motion with a limited frame rate for computer animation.

References

1. Lengyel, J.: The convergence of graphics and vision. Computer 31(7) (1998) 46–53
2. Kutulakos, K.N., Vallino, J.R.: Calibration-free augmented reality. IEEE Transactions on Visualization and Computer Graphics 4(1) (1998) 1–20
3. Potmesil, M., Chakravarty, I.: Modeling motion blur in computer-generated images. In: Proceedings of the 10th Annual Conference on Computer Graphics and Interactive Techniques, ACM Press (1983) 389–399
4. Max, N.L., Lerner, D.M.: A two-and-a-half-D motion-blur algorithm. In: SIGGRAPH '85: Proceedings of the 12th Annual Conference on Computer Graphics and Interactive Techniques, New York, NY, USA, ACM Press (1985) 85–93
5. Dachille, F., Kaufman, A.: High-degree temporal antialiasing. In: Proceedings of Computer Animation (2000) 49–54
6. Sung, K., Pearce, A., Wang, C.: Spatial-temporal antialiasing. IEEE Transactions on Visualization and Computer Graphics 8(2) (2002) 144–153
7. Brostow, G., Essa, I.: Image-based motion blur for stop motion animation. In: SIGGRAPH 2001 Conference Proceedings, ACM SIGGRAPH (2001) 561–566
8. Wloka, M.M., Zeleznik, R.C.: Interactive real-time motion blur. The Visual Computer 12(6) (1996) 283–295
9. Meinds, K., Stout, J., van Overveld, K.: Real-time temporal anti-aliasing for 3D graphics.
In Ertl, T., ed.: VMV, Aka GmbH (2003) 337–344
10. Rush, A.: Nonlinear sensors impact digital imaging. Electronics Engineer (1998)
11. Forsyth, D., Ponce, J.: Computer Vision: A Modern Approach. Prentice-Hall (2003)
12. Debevec, P.E., Malik, J.: Recovering high dynamic range radiance maps from photographs. In: SIGGRAPH '97: Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, ACM Press (1997) 369–378
13. Schanz, M., Nitta, C., Bussman, A., Hosticka, B.J., Wertheimer, R.K.: A high-dynamic-range CMOS image sensor for automotive applications. IEEE Journal of Solid-State Circuits 35(7) (2000) 932–938