Research Notes in Neural Computing

Managing Editor: Bart Kosko

Editorial Board: S. Amari, M. A. Arbib, C. von der Malsburg

Advisory Board: Y. Abu-Mostafa, A. G. Barto, E. Bienenstock, J. Cowan, M. Cynader, W. Freeman, G. Gross, U. an der Heiden, M. Hirsch, T. Kohonen, J. W. Moore, L. Optican, A. I. Selverston, R. Shapley, B. Soffer, P. Treleaven, W. von Seelen, B. Widrow, S. Zucker
Yi-Tong Zhou Rama Chellappa Artificial Neural Networks for Computer Vision With 61 Illustrations Springer-Verlag New York Berlin Heidelberg London Paris Tokyo Hong Kong Barcelona Budapest
Yi-Tong Zhou, HNC, Inc., 5501 Oberlin Drive, San Diego, CA 92121, USA

Rama Chellappa, Department of Electrical Engineering, Center for Automation Research and Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA

Managing Editor: Bart Kosko, Department of Electrical Engineering, Signal and Image Processing Institute, University of Southern California, Los Angeles, CA 90089-2564, USA

Library of Congress Cataloging-in-Publication Data
Zhou, Yi-Tong. Artificial neural networks for computer vision / Yi-Tong Zhou, Rama Chellappa. p. cm. - (Research notes in neural computing: v.) Includes bibliographical references and index. ISBN-13: 978-0-387-97683-9 e-ISBN-13: 978-1-4612-2834-9 DOI: 10.1007/978-1-4612-2834-9 1. Neural networks (Computer science) 2. Computer vision. I. Chellappa, Rama. II. Title. III. Series. QA76.87.Z48 1992 006.3-dc20 91-27831

Printed on acid-free paper.

© 1992 Springer-Verlag New York, Inc. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

Production managed by Christin R. Ciresi; manufacturing supervised by Robert Paella. Camera-ready copy provided by the authors.
To my wife, Linghong, and my daughters, May and Daisy. Yi-Tong Zhou To my wife, Vishnu Priya, and my son, Vivek. Rama Chellappa
Preface

This monograph is an outgrowth of the authors' recent research on the development of algorithms for several low-level vision problems using artificial neural networks. Specific problems considered are static and motion stereo, computation of optical flow, and deblurring an image. From a mathematical point of view, these inverse problems are ill-posed according to Hadamard. Researchers in computer vision have taken the "regularization" approach to these problems, where one comes up with an appropriate energy or cost function and finds a minimum. Additional constraints such as smoothness, integrability of surfaces, and preservation of discontinuities are added to the cost function explicitly or implicitly. Depending on the nature of the inversion to be performed and the constraints, the cost function could exhibit several minima. Optimization of such nonconvex functions can be quite involved. Although progress has been made in making techniques such as simulated annealing computationally more reasonable, it is our view that one can often find satisfactory solutions using deterministic optimization algorithms.

In this monograph, we present deterministic optimization algorithms using artificial neural networks of the Amari-Hopfield type for several low-level vision problems. For each one of these problems, we have identified appropriate constrained cost functions. For instance, features based on estimated first derivatives and Gabor wavelets are used for defining the cost function used in stereo matching. A cost function using features based on principal curvatures is defined for computation of optical flow. For the motion stereo problem, the cost function used for static stereo is extended to multiple frames, leading to a recursive solution. Subsequently, the results of minimizing these cost functions are presented for several synthetic and real images.
Practical issues such as the effects of spatial quantization, detection of occluding pixels, choice of window size used for estimation of derivatives, and detection of motion discontinuities are discussed in detail. Thus, the emphasis is on engineering the artificial neural networks for several image-related problems. Some of the algorithms presented in this monograph have already been implemented in VLSI hardware by Professor B. Sheu and his students at the University of Southern California. Optical implementations of image deblurring and stereo matching algorithms have been investigated
by Professor B.K. Jenkins and his students. It is our hope that this monograph serves as a practical book for engineers and scientists interested in exploiting the computational power of artificial neural networks for image processing and computer vision problems.

During the last five years, we have been tremendously inspired and influenced by many of our distinguished colleagues at the University of Southern California. In particular, we would like to thank Professor M.A. Arbib, Director of the Center for Neural Engineering, for his unbridled enthusiasm and leadership in neural-related research activities at USC. We would like to acknowledge the profound influence Professor C. von der Malsburg has had on the work reported here and on subsequent work the second author has done with Dr. B.S. Manjunath. We would also like to thank Professor B.A. Kosko for his dynamic and thought-provoking interactions and his continued encouragement to complete this work. We would like to thank Dean L.M. Silverman, Dr. J.M. Mendel, Dr. A.A. Sawchuk, Dr. B.K. Jenkins, Dr. B. Sheu, Dr. A. Weber, Dr. B.S. Manjunath, Ms. Linda Varilla, and Ms. Delsa Tan for their encouragement, helpful discussions, and assistance. The first author would also like to thank Dr. Robert Hecht-Nielsen, Dr. Robert L. North, Mr. Todd Gutshow, Dr. Robert Means, Mr. Richard Crawshaw, Mr. Chris Platt, and Ms. Sherri Mieth of HNC for their advice, support, and help. Thanks are also due to the editorial staff at Springer-Verlag for their encouragement and patience during the preparation of this monograph. Finally, this work would not have been completed without the support of our families.

The research reported in this monograph was supported by the AFOSR Grant 86-0196 and by the Center for Integration of Optical Computing, which was supported by the AFOSR Contract F-49620-87-C-007 and the AFOSR Grant 90-0133.

San Diego, California    Yi-Tong Zhou
Los Angeles, California    Rama Chellappa
Contents

Preface

1 Introduction
  1.1 Neural Methods
  1.2 Plan of the Book

2 Computational Neural Networks
  2.1 Introduction
  2.2 Amari and Hopfield Networks
  2.3 A Discrete Neural Network for Vision
    2.3.1 A Discrete Network
    2.3.2 Decision Rules
  2.4 Discussion

3 Static Stereo
  3.1 Introduction
  3.2 Depth from Two Views
  3.3 Estimation of Intensity Derivatives
    3.3.1 Fitting Data Using Chebyshev Polynomials
    3.3.2 Analysis of Filter M(y)
    3.3.3 Computational Consideration for the Natural Images
  3.4 Matching Using a Network
  3.5 Experimental Results
    3.5.1 Random Dot Stereograms
    3.5.2 Natural Stereo Images
  3.6 Discussion

4 Motion Stereo-Lateral Motion
  4.1 Introduction
  4.2 Depth from Lateral Motion
  4.3 Estimation of Measurement Primitives
    4.3.1 Estimation of Derivatives
    4.3.2 Estimation of Chamfer Distance Values
  4.4 Batch Approach
    4.4.1 Estimation of Pixel Positions
    4.4.2 Batch Formulation
  4.5 Recursive Approach
  4.6 Matching Error
  4.7 Detection of Occluding Pixels
  4.8 Experimental Results
  4.9 Discussion

5 Motion Stereo-Longitudinal Motion
  5.1 Introduction
  5.2 Depth from Forward Motion
    5.2.1 General Case: Images Are Nonequally Spaced
    5.2.2 Special Case: Images Are Equally Spaced
  5.3 Estimation of the Gabor Features
    5.3.1 Gabor Correlation Operator
    5.3.2 Computational Considerations
  5.4 Neural Network Formulation
  5.5 Experimental Results
  5.6 Discussion

6 Computation of Optical Flow
  6.1 Introduction
  6.2 Estimation of Intensity Values and Principal Curvatures
    6.2.1 Estimation of Polynomial Coefficients
    6.2.2 Computing Principal Curvatures
    6.2.3 Analysis of Filters
  6.3 Neural Network Formulation
    6.3.1 Physiological Considerations
    6.3.2 Computational Considerations
    6.3.3 Computing Flow Field
  6.4 Detection of Motion Discontinuities
  6.5 Multiple Frame Approaches
    6.5.1 Batch Approach
    6.5.2 Recursive Algorithm
    6.5.3 Detection Rules
  6.6 Experimental Results
    6.6.1 Synthetic Image Sequence
    6.6.2 Natural Image Sequence
  6.7 Discussion

7 Image Restoration
  7.1 Introduction
  7.2 An Image Degradation Model
  7.3 Image Representation
  7.4 Estimation of Model Parameters
  7.5 Restoration
  7.6 A Practical Algorithm
  7.7 Computer Simulations
  7.8 Choosing Boundary Values
  7.9 Comparisons to Other Restoration Methods
    7.9.1 Inverse Filter and SVD Pseudoinverse Filter
    7.9.2 MMSE and Modified MMSE Filters
  7.10 Optical Implementation
  7.11 Discussion

8 Conclusions and Future Research
  8.1 Conclusions
  8.2 Future Research

Bibliography

Index