Foundations of Genetic Programming
Springer-Verlag Berlin Heidelberg GmbH
William B. Langdon Riccardo Poli Foundations of Genetic Programming With 117 Figures and 12 Tables Springer
William B. Langdon Computer Science University College, London Gower Street London, WCIE 6BT UK W.Langdon@cs.uel.ac.uk Riccardo Poli Department of Computer Science The University of Essex Wivenhoe Park Colchester, C04 3SQ UK rpoli@essex.ac.uk Library of Congress Cataloging-in-Publication Data Langdon, W.B. (William B.) Foundations of genetic programming/william B. Langdon, Riccardo Poli. p.cm. Includes bibliographical references and index. ISBN 978-3-642-07632-9 ISBN 978-3-662-04726-2 (ebook) DOI 10.1007/978-3-662-04726-2 1. Genetic programming (Computer science) I. Poli, Riccardo, 1961-1I. Title QA 76.623.L35 2001 006.3' l-dc21 2001049394 ACM Subject Classification (1998): F.1.1, D.1.2-3, G.2.1, G.1.6, G.1.2, E.1, G.3, 1.2.6, 1.2.8, 1.1.1-3 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law. http://www.springer.de Springer-Verlag Berlin Heidelberg 2002 Originally published by Springer-Verlag Berlin Heidelberg New York in 2002 Softcover reprint of the hardcover 1 st edition 2002 The use of general descriptive names, trademarks, etc. in this publicat ion does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Typesetting: Camera-ready by the authors Cover Design: design & production, Heidelberg Printed on acid-free paper SPIN 11395881-45/31l1SR - 5 4 3 21
To Caterina and Ludovico R.P.
Preface Genetic programming (GP) has been highly successful as a technique for getting computers to automatically solve problems without having to tell them explicitly how to do it. Since its inception more than ten years ago genetic programming has been used to solve practical problems but along with this engineering approach there has been interest in how and why it works. This book consolidates this theoretical work. One of the goals of any theoretical work is to better understand the subject. This is useful in its own right and as an aid to designing improvements. We will describe several new genetic operators that arose naturally from theoretical work and suggest modest changes to the way existing GP systems could be used on specific problems to yield improved performance. No doubt these operators and suggestions will be of direct practical interest, even to those who are not interested in "theory" for its own sake. Genetic programming is one of a wide range of evolutionary computation techniques, such as evolutionary strategies and evolutionary programming, being itself a descendent of one of the oldest, Genetic Algorithms (GAs). It is nice to be able to report in this book that theoretical results from the "new boy", GP, can be directly applied to GAs. Since GP is more expressive than GAs, it can be viewed as a generalisation of GAs. In the same way, GP theory is a generalisation of GA theory, although, in fact, some recent advances in GP theory came first and the corresponding GA theory was derived by specialising the more general GP theory. In effect we are getting GA theory for free, from the GP theory. In this way the various strands of evolutionary computation theory are themselves coming together (although convergence is some way off). The title of our book has, itself, a genetic pedigree. Its direct ancestor is a workshop of the same name held at the first Genetic and Evolutionary Computation Conference [Banzhaf et al., 1999], which we organised (together with Una-May O'Reilly, Justinian Rosca and Thomas Haynes) in July 1999, in Orlando. Prior to this (starting in 1990) there has been a long-running series of workshops called Foundations of Genetic Algorithms (FOGA). More generally, the inspiration for "Foundations of Genetic Programming" came from a panel called "The next frontiers of AI: the role of foundations", held at EPIA 1995 [Pinto-Ferreira and Mamede, 1995]. On that occasion Riccardo
VIn Preface put forward the view that the foundations of Artificial Intelligence (AI) are fundamental principles which are common to all disciplines within AI, be they artificial neural networks, evolutionary computation, theorem proving, etc. (see figures on the next page). The common feature of these techniques is search (although the representation being used to express solutions and the search used may be radically different). In our opinion search (be it deterministic or stochastic, complete or incomplete, blind, partially sighted, heuristic, etc.), the related representation, operators and objective functions are the foundations of AI. So Foundations of Genetic Programming should not be viewed only as a collection of techniques that one needs to know in order to be able to do GP well but also as a first attempt to chart and explore the mechanisms and fundamental principles behind genetic programming as a search algorithm. In writing this book we hoped to cast a tiny bit of light onto the theoretical foundations of Artificial Intelligence as a whole. Acknowledgements We would like to thank Andy Singleton, Trevor Fenner, Tom Westerdale, Paul Vitanyi, Peter Nordin, Wolfgang Banzhaf, Nic McPhee, David Fogel, Tom Haynes, Sidney R. Maxwell III, Peter Angeline, Astro Teller, Rafael Bordini, Lee Spector, Lee Altenberg, Jon Rowe, Julian Miller, Xin Yao, Kevin P. Lucas, Martijn Bot, Robert Burbidge, Michael O'Neill, the people of the Chair of Systems Analysis (University of Dortmund), the Centrum voor Wiskunde en Informatica, Amsterdam, University College, London, and the members of the EEBIC group at the University of Birmingham. We would also like to thank Axel Grossmann, Aaron Sloman, Stefano Cagnoni, Jun He, John Woodward, Vj Varma, Tim Kovacs, Marcos Quintana Hernandez and Peter Coxhead for their useful comments on drafts of the book. Finally, we would like to thank numerous anonymous referees of our work over several years for particularly helpful comments and suggestions. October 2001 W.E. Langdon Riccardo Poli
Preface IX Genetic Algorithms Artificial Intelligence can b e seen as a cluster of islands in the sea. Neural etwork Classical Artificial Intelligence Artificial Intelligence can be seen as a cluster of islands in the sea sharing a set of common foundations (cross-sectional view).
Contents 1. Introduction... 1 1.1 Problem Solving as Search............................... 2 1.1.1 Microscopic Dynamical System Models.............. 4 1.1.2 Fitness Landscapes............................... 4 1.1.3 Component Analysis.............................. 6 1.1.4 Schema Theories................................. 7 1.1.5 No Free Lunch Theorems.......................... 8 1.2 What is Genetic Programming?.......................... 9 1.2.1 Tree-based Genetic Programming................... 10 1.2.2 Modular and Multiple Tree Genetic Programming.... 11 1.2.3 Linear Genetic Programming...................... 13 1.2.4 Graphical Genetic Programming................... 14 1.3 Outline of the Book..................................... 15 2. Fitness Landscapes..................... 17 2.1 Exhaustive Search...................................... 17 2.2 Hill Climbing... 17 2.3 Fitness Landscapes as Models of Problem Difficulty......... 19 2.4 An Example GP Fitness Landscape....................... 20 2.5 Other Search Strategies................................. 21 2.6 Difficulties with the Fitness Landscape Metaphor........... 23 2.7 Effect of Representation Changes......................... 25 2.8 Summary... 26 3. Program Component Schema Theories.................... 27 3.1 Price's Selection and Covariance Theorem................. 28 3.1.1 Proof of Price's Theorem.......................... 29 3.1.2 Price's Theorem for Genetic Algorithms............. 31 3.1.3 Price's Theorem with Tournament Selection......... 31 3.1.4 Applicability of Price's Theorem to GAs and GPs.... 32 3.2 Genetic Algorithm Schemata............................. 33 3.3 From GA Schemata to GP Schemata... '...... 35 3.4 Koza's Genetic Programming Schemata... '. 38 3.5 Altenberg's GP Schema Theory... "................ 39
XII Contents 3.6 O'Reilly's Genetic Programming Schemata... 43 3.7 Whigham's Genetic Programming Schemata............... 45 3.8 Summary... 46 4. Pessimistic G P Schema Theories... " 49 4.1 Rosca's Rooted Tree Schemata........................... 49 4.2 Fixed-Size-and-Shape Schemata in GP.................... 51 4.3 Point Mutation and One-Point Crossover in GP............ 56 4.4 Disruption-Survival GP Schema Theorem.................. 60 4.4.1 Effect of Fitness Proportionate Selection... 60 4.4.2 Effect of One-Point Crossover...................... 61 4.4.3 Effect of Point Mutation... 65 4.4.4 GP Fixed-size-and-shape Schema Theorem... 65 4.4.5 Discussion... 66 4.4.6 Early Stages of a GP Run......................... 66 4.4.7 Late Stages of a GP Run.......................... 67 4.4.8 Interpretation... 68 4.5 Summary... 68 5. Exact GP Schema Theorems... 69 5.1 Criticisms of Schema Theorems... 69 5.2 The Role of Schema Creation............................ 71 5.3 Stephens and Waelbroeck's GA Schema Theory............ 73 5.4 GP Hyperschema Theory................................ 74 5.4.1 Theory for Programs of Fixed Size and Shape... 74 5.4.2 Hyperschemata... 77 5.4.3 Microscopic Exact GP Schema Theorem............. 77 5.4.4 Macroscopic Schema Theorem with Schema Creation. 80 5.4.5 Macroscopic Exact GP Schema Theorem............ 82 5.5 Examples... 83 5.5.1 Linear Trees..................................... 83 5.5.2 Comparison of Bounds by Different Schema Theorems 87 5.5.3 Example of Schema Equation for Binary Trees....... 89 5.6 Exact Macroscopic Schema Theorem for GP with Standard Crossover... 89 5.6.1 Cartesian Node Reference Systems... 90 5.6.2 Variable Arity Hyperschema....................... 91 5.6.3 Macroscopic Exact Schema Theorem for GP with Standard Crossover.......................... 92 5.7 Summary... 95 6. Lessons from the GP Schema Theory... 97 6.1 Effective Fitness........................................ 97 6.1.1 Goldberg's Operator-Adjusted Fitness in GAs........ 97 6.1.2 Nordin and Banzhaf's Effective Fitness in GP........ 98
Contents XIII 6.1.3 Stevens and Waelbroeck's Effective Fitness in GAs... 99 6.1.4 Exact Effective Fitness for GP... 100 6.1.5 Understanding GP Phenomena with Effective Fitness. 100 6.2 Operator Biases and Linkage Disequilibrium for Shapes... 105 6.3 Building Blocks in GAs and GP... 107 6.4 Practical Ideas Inspired by Schema Theories... 109 6.5 Convergence, Population Sizing, GP Hardness and Deception 110 6.6 Summary... 111 7. The Genetic Programming Search Space... 113 7.1 Experimental Exploration of GP Search Spaces... 113 7.2 Boolean Program Spaces... 114 7.2.1 NAND Program Spaces... 114 7.2.2 Three-Input Boolean Program Spaces... 119 7.2.3 Six-Input Boolean Program Spaces... 119 7.2.4 Full Trees... 123 7.3 Symbolic Regression... 123 7.3.1 Sextic Polynomial Fitness Function... 124 7.3.2 Sextic Polynomial Fitness Distribution... 124 7.4 Side Effects, Iteration, Mixed Arity: Artificial Ant... 124 7.5 Less Formal Extensions... 127 7.5.1 Automatically Defined Function... 127 7.5.2 Memory... 128 7.5.3 Turing-Complete Programs... 128 7.6 Tree Depth... 129 7.7 Discussion... 130 7.7.1 Random Trees... 130 7.7.2 Genetic Programming and Random Search... 131 7.7.3 Searching Large Programs... 131 7.7.4 Implications for GP... 131 7.8 Conclusions... 132 8. The GP Search Space: Theoretical Analysis... 133 8.1 Long Random Linear Programs... 133 8.1.1 An Illustrative Example... 135 8.1.2 Rate of Convergence and the Threshold... 136 8.1.3 Random Functions... 138 8.1.4 The Chance of Finding a Solution... 139 8.2 Big Random Tree Programs... 139 8.2.1 Setting up the Prooffor Trees... 139 8.2.2 Large Binary Trees... 142 8.2.3 An Illustrative Example... 143 8.2.4 The Chance of Finding a Solution... 144 8.2.5 A Second Illustrative Example... 144 8.3 XOR Program Spaces... 145
XIV Contents 8.3.1 Parity Program Spaces... 145 8.3.2 The Number of Parity Solutions... 146 8.3.3 Parity Problems Landscapes and Building Blocks... 148 8.4 Conclusions... 150 9. Example I: The Artificial Ant... 151 9.1 The Artificial Ant Problem... 151 9.2 Size of Program and Solution Space... 154 9.3 Solution of the Ant Problem... 157 9.3.1 Uniform Random Search... 157 9.3.2 Ramped Half-and-Half Random Search... 157 9.3.3 Comparison with Other Methods... 158 9.4 Fitness Landscape... 158 9.5 Fixed Schema Analysis... 159 9.5.1 Competition Between Programs of Different Sizes... 160 9.5.2 Competition Between Programs of Size 11... 162 9.5.3 Competition Between Programs of Size 12... 163 9.5.4 Competition Between Programs of Size 13... 164 9.6 The Solutions... 167 9.7 Discussion... 168 9.8 Reducing Deception... 170 9.9 Conclusions... 171 10. Example II: The Max Problem... 175 10.1 The MAX Problem... 176 10.2 GP Parameters... 176 10.3 Results... 176 10.3.1 Impact of Depth Restriction on Crossover... 178 10.3.2 Trapping by Suboptimal Solutions... 178 10.3.3 Modelling the Rate of Improvement... 179 10.3.4 Number of Steps to Climb the Hill... 182 10.4 Variety... 183 10.4.1 Variety in the Initial Population... 183 10.4.2 Evolution of Variety... 184 10.4.3 Modelling Variety... 185 10.5 Selection Pressure... 186 10.6 Applying Price's Covariance and Selection Theorem... 189 10.7 Conclusions... 192 11. GP Convergence and Bloat... 193 1l.1 Convergence... 193 1l.2 Bloat... 197 11.2.1 Examples of Bloat... 198 11.2.2 Convergence of Phenotype... 198 1l.2.3 Theories of Bloat... 199
Contents XV 11.2.4 Fitness Variation is Needed for Bloat... 201 11.3 Subquadratic Bloat... 202 11.3.1 Evolution of Program Shapes... 203 11.3.2 Experiments... 206 11.3.3 Results.......................................... 207 11.3.4 Convergence... 211 11.4 Depth and Size Limits... 211 11.5 Discussion... 212 11.6 AntiBloat Techniques... 214 11.7 Conclusions... 216 12. Conclusions... 219 A. Genetic Programming Resources... 223 Bibliography.................................................. 225 List of Special Symbols....................................... 241 Glossary...................................................... 247 Index... 255