Machine Learning Methods for Ecological Applications

Machine Learning Methods for Ecological Applications

edited by
Alan H. Fielding
Department of Biological Sciences, the Manchester Metropolitan University

SPRINGER SCIENCE+BUSINESS MEDIA, LLC

Library of Congress Cataloging-in-Publication Data

Machine learning methods for ecological applications / edited by Alan H. Fielding.
p. cm.
Includes bibliographical references and indexes.
ISBN 978-1-4613-7413-8
ISBN 978-1-4615-5289-5 (ebook)
DOI 10.1007/978-1-4615-5289-5
1. Ecology. 2. Machine learning. I. Fielding, Alan.
QH540.8.M23 1999
577'.078'5-dc21 99-33708 CIP

Copyright 1999 Springer Science+Business Media New York
Originally published by Kluwer Academic Publishers in 1999
Softcover reprint of the hardcover 1st edition 1999

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher, Springer Science+Business Media, LLC.

Printed on acid-free paper.

Contents

Contributors
Preface
Acknowledgements

1. An introduction to machine learning methods
   ALAN FIELDING
2. Artificial neural networks for pattern recognition
   LYNNE BODDY AND COLIN W. MORRIS
3. Tree-based methods
   JOHN F. BELL
4. Genetic Algorithms I
   JOHN N. R. JEFFERS
5. Genetic Algorithms II
   DAVID R. B. STOCKWELL
6. Cellular automata
   DAVID DUNKERLEY
7. Equation discovery with ecological applications
   SASO DZEROSKI, LJUPCO TODOROVSKI, IVAN BRATKO, BORIS KOMPARE AND VILJEM KRIZMAN
8. How should accuracy be measured?
   ALAN FIELDING
9. Real learning
   BARRY STEVENS-WOOD

Author Index
Subject Index

Contributors

John F. Bell, Examinations Syndicate, University of Cambridge, Cambridge, United Kingdom.

Lynne Boddy, Cardiff School of Biosciences, University of Wales, Cardiff CF10 3TL, United Kingdom.

Ivan Bratko, Jozef Stefan Institute, Jamova 39, 1111 Ljubljana, Slovenia and Faculty of Computer and Information Science, Trzaska 25, 1111 Ljubljana, Slovenia.

David Dunkerley, Department of Geography and Environmental Science, Monash University, Clayton Victoria 3168, Australia.

Saso Dzeroski, Jozef Stefan Institute, Jamova 39, 1111 Ljubljana, Slovenia.

Alan H. Fielding, Behavioural and Environmental Biology Research Group, Biological Sciences, the Manchester Metropolitan University, Manchester, M1 5GD, United Kingdom.

John N. R. Jeffers, Applied Statistics Institute, Mathematics Institute, University of Kent, Canterbury CT2 7NF, Kent, United Kingdom.

Boris Kompare, Faculty of Civil and Geodetic Engineering, Hajdrihova 28, 1001 Ljubljana, Slovenia.

Viljem Krizman, Jozef Stefan Institute, Jamova 39, 1111 Ljubljana, Slovenia.

Colin W. Morris, School of Computing, University of Glamorgan, Trefforest CF37 1DL, United Kingdom.

Ljupco Todorovski, Jozef Stefan Institute, Jamova 39, 1111 Ljubljana, Slovenia and Faculty of Medicine, Institute for Biomedical Informatics, Vrazov trg 2, 1105 Ljubljana, Slovenia.

Barry Stevens-Wood, Behavioural and Environmental Biology Research Group, Biological Sciences, the Manchester Metropolitan University, Manchester, M1 5GD, United Kingdom.

David R. B. Stockwell, University of California San Diego, 9500 Gilman Drive, La Jolla CA 92093, USA.

Preface

It is difficult to become an ecologist without acquiring some breadth. For example, we are expected to be competent statisticians and taxonomists who appreciate the importance of spatial and temporal processes, whilst recognising the potential offered by techniques such as RAPD. It is, therefore, with some trepidation that we offer a collection of potentially useful methods that will be unfamiliar, and possibly alien, to most ecologists.

I don't feel old, but when I was undertaking my postgraduate research our lab calculator was mechanical. There was great excitement in my final year when we obtained an unbelievably expensive electronic calculator. Later I progressed to running 'jobs' on a PRIME minicomputer via a collection of punched cards. Those who complain about the problems with current computers don't know how lucky they are!

In 1984 I wrote a book entitled 'Computing for Biologists'. Although it was mainly concerned with writing short programs, it did also look at wider aspects of the role of computers in the biological sciences. Machine learning was not mentioned in that book, probably because of ignorance but also because the methods were little known outside the relatively small number of workers in the broad field that is now known as machine learning. During 1985 I spent a sabbatical year at York University, following their Biological Computation masters programme. This course was a unique blend of computer science, mathematics and statistics. Although machine learning techniques were beginning to mature, most remained rather esoteric.

At the end of the 1980s I became an associate editor for a new journal that aimed to exploit the interface between computer science and the biosciences. One of my tasks with CABIOS (Computer Applications in the BIOSciences) was to write a regular review of relevant literature. As the journal moved inexorably towards a molecular biology focus I attempted to spread the reviews over a wider range of biological disciplines. I began to notice an increasing use of machine learning methods and I became more interested in their potential for ecological applications.

I had maintained some contacts at York, in particular with David Morse and Marion Edwards. David moved to the computer science laboratory at the University of Kent and we began to discuss the possibility of a parallel journal to CABIOS (now called BioInformatics), but one which focused on ecological issues. Although this journal has not yet come to fruition, it did lead to two significant developments. Firstly, we proposed the establishment of an ecological computing special interest group within the British Ecological Society (BES). Secondly, we made contact with Lynne Boddy and Colin Morris, who had started an innovative journal called Binary.

One of the first meetings held under the auspices of the ecological-computing group, and financed by the BES, was a one day workshop based in Manchester (Machine Learning for Ecological Applications, 19th April 1997). Although this was essentially a meeting of British ecologists, there was a welcome contribution from Ivan Bratko. One outcome from this meeting was a general agreement that the best way to promote these techniques to a wider community was via a book. Fortunately the proposal had enthusiastic support from Bob Carling, then a commissioning editor with Chapman & Hall. Although the gestation was longer than I had hoped, the book was produced quite quickly once the takeover of Chapman & Hall by Kluwer had been finalised.

One of the aims of the book was to provide, as far as possible, a tutorial approach to a range of methods that have great potential for a knowledge-poor discipline such as ecology. I think that we have largely succeeded, but the book's acceptance by other ecologists will be a better guide.

Chapter 1 is an introduction to a range of techniques, issues and examples.
Hopefully, this provides a framework and vocabulary for the rest of the book. Chapter 2, by Professor Lynne Boddy and Dr Colin Morris, is one of the longest in the book and deals with the application of artificial neural networks to ecological problems, especially those related to identification. In Chapter 3 Dr John Bell examines the potential of decision trees, a technique that could justifiably be included in a statistical text. Chapters 4 and 5 both deal with the application of genetic algorithms to ecological problems. The first, written by Professor John Jeffers, examines the application of two rule-based genetic algorithms to a range of ecological problems. Dr David Stockwell, who wrote Chapter 5, has continued his postgraduate research into the application of machine learning techniques to ecological problems, moving from Australia to the United States. He illustrates how genetic algorithms can be used to model the distribution of species. Chapter 6 is the second longest chapter and deals with an approach for which there are no statistical alternatives. Dr David Dunkerley explains how spatial processes can be modelled using small independent cells, or automata, that obey only very simple rules. Chapter 7 is rather different from the rest since it was written by computer scientists who have developed an interest in ecological problems. In this chapter the authors explain how equation discovery tools can be used to obtain models of ecological systems. Chapter 8 deals with methods of assessing classification accuracy. This is not a trivial matter; indeed, in the absence of significance criteria it is fundamental to assessing the utility of many machine learning algorithms.

The final chapter is not concerned with the application of machine learning methods to ecological problems. In the previous chapters we have been considering how methods developed by the machine learning community can benefit ecologists. Chapter 9 attempts to reverse that role by providing an overview of 'real' learning, and perhaps suggesting new avenues that the machine learning community could explore.

Alan Fielding
Manchester, May 1999

Acknowledgements

David Morse, Bob Carling, Des Thompson and Rory Putman were very positive in their support for this book. The British Ecological Society funded the one day workshop that was the precursor to this volume. The workshop was organised under the auspices of the Ecological Computing group of the BES with support from the Biological Sciences department at MMU. The workshop would never have happened without the organisational skills and enthusiasm of Sue King. I would like to thank all of the contributors for the way in which they responded to my requests for chapters. In particular I am grateful that they all produced work within the deadlines, with only gentle persuasion on my part. Finally, I am very grateful for the support and tolerance shown by my family, Sue and Rosie.