Reproducible Research in Computational Science

Reproducible Research in Computational Science
IPOL, a Research Journal for Image Processing Algorithms and Software

Facultad de Ingeniería, Universidad de la República
Montevideo, UY, April 11th, 2013

Nicolas Limare
CMLA, ENS Cachan, FR

Image Processing On Line (IPOL)
http://www.ipol.im/

A Researcher's Story

Let's do research on...

A Researcher's Story

Let's do research on... maté

Alejo2083@wikipedia, ZooFari@wikipedia

Road to HPMDS (High-Perf Maté Dynamics Simulation)

- Review past research and state-of-the-art theories and methods
- Create new measurement tools, models and simulation software
- Compare with existing works
- Present, publish
- Drink high-performance maté

Error 404

How do you compute??

Marcin Wichary

Ask the author?

- Code not available
  - secret, lost
- Code not usable
  - binary-only, not for your OS, obsolete
- Code not compilable
  - won't debug 2000 obscure lines
- Code not meant to be read by others
- Not the exact version used for the article

Rewrite?

- Might take some time: days, weeks, months...
- You won't get much credit for this work
- Not everything is explained in the article
- No way to verify that the implementation is correct

"[...] software is the specification for how the software is supposed to work. Anything less [...] doesn't really tell you anything about how it's ultimately going to behave. And that just makes software really, really hard."
Douglas Crockford

No Software

- Cannot verify
- Cannot reuse
- Cannot reproduce
- Cannot extend
- Cannot compare
- Cannot do science

Beyond Maté Simulations

Sometimes more than missing code:

- Misleading performance reports
- Manipulated figures
- 2000 retractions in biomedical research, 43% for fraud
- Clinical trials based on wrong assumptions
- Climategate
- Public policies based on wrong expectations

References:

- David Bailey, "Twelve ways to fool the masses when giving performance results on parallel computers", Supercomputing Review (1991).
- Nicholas Wade, "It May Look Authentic; Here's How to Tell It Isn't", The New York Times (2006).
- Ferric C. Fang, R. Grant Steen, and Arturo Casadevall, "Misconduct accounts for the majority of retracted scientific publications", PNAS (2012). http://dx.doi.org/10.1073/pnas.1212247109
- Kevin R. Coombes, Jing Wang, and Keith A. Baggerly, "Irreproducibility of NCI60 Predictors of Chemotherapy", http://bioinformatics.mdanderson.org/supplements/reprorsch-chemo/
- Bill Chameides, "Climategate Redux", Scientific American (2010).

But...

We can make better science.
We are trying with IPOL.

Scientific Method

Scientific Method

1200 ~ 1800
Roger Bacon, Francis Bacon, Galileo Galilei, Robert Boyle, René Descartes, ...

Science needs to be reproduced.

Reproducible Research?

Research is reproducible if other researchers can independently obtain the same results from the published material.

- Theoretical scientists share demonstrations
- Experimental scientists share procedures
- Computational scientists (usually) share no software, no full description, no data

cf. Claerbout 1992, Donoho 1995, Stodden 201X, Vandewalle 201X

Sfoster83@wikipedia, Madprime@wikipedia

Reproducible (Computational) Research

1990 ~
Jon Claerbout, David Donoho, Serguei Fomel, Randy LeVeque, David Bailey, Victoria Stodden, Juliana Freire, ...

The science is in the software, data and process.

"An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures."
D. Donoho

Computation Everywhere

- particle physics
- fluid dynamics
- econometrics
- signal processing
- quantum chemistry
- LIDAR archeology
- MRI analysis
- climate & weather
- geophysics

CERN, rreis@flickr, rafael grompone, info-nftk@flickr, mohapj@flickr, mario stefanutti, argonne@flickr

ScienceCodeManifesto.org

- Code: All source code written specifically to process data for a published paper must be available to the reviewers and readers of the paper.
- Copyright: The copyright ownership and license of any released source code must be clearly stated.
- Citation: Researchers who use or adapt science source code in their research must credit the code's creators in resulting publications.
- Credit: Software contributions must be included in systems of scientific assessment, credit, and recognition.
- Curation: Source code must remain available, linked to related materials, for the useful lifetime of the publication.

Why not Share the Code?

- Code not ready for public view
  - no time/motivation to clean up, simplify and document
- Prevent incorrect use
  - documentation and explanations again
- Keep competitive advantage
  - better not publish at all?

Revisit Objectives of Publishing Articles

KEY: lure researchers into sharing their code
vs Impact Factor

[Image: rare picture of a utopian community in the act of sharing their research code. freeclipartnow.com]

Revisit Objectives of Publishing Articles

[Diagram: Researcher and Community linked by "Publish Article" / "Cite" for traditional research articles, and by "Publish Code" / "Cite" for source code]

Step 1: make the code a publication by itself

Revisit Objectives of Publishing Articles

[Diagram: Researcher and Community linked by "Publish Article" / "Cite" for traditional research articles, and by "Publish Code" / "Cite" for source code]

Step 2: guide the community to use and cite the code

IPOL

IPOL

IPOL is a research journal of image processing and image analysis. Each article contains a text describing an algorithm and source code, with an online demonstration facility and an archive of online experiments. The text and source code are peer-reviewed and the demonstration is controlled. IPOL follows the Open Access and Reproducible Research models.

article = manuscript + software

IPOL

IPOL is a research journal of image processing and image analysis. Each article contains a text describing an algorithm and source code, with an online demonstration facility and an archive of online experiments. The text and source code are peer-reviewed and the demonstration is controlled. IPOL follows the Open Access and Reproducible Research models.

article = manuscript + software (+ demo + archive)

Publishing Software

IPOL wants to provide reference implementations of image processing algorithms.

For every article, the implementation

- is reviewed and published under a GPL/BSD license
- can be tested online in real time on free data

Everything is online, free, reusable.
http://ipol.im/

Reviewing Software

Software is reviewed like a manuscript:

- manually, by selected reviewers
- must match the description of the algorithm
- must follow editorial guidelines for correctness, portability, documentation, style

This is already a lot to ask of image processing researchers.

IPOL

- Not a prototype, publishing since 2010
- Research journal, self-published
- ISSN, DOI, editorial policy and int'l board, indexed
- Partnership with SIAM journal for dual articles
- IPOL publishes algorithms, not software; code is here to provide all details to study the algorithm
- IPOL exists because we need it and no other journal did it

Reproducible Research Initiatives

Journals

- Science requires that all data and code be available to any reader
- Math Programming Computation requires the code
- Biostatistics stamps reproducible articles
- JMLR publishes software
- Geophysics has some software guidelines
- Source Code for Biology and Medicine publishes software, Journal of Open Research Software will too
- Computing in Science and Engineering reviews software
- MetaJournals publish articles about software and data

More Reproducible Research Initiatives

Publishers

- SIAM updated its supp. material policies to include software
- ACM reformed its supp. material copyright policy
- Elsevier experiments with executable papers and post-pdf

Tools and Services

- RunMyCode hosts executable research software
- FLOSShub, mloss/mldata host software
- DataDryad, Figshare host data

Conferences and Workshops

- ICERM Workshop, Dec. 2012
- SINTEF Winter School, Jan. 2013
- SIAM CiSE13 Conference track, Feb. 2013
- NYU Workshops, May 2013

IPOL Article: Manuscript+Software

- Manuscript: description and study of an algorithm
- Software: complete and documented implementation

IPOL Article: +Demo

- Manuscript: description and study of an algorithm
- Software: complete and documented implementation
- Demo: universal www interface, test and explore

IPOL Article: +Archive

- Manuscript: description and study of an algorithm
- Software: complete and documented implementation
- Demo: universal www interface, test and explore
- Archive: shared test data

Activity

- 40 articles published with code and demo since 2011
- 25 articles under review, 10+ public preprints
- 100+ citations (cf. Google Scholar)

In 2012:

- 125,000 visits
- 13,000 code/data downloads
- 50,000 demo runs, 30,000 on original data (archived)

Results

- Reference implementations of algorithms
- Verifiable claims on performance and results
- Algorithms described and analyzed
- Algorithms improved by mass-testing
- Implementations improved by review

More than reproducible. Reusable and open.

Challenges

- Still the work of a small community
  - join and spread the word
  - spin-off to other research areas (next: audio)
- Competition from less stringent journals and conferences
  - they can evolve by peer-review pressure
- Reusable is more complex than reproducible
  - software project derived from IPOL
- Conservative community habits
  - must learn to cite software, article PDF
- Substantial effort to prepare good code
  - computation at the center of computational science curricula
  - templates? other ideas?

Collaboration

Work funded by and in collaboration with: [partner logos]

New participants are welcome!

Follow-up to...

- http://ipol.im/
- edit@ipol.im
- discuss@list.ipol.im
- @IPOL_journal

and also

- http://stodden.net/
- http://reproducibleresearch.net/
- http://www.runmycode.org/

interested in

- more authors, editors, reviewers, readers and users
- productive relations with new researchers
- assistance and collaboration to new similar projects