Philosophy of data-intensive science. Sabina Leonelli, Department of Sociology, Philosophy and Anthropology & Egenis, University of Exeter

Data-intensive science: A new paradigm?
- New technologies for the production, storage and dissemination of data: computing power is seen as transforming how science is done, but no coherent and systematic assessment of such transformation to date
- How is science changing to take advantage of digital technologies for data dissemination? With which implications?
- Long history of data collection and sharing in science: what is new today, and how do these practices differ from other forms of scientific inquiry?
- What can and cannot be learnt from big data, and how?
- Can science be data-driven?
- How can the quality, relevance and reliability of data be assessed?

Possible epistemic drawbacks: fertile terrain for philosophical investigation
- Information overload versus interpretation and synthesis: what are we actually learning from big data?
- Issues with standards: can they be trusted? Who develops them and how?
- Danger of conservatism (available data are favoured)
- Issues with quality controls and peer review (burden for peers, unclear status of data in the process, reproducibility)
- New opportunities for fraud (bad data, digital manipulation of evidence, plagiarism)

How are data actually disseminated and re-used? Focus on data journeys
- Understanding how data are actually circulated and used is key to understanding what counts as scientific knowledge in the digital era, including what counts as evidence, theory and experiment
- Data travel requires work, including significant conceptual and material scaffolding that then affects further research; need for intelligent ways to make data open (Royal Society 2012)
- Understanding contexts / domains in which data acquire evidential value is crucial
- Hence: focus on use of online databases to make data travel

My Empirical Work
Methods - Empirically grounded philosophy of science: following the data, archival research, interviews, policy engagement on open science and collaboration with curator and user communities
Focus - Model organism research: bringing together various types of data on the same organism [e.g. community databases]; increasingly serving also cross-species and translational research
Leonelli, S. (2013) Integrating Data to Acquire New Knowledge: Three Modes of Integration in Plant Science. Studies in the History and Philosophy of the Biological and Biomedical Sciences.
Leonelli, S. and Ankeny, R.A. (2012) Re-Thinking Organisms: The Epistemic Impact of Databases on Model Organism Biology. Studies in the History and Philosophy of the Biological and Biomedical Sciences.
Leonelli, S. (2010) Packaging Data for Re-Use: Databases in Model Organism Biology. In Howlett, P. and Morgan, M.S. (eds) How Well Do Facts Travel? The Dissemination of Reliable Knowledge. Cambridge University Press.
Key difficulty in these areas: pluralism (no centralisation of experiments and data formats)

Model Organism Databases: Defining Standards for Collection, Dissemination and Interpretation of Data on Organisms

"our goal is to provide the common vocabulary, visualisation tools, and information retrieval mechanisms that permit integration of all knowledge about Arabidopsis into a seamless whole that can be queried from any perspective"

The Gene Ontology
Bio-ontologies: formal representations of areas of knowledge in which the essential terms are combined with structuring rules that describe the relationship between the terms. Knowledge that is structured in a bio-ontology can then be linked to the molecular databases.
- Precisely defined, descriptive terms
- Precisely defined relations among terms
- Association of terms with datasets
Result: network of interdependent claims about phenomena
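The three ingredients listed on this slide (defined terms, defined relations among terms, and associations of terms with datasets) can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the class names, the term identifiers `T1` to `T3`, the `is_a` relation label and the dataset names are all hypothetical illustrations, not the actual Gene Ontology schema or its identifiers.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Term:
    term_id: str      # hypothetical identifier, not a real Gene Ontology ID
    name: str         # descriptive term
    definition: str   # precise definition of the term


@dataclass
class Ontology:
    terms: dict = field(default_factory=dict)        # term_id -> Term
    relations: set = field(default_factory=set)      # (child_id, relation, parent_id)
    annotations: dict = field(default_factory=dict)  # term_id -> set of dataset ids

    def add_term(self, term):
        self.terms[term.term_id] = term

    def relate(self, child_id, relation, parent_id):
        # Relations are only allowed between terms that have been defined.
        if child_id not in self.terms or parent_id not in self.terms:
            raise KeyError("both terms must be defined before relating them")
        self.relations.add((child_id, relation, parent_id))

    def annotate(self, term_id, dataset_id):
        # Associate a dataset with a defined term.
        if term_id not in self.terms:
            raise KeyError("term must be defined before annotating it")
        self.annotations.setdefault(term_id, set()).add(dataset_id)

    def ancestors(self, term_id):
        # Every term reachable by following relations from child to parent.
        seen = set()
        stack = [term_id]
        while stack:
            current = stack.pop()
            for child, _relation, parent in self.relations:
                if child == current and parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def datasets_for(self, term_id):
        # Datasets annotated to the term itself or to any more specific term.
        result = set()
        for annotated_id, datasets in self.annotations.items():
            if annotated_id == term_id or term_id in self.ancestors(annotated_id):
                result |= datasets
        return result


# Illustrative content only: names, definitions and datasets are made up.
onto = Ontology()
onto.add_term(Term("T1", "biological process", "illustrative root term"))
onto.add_term(Term("T2", "response to stimulus", "illustrative child of T1"))
onto.add_term(Term("T3", "response to light", "illustrative child of T2"))
onto.relate("T2", "is_a", "T1")
onto.relate("T3", "is_a", "T2")
onto.annotate("T3", "dataset_A")
onto.annotate("T2", "dataset_B")
```

In this sketch, querying a general term also returns datasets annotated to its more specific terms, which is one simple way to read the "network of interdependent claims" mentioned above: how a term is defined and related determines which data are retrieved as relevant to it.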

Transforming data into knowledge: Stages of data journeys
(1) De-contextualisation: making data travel across research contexts
- [Temporary] separation of data from information about their provenance. This requires adequate standards and guidelines for data formatting.
(2) Re-contextualisation: assessing data quality and reliability
- Meta-data: adding information about provenance enables re-contextualisation of data production
- Efficient meta-data presuppose reliable reference to material specimens (e.g. strains in stock centres), experimental protocols, instruments and calibration techniques
(3) Re-use: using data towards discovery
- No simple induction / automated reasoning: data interpretation involves reference to theories embedded in specific practices
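The first two stages can be illustrated with a toy record whose provenance is temporarily detached and later reattached. This is a minimal sketch, assuming a hypothetical record layout: the field names (`specimen`, `protocol`, `instrument`), the functions and the sample values are invented for illustration and do not describe any particular database. Stage (3) is deliberately not automated here, in line with the point that re-use involves interpretation; the code only checks a precondition for it.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Provenance:
    specimen: str    # e.g. a strain held in a stock centre
    protocol: str    # experimental protocol used
    instrument: str  # instrument and calibration information


@dataclass(frozen=True)
class DataRecord:
    record_id: str
    values: Tuple[float, ...]
    provenance: Optional[Provenance] = None


def decontextualise(record: DataRecord, store: Dict[str, Provenance]) -> DataRecord:
    # Stage 1: keep the provenance in a separate store, let the data travel without it.
    if record.provenance is not None:
        store[record.record_id] = record.provenance
    return replace(record, provenance=None)


def recontextualise(record: DataRecord, store: Dict[str, Provenance]) -> DataRecord:
    # Stage 2: reattach provenance meta-data, if any was kept for this record.
    return replace(record, provenance=store.get(record.record_id))


def has_context_for_reuse(record: DataRecord) -> bool:
    # Precondition for stage 3 only: interpretation itself is not automated.
    return record.provenance is not None


# Illustrative values only.
store: Dict[str, Provenance] = {}
original = DataRecord(
    "r1",
    (0.4, 0.7),
    Provenance("hypothetical strain X", "hypothetical protocol P", "hypothetical instrument I"),
)
travelling = decontextualise(original, store)
restored = recontextualise(travelling, store)
```

The separation is temporary only as long as the store keeps a reliable link back to the record; a record whose provenance was never stored comes back from `recontextualise` without context and fails the re-use check.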

Results
What counts as good data in model organism biology?
- Depends on experimental standards
- Serious disagreements and diversity across subfields
Data classification as a theory-making activity (e.g. bio-ontologies)
Understanding of data re-use (feeding into policy discussions of Open Science)
Reconceptualising the organism
Reconceptualising knowledge production: Comparison of alternative ways to organise data is key to further understanding and exploration of significance of data
Questioning the reach of the Fourth Paradigm

Key Publications
Leonelli, S. (accepted) Data Interpretation in the Digital Age. Perspectives on Science.
Leonelli, S. (2013) Integrating Data to Acquire New Knowledge: Three Modes of Integration in Plant Science. Studies in the History and Philosophy of the Biological and Biomedical Sciences: Part C. Online First.
Leonelli, S. (2012) Classificatory Theory in Biology. Biological Theory, 7(1). Online First.
Leonelli, S. (2012) Classificatory Theory in Data-Intensive Science: The Case of Open Biomedical Ontologies. International Studies in the Philosophy of Science 26(1): 47-65.
Leonelli, S. (2012) When Humans Are the Exception: Cross-Species Databases at the Interface of Clinical and Biological Research. Social Studies of Science 42(2): 214-236.
Leonelli, S. (2012) Making Sense of Data-Driven Research in the Biological and the Biomedical Sciences. Studies in the History and Philosophy of the Biological and Biomedical Sciences 43(1): 1-3.
Leonelli, S. and Ankeny, R.A. (2012) Re-Thinking Organisms: The Epistemic Impact of Databases on Model Organism Biology. Studies in the History and Philosophy of the Biological and Biomedical Sciences 43(1): 29-36.
Leonelli, S., Diehl, A.D., Christie, K.R., Harris, M.A. and Lomax, J. (2011) How the Gene Ontology Evolves. BMC Bioinformatics, 12:325 (tagged "highly accessed").
Ankeny, R.A. and Leonelli, S. (2011) Bioethics Authorship in Context: How Trends in Biomedicine Challenge Bioethics. The American Journal of Bioethics, 11(10): 22-24.
Bastow, R. and Leonelli, S. (2010) Sustainable digital infrastructure. EMBO Reports, 11(10): 730-735.
Leonelli, S. (2010) Machine Science: The Human Side. Science, 330 (6002): 317.
Leonelli, S. (2010) Documenting the Emergence of Bio-Ontologies: Or, Why Researching Bioinformatics Requires HPSSB. History and Philosophy of the Life Sciences, 32, 1: 105-126.
Leonelli, S. (2010) Packaging Data for Re-Use: Databases in Model Organism Biology. In Howlett, P. and Morgan, M.S. (eds) How Well Do Facts Travel? The Dissemination of Reliable Knowledge. Cambridge University Press, pp. 325-348.
Leonelli, S. (2010) The Commodification of Knowledge Exchange: Governing the Circulation of Biological Data. In: Radder, H. (ed) The Commodification of Academic Research: Science and the Modern University. Pittsburgh UP, pp. 132-157.
Leonelli, S. (2009) Centralising Labels to Distribute Data: The Regulatory Role of Genomic Consortia. In Atkinson, P., Glasner, P. and Lock, M. (eds) The Handbook for Genetics and Society: Mapping the New Genomic Era. Routledge, pp. 469-485.
Leonelli, S. (2009) On the Locality of Data and Claims About Phenomena. Philosophy of Science, 76, 5: 737-749.
Leonelli, S. (2008) Bio-Ontologies as Tools for Integration in Biology. Biological Theory, 3, 1: 8-11.