Big Data, Little Data, or No Data? iSchools, Scholarship, and Stewardship


Big Data, Little Data, or No Data? iSchools, Scholarship, and Stewardship
Christine L. Borgman
Distinguished Professor & Presidential Chair in Information Studies
Director, Center for Knowledge Infrastructures
https://knowledgeinfrastructures.gseis.ucla.edu
University of California, Los Angeles
http://christineborgman.info
@scitechprof
Inaugural iSchool Lecture, Linnaeus University, Växjö, Sweden, 7 May 2018
MIT Press, 2015

Theme issue Celebrating 350 years of Philosophical Transactions: life sciences papers compiled and edited by Linda Partridge 19 April 2015; volume 370, issue 1666

Data

Data sharing policies
- European Union
- U.S. Federal research policy
- Research Councils of the UK
- Australian Research Council
- Individual countries, funding agencies, journals, universities

Precondition: Researchers share data

Big Data
http://www.datameer.com/product/hadoop.html

What are data?
Images and sources:
- Marie Curie's notebook
- aip.org
- Pisa Griffin
- hudsonalpha.org
- http://www.census.gov/population/cen2000/map02.gif
- ncl.ucar.edu
- http://onlineqda.hud.ac.uk/intro_qda/examples_of_qualitative_data.php

Data are representations of observations, objects, or other entities used as evidence of phenomena for the purposes of research or scholarship.
Source: C.L. Borgman (2015). Big Data, Little Data, No Data: Scholarship in the Networked World. MIT Press.
Image: http://www.genome.gov/dmd/img.cfm?node=photos/graphics&id=85327

Research process
- Models and theories
- Research questions
- Methods
- Domain expertise
- Practices, protocols
- Data sources
- Instruments, software
- Infrastructure
Commons photo: Science Gossip, 1894

Telescope for the Sloan Digital Sky Survey, Apache Point, New Mexico


Center for Embedded Networked Sensing
- NSF Science & Technology Center, 2002-2012
- 5 universities, plus partners
- 300 members
- Computer science and engineering
- Science application areas
Slide by Jason Fisher, UC-Merced, Center for Embedded Networked Sensing (CENS)

Science <-> Data
Engineering researcher: "Temperature is temperature."
Biologist: "There are hundreds of ways to measure temperature. 'The temperature is 98' is low-value compared to, 'the temperature of the surface, measured by the infrared thermopile, model number XYZ, is 98.' That means it is measuring a proxy for a temperature, rather than being in contact with a probe, and it is measuring from a distance. The accuracy is plus or minus .05 of a degree. I [also] want to know that it was taken outside versus inside a controlled environment, how long it had been in place, and the last time it was calibrated, which might tell me whether it has drifted."
Image: CENS Robotics team
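The biologist's point can be made concrete with a small sketch: a reading is far more useful when it carries the context needed to interpret it. The Python below is illustrative only. The field names, the `missing_context` helper, and the sample values (deployment length, calibration date) are assumptions invented for this example, not a CENS schema; only the kinds of context come from the quotation above.

```python
from typing import Any, Dict, List

# Context the biologist asks for in the quotation. The field names are
# hypothetical labels chosen for this sketch, not an established standard.
REQUIRED_CONTEXT = (
    "measured_property",    # e.g. surface temperature (a proxy)
    "instrument_type",      # e.g. infrared thermopile
    "instrument_model",     # e.g. "XYZ"
    "contact_measurement",  # probe in contact, or measured from a distance
    "accuracy",             # plus or minus, in the reading's units
    "environment",          # outside vs. inside a controlled environment
    "days_in_place",        # how long the sensor had been deployed
    "last_calibrated",      # date of last calibration (drift check)
)


def missing_context(reading: Dict[str, Any]) -> List[str]:
    """Return the context fields that are absent from a reading."""
    return [name for name in REQUIRED_CONTEXT if reading.get(name) is None]


# "The temperature is 98": a value with no interpretive context.
bare_reading = {"value": 98}

# The richer statement from the quotation, with invented placeholder
# values for the details the quotation mentions but does not specify.
documented_reading = {
    "value": 98,
    "measured_property": "surface temperature",
    "instrument_type": "infrared thermopile",
    "instrument_model": "XYZ",
    "contact_measurement": False,
    "accuracy": 0.05,
    "environment": "outside",
    "days_in_place": 30,              # assumed for illustration
    "last_calibrated": "2018-01-15",  # assumed for illustration
}

print(missing_context(bare_reading))
print(missing_context(documented_reading))
```

The bare reading reports every context field as missing, while the documented reading reports none, which is one simple way a repository could flag low-value records at deposit time.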

http://vcg.isti.cnr.it/griffin/
Arte islamica, ippogrifo, XI sec 03, own work

Publications
http://www.cse.psu.edu/hpcl/images/publications.jpg

Grey Literature
- Learning management systems
- University ID cards: library, health, recreation, dorms, food service, transportation
- Academic personnel dossiers
- Staff surveys
- Sensor networks
- Security cameras
- Network traffic
- Street traffic
- Bus traffic
- Reports
- Working papers
- Conference papers
- Preprints
- Patents
- Datasets
- Audio
- Video
- Slides
- Posters
- Codebooks
- Course syllabi
- Proposals
- Memos
http://www.greynet.org/

Grey Data
- Student applications
- Registrar records
- Learning management systems
- University ID cards: library, health, recreation, dorms, food service, transportation
- Academic personnel dossiers
- Regulation and compliance data
- Staff surveys
- Sensor networks
- Security cameras
- Network traffic
- Street traffic
Source: Borgman, C. L. (2018). Open Data, Grey Data, and Stewardship: Universities at the Privacy Frontier. Berkeley Technology Law Journal, 33(2). https://arxiv.org/abs/1802.02953
Images:
- Stuart Miles: FreeDigitalPhotos.net
- https://www.linkedin.com/pulse/hipaa-privacy-rulecompliance-understanding-new-rules-syed-najaf
- http://www.aetc.af.mil/news/article-display/article/559551/think-before-sending-protecting-pii/

Networks of data
http://humannaturelab.net/wp-content/uploads/2015/01/fig1-no-text-village-2-only-selection.png

Publications <-> Data: Role
Publications are arguments made by authors, and data are the evidence used to support the arguments.
Source: C.L. Borgman (2015). Big Data, Little Data, No Data: Scholarship in the Networked World. MIT Press.

Publications <-> Data: Mapping
- Article 1, Article 2, Article 3, Article 4, Article n
- Dataset time 1
- Dataset time 2
- Observation time 1
- Visualization time 3
- Community collection 1
- Repository 1

Publications <-> Data: Attribution
- Publications
  - Independent units
  - Authorship is negotiated
- Data
  - Compound objects
  - Ownership is rarely clear
- Attribution
  - Long term responsibility: Investigators
  - Expertise for interpretation: Data collectors and analysts
Image: http://www.genome.gov/dmd/img.cfm?node=photos/graphics&id=85327

Data citation and analytics
- Credit
- Attribution
- Discovery

Bibliometrics, Scientometrics, Informetrics, Webometrics
- Ohm, P. (2010). Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization. UCLA Law Review, 57, 1701.
- Borgman, C. L. (2015). Big Data, Little Data, No Data: Scholarship in the Networked World. Cambridge MA: MIT Press.

Bibliographic styles 1797 unique styles (27 Feb 2018)

Altmetrics
Published July 23, 2013; screenshot Feb 27, 2018

Bibliometrics by Source
Searches for author: Christine Borgman, Christine L. Borgman, CL Borgman (excluding other C Borgman authors) on July 28, 2014 and February 25, 2016 for Google Scholar, Web of Science, Scopus. UCLA cancelled Scopus subscription by 2016.

| Source | Publications 2014 | Publications 2016 | Citations received 2014 | Citations received 2016 | H-index 2014 | H-index 2016 |
|---|---|---|---|---|---|---|
| Google Scholar (Google) | 380 | 443 | 7766 | 9701 | 39 | 43 |
| Web of Science (Thomson-Reuters) | 145 | 150 | 1629 | 1967 | 20 | 23 |
| Scopus, July 2014 (Elsevier) | 77 | n/a | 1314 | n/a | 14 (after 1995) | n/a |
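For readers unfamiliar with the H-index reported for each source, the following is a minimal Python sketch of the standard definition: the largest h such that h publications each have at least h citations. The function name and the sample citation counts are made up for illustration; they are not the per-paper counts behind the figures on this slide.

```python
from typing import Iterable


def h_index(citations: Iterable[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break     # counts only decrease from here, so stop
    return h


# Hypothetical citation counts for five papers (illustrative only).
sample = [10, 8, 5, 4, 3]
print(h_index(sample))
```

Because each source indexes a different set of publications and citations, the same author's list of counts, and therefore the resulting index, differs from one source to the next.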

Attributing responsibility
- Legal responsibility
  - Licensed data
  - Specific attribution required
- Scholarly credit: contributorship
  - Author of data
  - Contributor of data to this publication
  - Colleague who shared data
  - Software developer
  - Data collector
  - Instrument builder
  - Data curator
  - Data manager
  - Data scientist
  - Field site staff
  - Data calibration
  - Data analysis, visualization
  - Funding source
  - Data repository
  - Lab director
  - Principal investigator
  - University research office
  - Research subjects
  - Research workers, e.g., citizen science
Source: For Attribution -- Developing Data Attribution and Citation Practices and Standards: Summary of an International Workshop. Washington, D.C.: The National Academies Press. 2012.

Discovery and Interpretation
- Identify the form and content
- Identify related objects
- Interpret
- Evaluate
- Open
- Read
- Compute upon
- Reuse
- Combine
- Describe
- Annotate
Photo by @kissane; presentation by Jason Scott (@textfiles)

Identity and persistence
- Identity
  - Identifiers
    - DOI, Handles
    - URI, PURL
  - Naming and namespaces
    - Authors/creators: ORCID, ISNI, VIAF
    - Generic/specific: registry number
- Description
  - Self-describing
  - Metadata augmentation
- Persistence
  - Perishable
  - Long-lived
  - Permanent
Image: http://web-interviewquestions.blogspot.com/2010_06_21_archive.html
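As one illustration of how author identifiers support reliable naming, the sketch below validates the check character of an ORCID iD. ORCID iDs are 16 characters whose last character is a checksum computed with the ISO 7064 MOD 11-2 algorithm. The function names are chosen for this example, and the sample iD 0000-0002-1825-0097 is the fictitious researcher iD used in ORCID's own documentation; nothing here comes from the talk itself.

```python
DIGITS = "0123456789"


def orcid_check_character(first_15_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for an ORCID iD."""
    total = 0
    for ch in first_15_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A result of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, character set, and checksum of an ORCID iD."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16:
        return False
    body, check = compact[:15], compact[15]
    if any(ch not in DIGITS for ch in body):
        return False
    return orcid_check_character(body) == check


print(is_valid_orcid("0000-0002-1825-0097"))  # documentation sample iD
print(is_valid_orcid("0000-0002-1825-0098"))  # last character altered
```

A checksum only catches transcription errors; it does not confirm that the iD is registered or that it belongs to the named author, which requires a lookup against the registry.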

Intellectual property
- What can I do with this object?
- What rights are associated?
  - Reuse
  - Reproduce
  - Attribute
- Who owns the rights?
- How open are data?
  - Open data
  - Open bibliography
Image: http://pzwart.wdka.hro.nl/mdr/research/lliang/mdr/mdr_images/opencontent.jpg/

Information and Autonomy Privacy
Source: UCOP Privacy and Information Security Initiative. (2013). http://ucop.edu/privacy-initiative/

Data Stewardship: The Ideal
- https://wwwdb.inf.tu-dresden.de/opendatasurvey/
- Wilkinson, et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, http://dx.doi.org/10.1038/sdata.2016.18

Data Stewardship: the Reality
Images and sources:
- http://www.datamartist.com/data-migration-part-1-introduction-to-the-data-migration-delema
- Getty Research Institute
- Mount Wilson Solar Observatory, 2017
- Graduate students: http://gsa.rice.edu/
- http://www.information-age.com/cloudcomputing-pharmaceutical-industry-123462676/
- Post-doctoral fellows: https://med.nyu.edu/our-community/lifenyu-school-medicine/life-postdoc
- NASA, Cape Canaveral: http://www.loc.gov/pictures/resource/hhh.fl083.photos.319101p/

Data
"If you can't protect it, don't collect it." (privacy and security aphorism)
Therefore: If you collect it, you must protect it.

Protect Data and Privacy
- http://democracyos.eu/blog/open-by-design
- https://wwwdb.inf.tu-dresden.de/opendatasurvey/
- Wilkinson, et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, http://dx.doi.org/10.1038/sdata.2016.18
- https://privacybydesign.foundation/en/

Protect Data and Privacy
- https://github.com/okulbilisim/awesome-datascience
- The DCC Curation Lifecycle Model, www.dcc.ac.uk, info@dcc.ac.uk

Promote Responsible Data Practices
- Respect information and autonomy privacy
- Open data: release and reuse
- Data collection and use
- Data management
- Collaborations
- Publications
- Community
  - Faculty
  - Librarians
  - Staff
  - Students
  - External partners
- Joint governance process
Images:
- http://www.berkeley.edu/utility/jobs
- https://www.universityofcalifornia.edu/subject/term/technology-engineering
- http://gsa.rice.edu/
- https://www.commondreams.org/views/2014/09/20/corporations-your-diet
- http://volunteer.ucla.edu/wp-content/uploads/2011/09/volunteer_day_2011-unionrescue-prv.jpg

Scholarship and Stewardship in Practice
- Mission-driven stewardship
  - Research
  - Teaching
  - Services
- Steward the scholarly record
  - Integrated workflows
  - Version of record
  - Record of versions (Van de Sompel)
- Support discovery at scale
  - Human readable
  - Machine readable
  - Lawyer readable
- Sustain trust of community
  - Privacy: information, autonomy
  - Academic freedom
  - Stewardship and governance

Acknowledgements
UCLA Center for Knowledge Infrastructures
- Christine Borgman
- Peter Darch
- Irene Pasquetto
- Bernie Boscoe
- Michael Scroggins
- Milena Golshan

UC Leadership in Data Policy
- We must maximally enable the mission of the University by supporting the values of academic and intellectual freedom.
- We must be good stewards of the information entrusted to the University.
- We must ensure that the University has access to information resources for legitimate business purposes.
- We must have a University community with clear expectations of privacy: both privileges and obligations of individuals and of the institution.
- We must make decisions within an institutional context.
- We must acknowledge the distributed nature of information stewardship at UC, where responsibility for privacy and information security resides at every level.
Source: UCOP Privacy and Information Security Initiative. (2013). http://ucop.edu/privacy-initiative/