Data Science @ the NIH: What is Happening & What is Coming: A Conversation

University of Massachusetts Medical School
eScholarship@UMMS
University of Massachusetts and New England Area Librarian e-Science Symposium
2015 e-Science Symposium
Apr 9th, 9:15 AM
Data Science @ the NIH: What is Happening & What is Coming: A Conversation
Philip E. Bourne, National Institutes of Health
Follow this and additional works at: http://escholarship.umassmed.edu/escience_symposium
Part of the Public Health Commons, Scholarly Communication Commons, and the Science and Technology Policy Commons
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 License.
Bourne, Philip E., "Data Science @ the NIH: What is Happening & What is Coming: A Conversation" (2015). University of Massachusetts and New England Area Librarian e-Science Symposium. 5.
http://escholarship.umassmed.edu/escience_symposium/2015/program/5
This material is brought to you by eScholarship@UMMS. It has been accepted for inclusion in University of Massachusetts and New England Area Librarian e-Science Symposium by an authorized administrator of eScholarship@UMMS. For more information, please contact Lisa.Palmer@umassmed.edu.

Data Science @ the NIH
What is Happening & What is Coming
A Conversation
Philip E. Bourne, PhD, FACMI
Associate Director for Data Science
National Institutes of Health
March 31, 2015

This is Just the Beginning
Evidence: Google car, 3D printers, Waze, robotics, sensors
From: The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies by Erik Brynjolfsson & Andrew McAfee

Addressing the Opportunities & Challenges
6/12, 2/14, 3/14
Findings:
  Sharing data & software through catalogs
  Support methods and applications development
  Need more training
  Need campus-wide IT strategy
  Hire CSIO
  Continued support throughout the lifecycle

What Have I Learned Thus Far?
  Working with the full spectrum of data types is challenging
  Xtreme translation
  A large ship takes a long time to stop and turn, but a great crew helps
  That crew is in places I was not used to
  There are complexities I could not have imagined going in based on the funding ecosystem

What Have I Learned Thus Far?
  Policies take time when they come from the bottom up, but they may work, i.e. are implemented and adhered to
  Policies from the top down can be problematic
  What you set out to do is often not what you end up doing, e.g. precision medicine, NLM rethink
  This is just the beginning

Additional NIH Disruptors
NLM 15 Year Vision Statement (Personal View)
The National Biomedical Knowledge Portal is a community resource dedicated to the preservation of, and free, open and collaborative access to, the world's biomedical research output.
http://www.plexsci.com/assets/files/1/images/knowledge-management-.jpg

Early Findings
Bad news:
  We do not yet have a data sustainability plan
  Global policies define the why but not the how
  We do not know how all the data we currently have are used
  We need to ramp up training programs in data science
Good news:
  Genuine willingness across the ICs to address the problems
  Global communities are emerging and should be nurtured
  We are beginning to define & quantify the issues, e.g. reproducibility
  Disruptors accelerate change

Office of Biomedical Data Science Mission Statement
To foster an open ecosystem that enables biomedical research to be conducted as a digital enterprise that enhances health, lengthens life and reduces illness and disability & to train the next generation of data scientists.
Goals expanded from recommendations in the June 2012 DIWG and BRWWG reports.

The BD2K Program is Central to the Mission
[Chart: dollar amounts by fiscal year, FY14 to FY21; vertical axis from $0 to $120,000,000; planned shown in black, available shown in green]

Elements of The Digital Enterprise
  Communities
  Policies
  Virtuous Research Cycle
  Infrastructure
Intersection:
  Sustainability
  Efficiency
  Collaboration
  Training

Consider an example

Big Data: The study involved MRI images & GWAS data from over 30,000 people
Collaboration: Data came from many different sites affiliated with the ENIGMA consortium
Methods: To homogenize data from different sites, the group designed standardized protocols for image analysis, quality assessment, genetic imputation, and association
Found five novel genetic variants
Results provided insight into the variability of brain development, and may be applied to the study of neuropsychiatric dysfunction

Policies: Now & Forthcoming
Data Sharing
  Genomic data sharing announced
  Data sharing plans on all research awards
  Data sharing plan enforcement
    Machine readable plan
    Repository requirements to include grant numbers
http://www.nih.gov/news/health/aug2014/od-27.htm

Policies - Forthcoming
Data Citation
  Goal: legitimize data as a form of scholarship
  Process:
    Machine readable standard for data citation (done)
    Endorsement of data citation for inclusion in NIH biosketch, grants, reports, etc.
    Example formats for human readable data citations
    Slowly work into NLM/NCBI workflow
dbGaP in the cloud (soon!)

Infrastructure - The Commons
[Diagram with nodes labeled Labs, BD2K Center, Software, Standards, and DDICC]

The Commons
  Digital Objects (with UIDs)
  Search (indexed metadata)
  Computing Platform
Vivien Bonazzi, George Komatsoulis
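The first two elements on this slide, digital objects with UIDs and search over indexed metadata, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the names DigitalObject and CommonsIndex, the use of UUIDs as UIDs, and the exact-match metadata index are hypothetical and do not describe how the NIH Commons is actually built.

```python
import uuid
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Hypothetical digital object: a location, metadata, and a unique ID."""
    location: str   # where the content lives (illustrative URI)
    metadata: dict  # free-form descriptive metadata
    # Assumption: a random UUID stands in for whatever UID scheme is used.
    uid: str = field(default_factory=lambda: str(uuid.uuid4()))


class CommonsIndex:
    """Hypothetical registry that resolves UIDs and searches indexed metadata."""

    def __init__(self):
        self._objects = {}              # uid -> DigitalObject
        self._index = defaultdict(set)  # (key, lowercased value) -> set of uids

    def register(self, obj):
        """Store the object and index each of its metadata fields."""
        self._objects[obj.uid] = obj
        for key, value in obj.metadata.items():
            self._index[(key, str(value).lower())].add(obj.uid)
        return obj.uid

    def resolve(self, uid):
        """Return the object registered under this UID."""
        return self._objects[uid]

    def search(self, **criteria):
        """Return sorted UIDs whose metadata match every criterion exactly
        (case-insensitive). With no criteria, return an empty list."""
        matches = [
            self._index.get((key, str(value).lower()), set())
            for key, value in criteria.items()
        ]
        if not matches:
            return []
        return sorted(set.intersection(*matches))
```

In this sketch a search such as index.search(type="MRI") returns UIDs rather than the objects themselves, so that a UID remains the single handle passed to resolve or to a compute platform.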

The Commons: Compute Platforms
The Commons Conceptual Framework
Public Cloud Platforms
  Google, AWS (Amazon), Microsoft (Azure), IBM, other?
Super Computing (HPC) Platforms
  Traditionally low access by NIH
Other Platforms?
  In house compute solutions
  Private clouds, HPC
  Pharma
  The Broad
  Bionimbus

[George Komatsoulis] The Commons: Business Model

NIH: Turning Discovery Into Health
philip.bourne@nih.gov