Life Sciences and Cyberinfrastructure: a perspective from Indiana University

Dr. Craig A. Stewart
Fulbright Senior Specialist, ZIH, Technische Universität Dresden
Associate Vice President, Research & Academic Computing; Chief Operational Officer, Pervasive Technology Labs
stewart@iu.edu

Outline
Introduction and the situation in Indiana
Some life science successes based on IU cyberinfrastructure
Basic computer science research
IU Cyberinfrastructure
It takes more than just good science: some thoughts about strategy and execution

What is Cyberinfrastructure?
Cyberinfrastructure is a group of high performance computing systems, massive data archives and data resources, visualization systems, advanced instruments, and people, all linked together by high speed networks to accomplish tasks and achieve breakthroughs in understanding that would not otherwise be possible.
What are the life sciences?
Biology, organic chemistry, analytic chemistry, many areas of psychology, some areas of geography and geology, and environmental sciences, but typically not atmospheric sciences or global environmental change.

The situation in Indiana
Indiana's economy is traditionally based on steel and heavy industry
Indiana has been a national leader in personal bankruptcies, mortgage foreclosures, and job losses
Indiana has a strong tradition in life science industries
Since the mid 1990s Indiana has developed a strong presence in Information Technology

Life sciences & Cyberinfrastructure strategy for the State of Indiana
The State of Indiana has set a strategy based on a combination of life sciences and information technology (cyberinfrastructure), supported by the State Government, private industry, public consortia (e.g. Biocrossroads.org), private charitable trusts, and the State's main universities.
Indiana University set a strategic goal in the mid 1990s to become a leader in absolute terms in the development, deployment, and use of information technology
IU has a long tradition in life sciences
Indiana University has most recently created a life science strategic plan, and identified Life Sciences and Information Technology as the two leading goals for the University

Why Life Sciences and Cyberinfrastructure?
Growth of basic scientific data
Possibilities of developing new medical therapies ($!)
Growth area for high performance computing
Need for management of clinical data (Baycol example). If you are taking any medication, you are part of an experiment!
http://www.ncbi.nlm.nih.gov/genbank/genbankstats.html

So why isn't this easy?
A large portion of the High Performance Computing (HPC) community knows relatively little about life sciences
A large portion of the global life sciences research community knows relatively little about HPC
Real need for scientific advancement in theory and in data
In the land of the blind.
So the real challenge for the HPC and life science communities is to learn to work together for mutual benefit and to improve the quality of life of those who pay the bills.
The key for HPC centers: working with real users to deliver real innovation and results

IU in a nutshell
$2B annual budget
One university with 8 campuses; 90,000 students, 3,900 faculty
878 degree programs, including nation's 2nd largest school of medicine
IT Organization:
Vice President for IT & Provost: Michael A. McRobbie
CIO: Bradley C. Wheeler
4 Divisions: Telecommunications, Teaching & Learning, University Information Services, Research and Academic Computing
Several offices: Security, Human Resources, Communications
>$100M/year IT budget

Timeline of key events
1997: IU sets goal to become leader in absolute terms in IT
1997: IU acquires 64 GFLOPS SGI Origin2000, upgrades IBM SP2
1999: $35M grant from Lilly Endowment to create Pervasive Technology Labs
1999: Indiana Governor Frank O'Bannon approves I-Light network connection Purdue, Indiana, Bloomington
2000: $155M grant from Lilly Endowment for Indiana Genomics Initiative
2001: IU announces first US university-owned supercomputer with > TFLOPS peak
2003: IU announces 2.2 TFLOPS distributed Linux cluster, first distributed Linux cluster with > 1 TFLOPS achieved
2003: IU awarded grant to become part of the TeraGrid
2004: $35M grant from Lilly Endowment for METACyt
2005: IU announces strategic plan for Life Sciences, announces 20.4 TFLOPS BladeCenter

Some life science innovations that involve cyberinfrastructure

Information access - Idealized View
Diagram labels: Lab Results; User; Clinical Data; Toxicity Data

Leksell Gamma Knife
60Co sources
Used to treat inoperable tumors
Lance Armstrong was treated in IU's Gamma Knife
Figure labels: Helmet with collimators; Shielding; Beam channel; Treatment couch; Shielding doors; Helmet in treatment position

Gamma Knife
An idealized head model is used for target planning
When treatment fails, the primary problem is thought to be targeting
Solution: use a model of the individual patient's head to plan targeting

Collaborative Initiative on Fetal Alcohol Spectrum Disorder

MutDB

Global Analysis of Arthropod Evolution
Winner, Most geographically distributed application, High Performance Computing Challenge at SC2003.
Created a global grid of computers including 14 systems; 8 types of systems; 6+ vendors; 641 processors; 9 countries, 6 continents (every continent except Antarctica).
Demonstration accomplished computationally intensive evolutionary research

Basic computer science research and cyberinfrastructure

Pervasive Technology Labs
The mission of the Pervasive Technology Labs at Indiana University is to:
perform leading-edge research based on the pervasiveness of information technology in our world, creating new inventions, devices, and software that extend the capabilities of information technology in advanced research and everyday lives;
attract, encourage, educate, and retain the workforce of tomorrow for the State of Indiana and educate the residents of the State generally about the value of advanced technology;
accelerate economic growth in the State of Indiana through the commercialization of new software and inventions; and
develop an income stream through external grant funding and technology transfer revenue that will lead toward self-sustainability for the Labs.
In carrying out its mission, the Pervasive Technology Labs will help Indiana University attain a position of international leadership in information technology research and enhance the prosperity of the entire State of Indiana.

Community Grids Lab
Director: Geoffrey Fox
Focus on Collaboration
Using the network to help groups work together.
Information and Computing Grid
Science Applications: Biocomplexity, Earthquake Science, Fusion Energy

Open Systems Lab
Director: Andrew Lumsdaine
Software for Supercomputers and Science
Tools for building very complex systems.
Open MPI
Boost Graph Library

IU Cyberinfrastructure

Data Capacitor
Short term management of massive amounts of data
Data produced by instruments, data as part of workflows, data stored temporarily, summarized, and thrown away

Basic system statistics
Computational systems
IBM BladeCenter Cluster: 20.4 TFLOPS (512 dual-processor JS21 blades, each with two dual-core PowerPC 970MP processors), 8 GB RAM per blade
IBM p575 cluster: 1.6 TFLOPS (8-way Power5-based nodes, half with 16 GB RAM, half with 32 GB RAM)
1.15 PB spinning disk
650 as SAN supporting research systems: GPFS parallel disk system, Network Attached Storage, AFS Cell
500 TB for Data Capacitor
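
The BladeCenter figures on this slide imply a total core count and a per-core peak rate. The following Python sketch works through that arithmetic using only the numbers quoted on the slide. The function and variable names are illustrative, and the per-core value is a derived estimate, not a figure stated in the talk.

```python
# Figures quoted on the slide for the IBM BladeCenter cluster.
BLADES = 512
PROCESSORS_PER_BLADE = 2      # "dual processor JS21 Blades"
CORES_PER_PROCESSOR = 2       # "dual-core PowerPC 970 MPs"
RAM_GB_PER_BLADE = 8
PEAK_TFLOPS = 20.4


def total_cores(blades: int, procs_per_blade: int, cores_per_proc: int) -> int:
    """Total number of cores across the whole cluster."""
    return blades * procs_per_blade * cores_per_proc


def gflops_per_core(peak_tflops: float, cores: int) -> float:
    """Peak rate per core in GFLOPS, derived from the aggregate peak."""
    return peak_tflops * 1000.0 / cores


cores = total_cores(BLADES, PROCESSORS_PER_BLADE, CORES_PER_PROCESSOR)
per_core_gflops = gflops_per_core(PEAK_TFLOPS, cores)
total_ram_gb = BLADES * RAM_GB_PER_BLADE

print(f"Cores in cluster:      {cores}")
print(f"Peak GFLOPS per core:  {per_core_gflops:.2f}")
print(f"Aggregate RAM (GB):    {total_ram_gb}")
```

The same two functions can be reused for other clusters as long as the node layout and aggregate peak are known. The p575 line on the slide does not give a node count, so it is not computed here.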

Challenges in HPC
Management of the size of the system: reliability and resilience; electrical and cooling facilities
Management of use of the system: levels of parallelism; reliability and resilience (Open MPI); performance (OTF, Vampir NG)

Long term data management
Life science data is essentially irreplaceable
Duplicate copies of archived data kept in Bloomington and Indiana
~ 1 PB of spinning disk overall; currently 2 PB of tape and adding 1 PB per year
Metadata services, management of provenance, and management of availability of data are key areas of focus for us in the future
Data provided via web services will be a key matter as well
Bringing together management and retrieval of massive data stores, and use of that infrastructure in massive simulations, to produce meaningful results and important innovation is a key problem
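
The tape figures on this slide (2 PB currently, adding 1 PB per year) can be turned into a simple capacity projection. The Python sketch below assumes the growth rate stays constant, which the slide does not promise; the function name and the five-year horizon are illustrative choices.

```python
def projected_tape_pb(years_from_now: float,
                      current_pb: float = 2.0,
                      growth_pb_per_year: float = 1.0) -> float:
    """Linear projection of tape holdings in PB.

    Defaults are the figures quoted on the slide; constant growth
    is an assumption made only for this illustration.
    """
    if years_from_now < 0:
        raise ValueError("years_from_now must be non-negative")
    return current_pb + growth_pb_per_year * years_from_now


# Print a small planning table for the next five years.
for years in range(0, 6):
    print(f"Year +{years}: {projected_tape_pb(years):.1f} PB of tape")
```

If the archive keeps duplicate copies at two sites, as the slide describes, whether the quoted totals count one copy or both is not stated, so the projection is left in the slide's own units.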

TeraGrid
Collaborative development of new computer technologies and delivery of new scientific innovations
Linked systems: massive, dynamic

It takes more than just science: some thoughts about strategy and execution

Tech transfer by RAC: John-E-Box
Invented by John N. Huffman, John C. Huffman, & Eric Wernert, IU

Creating the 21st Century Workforce
RAC/UITS: 2004 Grace Hopper Celebration, 2005 Richard Tapia Conference
PTL: outreach to Native Americans, graduate students
PTL & RAC: outreach to lay public (Indianapolis Museum of Art, Indiana State Museum)

Promoting Indiana - SuperComputing Conference
IU and Purdue collaboration on booths starting in 2000
Excellent national attention
Helped build many collaborations, including successful TeraGrid proposal

Research & Academic Computing Strategy Our Work Front Office Back Office Our Objective Reliable Services Co-Creating the Future Researcher Consulting & Education Grant Initiation, Collaboration, Fulfillment Systems Administration Engineering Computing Frontiers Dr. Kate Pilachoski, Professor of Astronomy

Double External Funding by AY10-11 Win Grant $$ Deliver Results Acquire IT & Staff Grant Initiation, Collaboration, Fulfillment Develop Competencies Ever Advancing Frontiers High Performance Computing Mass Research Storage Visualization Networks (Telecom) Consulting (Stat, Linux) Digital Libraries Engineering Computing Frontiers Researcher Consulting & Education Systems Administration RAC Works via Relationships & Technical and Domain Competence

Execution: Funding and Staffing

Some final thoughts on life science and cyberinfrastructure collaborations
Chicken and egg problem, or bank robbery?
There are lots of opportunities open for HPC centers willing to take the effort to cultivate relationships with biologists and biomedical researchers, but we as HPC experts will have to go where the biologists are
There are LOTS of opportunities available for universities willing to commit to information technology and life sciences as joint strategies
The joint IT/Life sciences strategy for IU is working for the State of Indiana
There are many similarities between the strategies of TU-D and IU!

Acknowledgments
Funding for projects described in this talk has come from the National Science Foundation, National Institutes of Health, Lilly Endowment, Inc., and the State of Indiana (particularly through support of the I-Light Initiative and the 21st Century Fund).
The work described here was made possible by the faculty, students, and staff of Indiana University. Thanks especially to the staff of RAC, CPO, Telecommunications, PTL, UITS generally, the participants in the Indiana Genomics Initiative, and the participants in the METACyt Initiative.
Several of the slides and ideas presented here were developed by colleagues or collaborators: the Research and Academic Computing Division of UITS in general, and Dick Repasky in particular.
Stewart's visit to Dresden is funded in part by the Center for the International Exchange of Scholars, the Technical University of Dresden, and Indiana University.
THANKS!

For additional info
rac.uits.indiana.edu/
www.iu.teragrid.org/
Life Sciences Strategic Plan - http://lifesciences.iu.edu/