Enabling Scientific Breakthroughs at the Petascale


Contents

Breakthroughs in Science
Breakthroughs in Storage
The Impact on Future Storage Systems
A Unique Responsibility

If you think the requirements for enterprise storage systems are growing at a dizzying pace, try these numbers on for size: 25 petabytes of storage in more than 36 cabinets, with more than 17,000 disks capable of delivering more than 1 terabyte per second of I/O bandwidth to more than 25,000 compute nodes. These are the specifications for the Blue Waters supercomputer and Lustre-based parallel storage systems project at the University of Illinois National Center for Supercomputing Applications (NCSA). The massively scalable parallel storage system is designed and being delivered by Cray Inc.

Scalable parallel file systems aren't easy to build. The leaders at NCSA turned to Cray to build the system and come up with an integrated and modular storage solution to meet the staggering requirements of one of the most powerful supercomputers in the world. The HPC storage solution has been deployed rapidly and at an unprecedented scale.

The Blue Waters system will enable scientists all around the U.S. to conduct research with sustained performance they have never been able to achieve before. The research could lead to breakthroughs in areas such as AIDS research, severe weather prediction and many other areas of science that could improve the quality of life all across the planet.
Beyond that, the work done by Cray and the NCSA team to create a new paradigm in storage will eventually influence commercial companies, the technical enterprise and other HPC and big data storage solutions, creating innovations that enable faster, more effective, more consolidated, more scalable and higher-capacity storage systems. These will pave the way for far more powerful and effective HPC deployments (on- and off-premises) and data centers capable of delivering unparalleled levels of business and IT agility.

2012 Cray Inc.

Breakthroughs in Science

Blue Waters is a $208 million project of the National Science Foundation. The goal of the project has been to deliver a supercomputer capable of sustained performance of 1 petaflop for a range of real-world scientific and engineering applications. The sustained petascale computing enabled by Blue Waters is available on only a very few supercomputers in the world, and Blue Waters is the only system of this scale provided by the National Science Foundation.

"The mission of Blue Waters is to support a very diverse and wide range of science at scales and performance levels that are unprecedented," says Bill Kramer, deputy project director for the Blue Waters Project at NCSA. "At the simplest level, the sustained performance enabled by Blue Waters means scientists can get to the right results faster than anything that exists. It will lead to breakthroughs in what is really achievable by the science community."

The power of Blue Waters is made available to science teams all over the country, each of which must go through a rigorous application process to be approved by the NSF. "Most of the resources will go towards about 35 very critical science and engineering projects," Kramer says.

Using Blue Waters, scientists will be able to make new breakthroughs on tough, highly challenging problems such as predicting the behavior of complex biological systems, understanding how the cosmos evolved after the Big Bang, designing new materials at the atomic level, predicting the behavior of hurricanes and tornadoes, and simulating complex engineered systems. The science teams using Blue Waters are expected to produce significant, possibly unprecedented insights using innovative analysis and simulation methods.

The impact of Blue Waters is already being felt.
"We had about a fifth of the machine for a select group of science teams that were part of the early process," says Greg Bauer, technical program manager for the Science and Engineering Application Support Group. "It gave us a three-month window of science at a substantially larger scale."

Projects have involved scientists investigating how a virus, such as HIV or polio, can enter a cell and inject its genetic material through the cell wall, or how new proteins are protected while they are forming from RNA sequences.

Another important early lesson has been the tremendous advance in speed and productivity enabled by Blue Waters. "Some of the science teams have been able to do three years' worth of what they had normally been able to do in just three months, since they got a lot more work done in a much shorter time, which is the working definition of sustained computing," says Michelle Butler, senior technical program manager and head of storage and networking.

"Blue Waters is enabling better science," Kramer adds, noting that scientists can improve clarity and accuracy by running many more models and simulations in a much more condensed period of time. "Scientists are getting a better understanding of how clouds form. They are getting 100 times more resolution than they have been able to get before, so when you look at something like severe weather with a tornado (wind temperature, size of land, amount of rain), they are now doing things you can't study today on any other system."

Breakthroughs in Storage

To deliver these breakthroughs in science, the team at NCSA had to find a technology partner that could deliver comparable breakthroughs in the underlying storage infrastructure, supporting capacity and performance requirements that were also unprecedented in size, scale and scope.

Cray has been able to address all of the challenges and do it within a very tight time frame. Cray received the contract in November 2011. As of September 2012, just 10 months from the signing of the contract, all of the equipment had been delivered and was functioning at scale.

The NCSA team cites several factors as critical to Cray's design, including a systematic approach to the entire solution and Cray's vast experience as the world's leading provider of supercomputer systems and large-scale parallel file systems such as Lustre.

"What Cray has done is design and build a system that is huge," Butler says. "I don't know of any other system with the requirements for scalability that we are putting together for such huge applications. The performance will be above a terabyte per second, which is a measured number on which Blue Waters users will be able to rely. The machine can fill up this file system in a few hours."

Butler says the systematic approach taken by Cray has been critical in deploying a solution that is highly flexible for scientific applications. "We could have easily spent our money just to support floating-point peak operations, but we put a lot of our investment in the storage system and the amount of memory as well as the compute capacity," she says. "One of the aspects of Blue Waters is that it's a very balanced system. We know that the projects approved will be approved for their system science and societal impacts, so we need a balanced system architecture that can address all problems in a very efficient manner. With this system we can move any which way. For example, data-intensive computing or analytics is a rapidly growing area we can support. We aim to be the best in the world at solving problems that are extremely hard: high performance, data coming from anywhere and everywhere at high volumes, data that is not necessarily coordinated, unprecedented simulations."

The system in place is already using thousands of NL-SAS disk drives, the largest of which are 2 TB each. The system also uses solid-state drives for certain portions of the file system. The key is that every aspect of the solution has been built as a single system: all switches, cables, disks and servers are integrated into a single unit. By taking this systematic approach, Cray has been able to reduce power consumption, increase density and deploy the solution cost-effectively.

Another important design feature in the storage solution for Blue Waters is the Lustre distributed file system. Cray's implementation of Lustre provides high levels of scalability, reliability and stability for continuous operations in an open-source, non-proprietary environment.

"I'm not sure we could scale any other file system at this point, and even if it could be done with another approach, it is even more unlikely we could have done it for the same cost profile," Kramer says. "With the Cray Lustre environment we are making that file system highly reliable, highly performant and very usable, not just to be run in a lab but, for instance, it could be used in high-end financial services to predict markets."
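Butler's remark that the machine "can fill up this file system in a few hours" can be sanity-checked against the headline figures quoted earlier (25 petabytes of storage, more than 1 terabyte per second of I/O bandwidth). The short Python sketch below is only a back-of-the-envelope illustration, not anything from NCSA or Cray. It assumes decimal units (1 PB = 1,000 TB), assumes the full 25 PB is written at a sustained 1 TB/s, and uses a hypothetical helper name, fill_time_hours. Since the quoted bandwidth is a lower bound, the result is an upper bound on the fill time.

```python
def fill_time_hours(capacity_pb: float, bandwidth_tb_per_s: float) -> float:
    """Hours needed to write capacity_pb petabytes at a sustained
    rate of bandwidth_tb_per_s terabytes per second."""
    if bandwidth_tb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    tb_per_pb = 1000  # assumption: decimal units, 1 PB = 1,000 TB
    seconds = capacity_pb * tb_per_pb / bandwidth_tb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # Figures quoted in the paper: 25 PB of storage, >1 TB/s of bandwidth.
    print(f"{fill_time_hours(25, 1.0):.1f} hours")  # prints "6.9 hours"
```

Under these assumptions the estimate is roughly seven hours, which is consistent with "a few hours".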

The Impact on Future Storage Systems

The innovations created by Cray in building the Blue Waters storage solution are already making their way into commercial products, and many more of the breakthroughs will eventually move out of the scientific community and into enterprise environments. The concept of building an integrated system for scale, performance and cost efficiency could lead to new approaches for large-scale environments. And, as Kramer notes, Cray's Lustre innovations could have a huge impact on systems of all sizes and scales.

Other key lessons and innovations from Cray and Blue Waters are coming from a variety of areas, ranging from backup and archiving, to managing both online and nearline storage in an integrated manner, to overall management of storage systems of tremendous size, scope and scalability.

"One of the file systems we have on the floor is two petabytes," Butler says. "We are planning to back that up, but I've never backed up two petabytes. We have to learn how we are going to scale our processes to that size of machine. As other companies go to systems of that size, we'll have broken ground on how to back up, restore and deliver performance. There will be a lot of proving ground that everyone can take advantage of: How do you do quota management at this size; how do you eliminate I/O wait; how do you get a file out of one machine and onto another cache or nearline machine. A lot of these management functions will be in place in a few years."

Butler also pointed to archiving as a new challenge at such a large scale. "Our nearline subsystem will be on the order of 380 to 500 petabytes, which is accessible to any science team within moments," she says. "I know of two other archives in the 50 to 100 petabyte range, but nowhere near this, so it's another really large deployment where we have to learn about how to manage it, how to scale it and how to manage and assess risks. Also, writing something into this environment, when you are writing data to a single tape, takes a really long time."

For tape archiving, the Blue Waters team developed a new technology called RAIT (redundant array of indexed tapes). Kramer says this approach results in improved protection of data at a more reasonable cost than other solutions. "This can have a big impact on cost of ownership," he says.

Working with Cray to address storage challenges has been extremely important to the NCSA team. "Storage has become more and more important in scientific computing and it has increasingly become a bottleneck in recent years," Kramer says. "We've found that about 20 to 25 percent of teams have been limited by how they are doing file management and data I/O. In areas such as traditional simulation and data collection, we are seeing that they are now coming together and have to be treated simultaneously. It's a whole new area for data-driven, data-intensive work. Blue Waters will be able to enable that."

A Unique Responsibility

For the team at Blue Waters and the team at Cray, building these kinds of innovative advances in computing technology is more than a challenge: it's a unique responsibility. Some of the most important science projects taking place anywhere in the world depend on Blue Waters and its underlying storage system to do work that can have a huge impact for generations to come.

"My reason for doing this work versus other things I could do is that we have a lot of motivation doing things that are good for quality of life, that impact science, that may even save lives through more accurate predicting of weather, for example," Kramer says. "We can have an indirect influence on better aircraft design, or how the AIDS virus affects cells, handling the spread of disease, or perhaps developing a new drug. Something like Blue Waters can have an overall effect of making life better for lots of people."

For Blue Waters to have that kind of impact, the underlying storage had to be rock solid, enabling levels of performance, capacity, scale and scope that have broken and continue to break new ground. By breaking down the challenge and taking an integrated, systematic approach, Cray has been able to achieve the breakthroughs needed to enable new levels in computer storage.