The Bump in the Road to Exaflops and Rethinking LINPACK

The Bump in the Road to Exaflops and Rethinking LINPACK
Bob Meisner, Director, Office of Advanced Simulation and Computing
[Title-slide photo: the Parker Ranch installation in Hawaii]

Theme
- Actively preparing for an imminent, profound shift in computing architectures by making computing investments NOW in hardware and software
- We can't wait for an initiative to save us, BUT an initiative would enable a comprehensive approach to building an exascale system
- Exascale isn't about an exaflop, but about how effectively we transition to a new era of computing
- A benchmark that correlates better with real applications would help

Episodic disruption defines high-end computing
[Figure: peak flops from megascale through petascale (1E+06 to 1E+18) across the mainframe, vector, distributed memory (MPP), and massively tiered computing eras, spanning machines from Blue Mtn., White, Purple, and ASCI Q to BG/L, Roadrunner, Cielo, and Sequoia; annotations mark pioneering molecular dynamics with 870 atoms (Berni Alder, 1962) through a 9-billion-atom Kelvin-Helmholtz simulation]
Architectural stability has made possible remarkable advances in science. But programming model transitions are tough, and we are approaching one now.

The next disruption is NOW
[Figure: per-thread performance trend; courtesy of Kunle Olukotun, Lance Hammond, Herb Sutter, and Burton Smith, 2004]
The new epoch is forcing us to address issues in several broad areas:
- Exponentially growing parallelism
- Data movement management
- System complexity
- Application code evolution

Continuing to advance computational science will require mastering architectural complexity
Resolution increases have led to critical scientific insights, and further increases are necessary for continued progress. Science requirements drive the need for higher-performance computers, while computational progress depends on successfully transitioning to complex architectures.
[Images: MD plasma simulation, combustion simulation, global climate simulation]

We are in a new era of computing and need to quickly adapt our codes
[Figure: projected code performance (0.1 to 1000 Pflops/s) versus year, showing current code performance stagnating, with the loss caused by insufficient memory and bandwidth]
Unless we take action, our future will be keeping performance from deteriorating rather than improving.

An exascale initiative may be our long-term salvation, but we need a short-term life jacket as well
Scientific simulations must be ready for new architectures. To prepare for the dramatic, impending changes, we are pursuing:
- Partnerships with industry to develop advanced processor, memory, and interconnect technologies
- Investments in software environments and application codes
- Non-Recurring Engineering (NRE) investments
We are investigating a new metric to confirm the performance of high-end computers.

Promoting industry innovation through codesign
- Formed partnerships with multiple companies to accelerate the R&D of critical technologies needed for extreme-scale computing
- Targeted innovative new and/or accelerated R&D of technologies for productization in the 5-10 year timeframe
- $25.4 million focusing on interconnect architectures and implementation approaches
- $62.5 million focusing on processor/memory and storage
- Future investments planned
- NRE is also critical to move vendors in suitable directions

Entering a new episode in HPC: rethinking the community benchmark
We have entered a new era in HPC architectural complexity and need to move beyond High Performance LINPACK (HPL) as a metric.

HPL: Pros
- Easy to run
- Easy to understand
- Easy to check results
- Good tool for community outreach
- Understandable to the outside world
- Historical database of performance information

HPL: Cons
- Has a poor balance of floating point to data movement compared to modern codes (see the rough comparison below)
- Overall usability of a system is not measured
- Used as a marketing tool
- Can require long run times, wasting valuable resources
- Not sensitive to new architectural features
- Does not have sufficient fidelity for procurements
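
To make the first point concrete, here is a back-of-the-envelope sketch (not from the original slides) comparing the arithmetic intensity of an HPL-style dense LU factorization with that of a sparse matrix-vector product of the kind that dominates many modern codes. The function names, the double-precision and CSR storage assumptions, and the 27-nonzeros-per-row figure are illustrative assumptions, not measured values.

    # Rough arithmetic-intensity comparison: dense LU factorization (the HPL kernel)
    # versus a CSR sparse matrix-vector product. Order-of-magnitude estimates only.

    def hpl_intensity(n):
        """Approximate flops per byte for LU factorization of an n x n dense matrix."""
        flops = (2.0 / 3.0) * n**3        # classic LU operation count
        bytes_moved = 8.0 * n * n         # matrix data touched at least once (8 bytes/entry)
        return flops / bytes_moved        # grows roughly linearly with n

    def spmv_intensity(nnz_per_row=27):
        """Approximate flops per byte for one row of a CSR sparse matrix-vector product."""
        flops = 2.0 * nnz_per_row                     # one multiply-add per nonzero
        bytes_moved = nnz_per_row * (8 + 4) + 8 + 8   # values + column indices + x and y traffic
        return flops / bytes_moved                    # stays well below 1 flop/byte

    if __name__ == "__main__":
        print(f"HPL-like dense LU, n=100,000: ~{hpl_intensity(100_000):.0f} flops/byte")
        print(f"27-point sparse matvec:       ~{spmv_intensity():.2f} flops/byte")

The gap of several orders of magnitude is the point: a benchmark dominated by dense floating point rewards very different hardware balance than data-movement-bound application kernels.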

Promote the pros, fix the cons: evolving the community benchmark
- Develop a new metric that correlates with important scientific and technical apps not well represented by HPL
- Replicate the good (enduring) features of HPL
- Replace the outdated features
- Accurately predict rankings for a target suite of scientific applications
- Encourage vendors to focus on architectural features needed for high performance on important scientific and technical apps
- Not intended to define procurements
- PLUS: support a historical record of performance information on existing and future systems

Proposal: HPCG for ranking scientific systems
High Performance Conjugate Gradient (HPCG):
- Solve Ax = b, with A large and sparse, b known, and x computed (a minimal sketch of the iteration follows below)
- Physics-based A matrix
- Contains communication patterns that are prevalent in a variety of methods for the discretization and numerical solution of PDEs
- More relevant patterns of computation: dense and sparse computations, dense and sparse collectives, data-driven parallelism
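
As referenced above, here is a minimal sketch of the conjugate gradient iteration that HPCG is built around. This is not the HPCG reference code: it omits HPCG's 3-D 27-point problem, its symmetric Gauss-Seidel preconditioner, and all of the MPI/OpenMP machinery, and it uses a small 1-D Laplacian as a stand-in matrix. Function and variable names are illustrative.

    import numpy as np
    import scipy.sparse as sp

    def conjugate_gradient(A, b, tol=1e-8, max_iter=10000):
        """Plain (unpreconditioned) CG for a symmetric positive definite A."""
        x = np.zeros_like(b)
        r = b - A @ x                  # initial residual
        p = r.copy()                   # initial search direction
        rs_old = r @ r
        for _ in range(max_iter):
            Ap = A @ p                 # sparse matrix-vector product: the dominant kernel
            alpha = rs_old / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:  # in HPCG, dot products like this are global collectives
                break
            p = r + (rs_new / rs_old) * p
            rs_old = rs_new
        return x

    if __name__ == "__main__":
        n = 1000
        # 1-D Laplacian: a small symmetric positive definite stand-in for HPCG's 3-D operator
        A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
        b = np.ones(n)
        x = conjugate_gradient(A, b)
        print("residual norm:", np.linalg.norm(b - A @ x))

Even in this stripped-down form, the performance profile is dominated by low-intensity sparse matrix-vector products and global reductions, which is exactly the balance HPCG is intended to reward.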

HPC Technical Reports
- Michael A. Heroux (Sandia National Laboratories), Jack Dongarra and Piotr Luszczek (University of Tennessee), "HPCG Technical Specification," Sandia National Laboratories Report SAND2013-8752, October 2013.
- Jack Dongarra and Michael Heroux, "Toward a New Metric for Ranking High Performance Computing Systems."

HPCG results presented at ISC 2014 (June 2014 Top 500 list)

Rank | Site | Computer | Cores | Peak (Pflops) | HPL Rmax (Pflops) | HPCG (Pflops) | HPCG/Rmax
1  | National Super Computer Center in Guangzhou | Tianhe-2, NUDT, Xeon 12C 2.2 GHz + Intel Xeon Phi (57c) + Custom | 3,120,000 | 54.90 | 33.9 | 0.58 | 1.71%
2  | DOE/OS Oak Ridge National Laboratory | Titan, Cray XK7 (16C) + Nvidia Kepler GPU (14c) + Custom | 560,640 | 27.10 | 17.6 | 0.322 | 1.83%
4  | RIKEN Advanced Institute for Computational Science | K computer, Fujitsu SPARC64 VIIIfx (8c) + Custom | 705,024 | 11.30 | 10.5 | 0.426 | 4.06%
5  | DOE/OS Argonne National Laboratory | Mira, BlueGene/Q, Power BQC 16C 1.60 GHz, Custom | 786,432 | 10.10 | 8.59 | 0.101 | 1.18%
6  | Swiss CSCS | Piz Daint, Cray XC30, Xeon 8C + Nvidia Kepler (14c) + Custom | 115,984 | 7.80 | 6.27 | 0.099 | 1.58%
11 | HPC2 | Intel Xeon 10C 2.8 GHz + Nvidia Kepler (14c) + IB | 62,640 | 4.00 | 3 | 0.0489 | 1.63%

- HPCG is real and has been run on several systems
- Performance is consistent with our expectations and experience (the HPCG/Rmax column is recomputed in the short check below)
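
As noted above, the HPCG/Rmax column follows directly from the HPL Rmax and HPCG columns; a short check, with values copied from the table and an illustrative dictionary name:

    # Recompute the HPCG/Rmax column (all values in Pflops, taken from the table above)
    systems = {
        "Tianhe-2":   (33.9, 0.58),
        "Titan":      (17.6, 0.322),
        "K computer": (10.5, 0.426),
        "Mira":       (8.59, 0.101),
        "Piz Daint":  (6.27, 0.099),
        "HPC2":       (3.0,  0.0489),
    }
    for name, (rmax, hpcg) in systems.items():
        print(f"{name:11} HPCG/Rmax = {100 * hpcg / rmax:.2f}%")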

Comments on early HPCG benchmark results
- The disparity between HPL and HPCG is not a surprise; it's a fact of life
- The results reflect the intrinsic nature of many challenging scientific applications: climate, combustion, turbulence, etc.
- These results are typical of the currently available systems for mission-critical applications
- Not all vendors have developed optimized versions

In Summary
- The transition to the next era in high-end computing is going to affect all scientific computer users long before an exaflop system is available
- We need to take a comprehensive approach to next-gen platforms
- We are preparing for the inevitable and significant changes through:
  - Hardware and software codesign efforts
  - Funded collaborations with industry to ensure that exascale-architecture computers will meet our scientific computing needs
  - Application code redesign to address expected processor, memory, and storage changes
- We are investigating new, more informative ways to measure performance

Thank You
Robert E. Meisner
Office of Advanced Simulation and Computing