RAPS ECMWF. RAPS Chairman. 20th ORAP Forum Slide 1


RAPS George.Mozdzynski@ecmwf.int RAPS Chairman 20th ORAP Forum Slide 1

20th ORAP Forum Slide 2

What is RAPS?
Real Applications on Parallel Systems
- European Software Initiative
- RAPS Consortium (founded early 90s)
- Working group of hardware vendors
- Programming model (MPI + F90/F95 + OpenMP)
The partners of the RAPS Consortium develop portable parallel versions of their production codes, which are made available to a Working Group of Hardware Vendors for benchmarking and testing.
20th ORAP Forum Slide 3

RAPS Consortium
- CCLRC, Daresbury
- CSCS, Lugano
- DWD, Offenbach
- DKRZ, Hamburg
- ECMWF, Reading
- Fraunhofer SCAI/ITWM
- UK Met Office, Exeter
- MPI-M, Hamburg
- METEO-FRANCE, Toulouse
- NERC, UK
20th ORAP Forum Slide 4

Working Group of Hardware Vendors
- Bull
- Cray
- Fujitsu
- Hitachi
- HP
- IBM
- INTEL
- Linux Networx
- NEC
- SGI
- SUN
20th ORAP Forum Slide 5

Why RAPS
- Portability of codes (F90/F95, C/C++, MPI, OpenMP)
- Availability of benchmark codes ahead of a formal procurement
- Some influence on standardization (PARMACS -> MPI): vendors needed to support F90 + MPI to run benchmarks
- Information exchange: 20 meetings held to date
20th ORAP Forum Slide 6

RAPS process
- RAPS benchmarks distributed by individual organizations
- No official membership required for vendors
- Vendors approach (NDA) individual orgs for benchmarks
- Meetings (once a year)
  - Every 2 years as part of Use of HPC in Meteorology workshop
  - http://www.ecmwf.int/newsevents/meetings/workshops/2006/high_performance_computing-12th/index.html
- Next meeting 21/22 June 2007, UPMC, Jussieu, Paris
  - Contact: Marie-Alice.Foujols@ipsl.jussieu.fr
20th ORAP Forum Slide 7

RAPS benchmarks
Today
- IFS (RAPS9)
- DWD LM_RAPS
- Met Office UM
Commitment to produce up-to-date benchmarks reflecting key operational applications of consortium members
20th ORAP Forum Slide 8

RAPS Future
- To seek new members outside of the meteorological community
- Exchange experiences with other communities
- Fortran Standards
  - Fortran 2003
  - HPCS compiler?
20th ORAP Forum Slide 9

ECMWF: Supporting States and Co-operation
Supporting States: Belgium, Ireland, Portugal, Denmark, Italy, Switzerland, Germany, Luxembourg, Finland, Spain, The Netherlands, Sweden, France, Norway, Turkey, Greece, Austria, United Kingdom
Co-operation agreements or working arrangements with:
- Czech Republic, Croatia, Estonia, Hungary, Iceland, Lithuania, Morocco, Romania, Serbia, Slovenia
- ACMAD, ESA, EUMETSAT, WMO, JRC, CTBTO, CLRTAP
20th ORAP Forum Slide 10

| | Phase3 | Phase4 |
|---|---|---|
| Clusters | hpcc & hpcd | hpce & hpcf |
| System | IBM p690+ | IBM p575+ |
| Processor | Power4+ 1.9 GHz | Power5+ 1.9 GHz, with SMT |
| Peak | 7.6 Gflops per PE | 7.6 Gflops per PE |
| Sustained | ~0.5 Gflops per PE | ~1 Gflops per PE |
| PEs per cluster | 2176 | 2240 |
| PEs per node | 32 | 16 (3*Mem BW per PE) |
Same Federation Switch
20th ORAP Forum Slide 11

History of RAPS benchmark
[Figure: Gflop/s (log scale, 1 to 10000) versus number of processors (log scale, 100 to 10000). Series: CRAY T3E-1200, 1998, RAPS-4, T213 L31; IBM p690+, 2004, RAPS-8, T799 L91; IBM p575+, 2006, RAPS-9, T799 L91.]
20th ORAP Forum Slide 12

T1279 16km
NGPTOT = 2,140,704
TSTEP = 450 secs
Flops for 10-day forecast = 7.207*10^15
[Figure: T1279 map plot]
20th ORAP Forum Slide 13

Comparison of Resolutions
| Resolution | T1279 L91 | T799 L91 | T399 L62 |
|---|---|---|---|
| Grid spacing | 16km | 25km | 50km |
| Number of grid-points | 2,140,704 | 843,490 | 213,988 |
| Time-step | 450 secs | 720 secs | 1800 secs |
| Flops for 10-day forecast | 7.207*10^15 | 1.615*10^15 | 0.1013*10^15 |
EPS * 50
20th ORAP Forum Slide 14
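The figures in the comparison can be related to each other with a little arithmetic. The sketch below takes the time-steps and flop counts from the slide and derives the number of time-steps in a 10-day forecast and the relative cost of two resolutions. The names `RESOLUTIONS`, `steps_per_forecast` and `cost_ratio` are illustrative and not part of IFS or RAPS.

```python
# Figures copied from the "Comparison of Resolutions" slide.
RESOLUTIONS = {
    "T1279 L91": {"grid_points": 2_140_704, "time_step_s": 450, "flops": 7.207e15},
    "T799 L91": {"grid_points": 843_490, "time_step_s": 720, "flops": 1.615e15},
    "T399 L62": {"grid_points": 213_988, "time_step_s": 1800, "flops": 0.1013e15},
}

FORECAST_SECONDS = 10 * 24 * 3600  # length of a 10-day forecast


def steps_per_forecast(time_step_s):
    """Number of model time-steps needed to cover a 10-day forecast."""
    return FORECAST_SECONDS // time_step_s


def cost_ratio(name_a, name_b):
    """Ratio of total flops for a 10-day forecast at two resolutions."""
    return RESOLUTIONS[name_a]["flops"] / RESOLUTIONS[name_b]["flops"]


if __name__ == "__main__":
    for name, res in RESOLUTIONS.items():
        steps = steps_per_forecast(res["time_step_s"])
        print(f"{name}: {steps} steps, {res['flops'] / steps:.3e} flops per step")
    ratio = cost_ratio("T1279 L91", "T799 L91")
    print(f"T1279 L91 / T799 L91 cost ratio: {ratio:.2f}")
```

The cost ratio comes out larger than the ratio of grid-point counts alone, because the finer grid also needs a shorter time-step.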

RAPS9 10-day T799 L91 Forecast
[Figure: percentage of peak (0 to 16) versus number of PEs (0 to 2500). Series: hpce T1279, hpce T799, hpcd T799.]
20th ORAP Forum Slide 15

RAPS9 T799 L91 10-day Forecast on hpce
[Figure: speed-up (0 to 6) versus number of PEs (0 to 2500). Series: Ideal, T799, T1279. Labelled points at 384, 768, 1024, 1536, 2048 and 2240 PEs.]
20th ORAP Forum Slide 16

RAPS9 T799 L91 10-day Forecast on Cray XT3 at ORNL
[Figure: speed-up (0 to 7) versus number of PEs (0 to 6000). Series: Ideal, T799. Labelled points at 960, 1920, 3072, 3940, 4800, 5200 and 6144 PEs.]
20th ORAP Forum Slide 17

RAPS9 - T799 L91 10-day forecast
OpenMP threads / MPI task on 96 Nodes
[Figure: percentage of peak (4 to 11) versus number of threads per MPI task (1 to 8). Series: SMT, No SMT. Annotation: threads used for operations.]
20th ORAP Forum Slide 18

RAPS9 - 10-day forecasts
Message passing comms on hpce
| Resolution | Nodes | MPI x OMP | WALL (secs) | %Comms (barrier) | Tflop/s | % of peak |
|---|---|---|---|---|---|---|
| T799 L91 | 24 Nodes | 96 x 8 | 4253 | 8.0% | 0.38 | 13.0% |
| T1279 L91 | 96 Nodes | 384 x 8 | 4836 | 11.5% | 1.61 | 12.8% |
| T799 L91 | 140 Nodes | 560 x 8 | 995 | 18.9% | 1.60 | 9.4% |
| T1279 L91 | 140 Nodes | 560 x 8 | 3506 | 13.8% | 2.05 | 12.1% |
20th ORAP Forum Slide 19
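As a rough cross-check of how the Tflop/s and % of peak columns relate to the wall time, the sketch below divides the flop count for a 10-day forecast (from the resolution comparison slide) by the wall time, and compares the result with the cluster peak. It assumes 16 PEs per hpce node and 7.6 Gflops peak per PE, as quoted on the hardware slide; the function names are illustrative. Values derived this way may differ slightly from the slide, whose own figures remain the reference.

```python
PEAK_GFLOPS_PER_PE = 7.6  # peak per PE, from the hpce hardware slide
PES_PER_NODE = 16         # PEs per p575+ node on hpce


def sustained_tflops(total_flops, wall_seconds):
    """Sustained rate in Tflop/s for a run of the given length."""
    return total_flops / wall_seconds / 1e12


def percent_of_peak(tflops, nodes):
    """Sustained rate as a percentage of the peak of the nodes used."""
    peak_tflops = nodes * PES_PER_NODE * PEAK_GFLOPS_PER_PE / 1e3
    return 100.0 * tflops / peak_tflops


if __name__ == "__main__":
    # (label, flops for a 10-day forecast, wall seconds, nodes)
    runs = [
        ("T799 L91 on 24 nodes", 1.615e15, 4253, 24),
        ("T1279 L91 on 140 nodes", 7.207e15, 3506, 140),
    ]
    for label, flops, wall, nodes in runs:
        rate = sustained_tflops(flops, wall)
        print(f"{label}: {rate:.2f} Tflop/s, {percent_of_peak(rate, nodes):.1f}% of peak")
```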

Ensemble forecasts of hurricane Katrina
[Figure: two panels, forecasts from 12UTC Thursday 25 Aug 2005 and from 12UTC Friday 26 Aug 2005. Legend: ensemble member; high res. forecast. Shading: probability that Katrina would pass within 120km.]
20th ORAP Forum Slide 20

Integrated Forecasting System (IFS)
IFS 1992 - today
- Collaboration between Meteo France and ECMWF
- Source ~ 1.8 million lines
- Fortran 95, some C
- Good performance on scalar and vector systems
IFS model characteristics:
- Spectral
- Semi-implicit
- Semi-Lagrangian
20th ORAP Forum Slide 21

IFS - Parallelised using mixed MPI and OpenMP
MPI communications
- Transpositions
  - Between grid point, Fourier and spectral spaces
- Wide halo exchange
  - Semi-Lagrangian method
  - Radiation grid interpolation
- Long messages
- Typically MPI_ISEND/RECV/WAITALL or collective
OpenMP
- Shared memory nodes
- Memory efficient
- Use 4/8 threads
20th ORAP Forum Slide 22

TL799, 1024 tasks, 2D partitioning
2D partitioning results in a non-optimal semi-Lagrangian comms requirement at the poles and equator!
Square shaped partitions are better than rectangular shaped partitions.
20th ORAP Forum Slide 23
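The point about square versus rectangular partitions can be illustrated on an idealised regular grid. The sketch below counts the halo points surrounding a rectangular block of grid points for a given halo depth. The function name and the halo depth of 4 are assumptions chosen for illustration; the real IFS works on a reduced grid with much wider semi-Lagrangian halos.

```python
def halo_points(width, height, halo):
    """Points in a halo of the given depth around a width x height block."""
    return (width + 2 * halo) * (height + 2 * halo) - width * height


if __name__ == "__main__":
    halo = 4  # hypothetical halo depth in grid points
    # Two partitions holding the same number of grid points (1024).
    for width, height in [(32, 32), (128, 8)]:
        points = halo_points(width, height, halo)
        print(f"{width} x {height}: {points} halo points")
```

For the same number of owned points, the long thin block needs about twice as many halo points as the square one in this example, which is the communication penalty the slide refers to.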

Model / Radiation Grids
Radiation computations are expensive. To reduce this cost we:
- Run radiation computations every hour (every 5th timestep for the TL799 model)
- Run radiation computations on a coarser grid (TL399), which requires interpolation
Two interpolation possibilities:
- Gather global fields to different tasks (non-scalable): global comms is bad; # fields can be less than # tasks
- Perform interpolation with only local comms for halo (scalable): implemented in IFS this way
20th ORAP Forum Slide 24
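A back-of-envelope view of the two savings, using the TL799 time-step and the grid-point counts quoted on the resolution comparison slide. The sketch assumes, purely for illustration, that radiation cost scales with the number of grid points times the number of calls; interpolation and communication costs are ignored. The function names are hypothetical.

```python
MODEL_TIME_STEP_S = 720       # TL799 time-step
MODEL_GRID_POINTS = 843_490   # TL799 grid points
RADIATION_GRID_POINTS = 213_988  # TL399 grid points
RADIATION_EVERY_N_STEPS = 5


def radiation_interval_s(time_step_s, every_n_steps):
    """Seconds of model time between two radiation calls."""
    return time_step_s * every_n_steps


def relative_radiation_cost(model_points, radiation_points, every_n_steps):
    """Cost relative to full-grid radiation on every step (assumed linear in points)."""
    return (radiation_points / model_points) / every_n_steps


if __name__ == "__main__":
    interval = radiation_interval_s(MODEL_TIME_STEP_S, RADIATION_EVERY_N_STEPS)
    cost = relative_radiation_cost(
        MODEL_GRID_POINTS, RADIATION_GRID_POINTS, RADIATION_EVERY_N_STEPS
    )
    print(f"radiation called every {interval} s of model time")
    print(f"relative radiation cost: {cost:.3f}")
```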

Reduced grids (linear)
TL399 (note: only factors 2, 3 and 5 for Fourier transforms)
&NAMRGRI
NRGRI(1)= 18, NRGRI(2)= 25, NRGRI(3)= 36, NRGRI(4)= 40, NRGRI(5)= 45,
NRGRI(6)= 50, NRGRI(7)= 60, NRGRI(8)= 64, NRGRI(9)= 72, NRGRI(10)= 72,
NRGRI(11)= 75, NRGRI(12)= 81, NRGRI(13)= 90, NRGRI(14)= 96,...
NRGRI(200)= 800,...
NRGRI(398)= 36, NRGRI(399)= 25, NRGRI(400)= 18,
/
[Figure: T799 model grid (blue), T399 radiation grid (red)]
20th ORAP Forum Slide 25
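The constraint that each latitude row length has only the factors 2, 3 and 5 is easy to check. The sketch below tests the NRGRI values visible on the slide and shows how a candidate row length could be rounded up to the next admissible value. Both function names are illustrative; this is not how IFS generates its reduced grids.

```python
def has_only_factors_2_3_5(n):
    """True if n is a positive integer whose only prime factors are 2, 3 and 5."""
    if n < 1:
        return False
    for prime in (2, 3, 5):
        while n % prime == 0:
            n //= prime
    return n == 1


def next_fft_friendly(n):
    """Smallest value >= n that has only the factors 2, 3 and 5."""
    while not has_only_factors_2_3_5(n):
        n += 1
    return n


if __name__ == "__main__":
    # Row lengths shown on the slide (the elided entries are not reproduced).
    shown = [18, 25, 36, 40, 45, 50, 60, 64, 72, 72, 75, 81, 90, 96, 800]
    print(all(has_only_factors_2_3_5(n) for n in shown))
    print(next_fft_friendly(97))
```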

[Figure: two panels for PE=293. Radiation grid TL255; model grid TL511.]
Model and radiation grids for the same partition are offset geographically, because:
- Use of reduced grid (linear)
- TL255 is not a projection of TL511
Long thin partitions make matters worse
20th ORAP Forum Slide 26

eq_regions algorithm 20th ORAP Forum Slide 27

Why eq_regions?
- eq_regions partitioning is broadly similar to existing IFS 2D partitioning
  - 2D A-Sets similar to eq_regions bands
  - 2D partitioning good for a regular lat-lon grid
  - eq_regions partitioning more suited to a reduced grid
- Only one new data structure required
  - N_REGIONS
- Code changes straightforward (example follows)
- eq_regions partitioning works for any number of tasks, not just task numbers that have nice factors
20th ORAP Forum Slide 28

Other partitioning approaches: e.g. quadrangles
- Difficult to implement in IFS (but not impossible).
- Nothing in common with 2D partitioning approach.
C. Lemaire / J.C. Weill, March 23 2000, "Partitioning the sphere with constant area quadrangles", 12th Canadian Conference on Computational Geometry
20th ORAP Forum Slide 29

2D partitioning T799 1024 tasks (NS=32 x EW=32) 20th ORAP Forum Slide 30

eq_regions partitioning T799 1024 tasks
N_REGIONS( 1)= 1
N_REGIONS( 2)= 7
N_REGIONS( 3)= 13
N_REGIONS( 4)= 19
N_REGIONS( 5)= 25
N_REGIONS( 6)= 31
N_REGIONS( 7)= 35
N_REGIONS( 8)= 41
N_REGIONS( 9)= 45
N_REGIONS(10)= 48
N_REGIONS(11)= 52
N_REGIONS(12)= 54
N_REGIONS(13)= 56
N_REGIONS(14)= 56
N_REGIONS(15)= 58
N_REGIONS(16)= 56
N_REGIONS(17)= 56
N_REGIONS(18)= 54
N_REGIONS(19)= 52
N_REGIONS(20)= 48
N_REGIONS(21)= 45
N_REGIONS(22)= 41
N_REGIONS(23)= 35
N_REGIONS(24)= 31
N_REGIONS(25)= 25
N_REGIONS(26)= 19
N_REGIONS(27)= 13
N_REGIONS(28)= 7
N_REGIONS(29)= 1
20th ORAP Forum Slide 31
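The band counts listed above look like a zonal equal-area partition of the sphere: one region at each pole and a series of latitude collars, each split into a whole number of equal-area regions. The following is a minimal sketch under that assumption. The slides do not show the algorithm itself, so the construction below (polar cap sized to one region, collars of roughly square regions, rounding with a carried remainder) and the function name `eq_region_counts` are assumptions, not the IFS source.

```python
import math


def eq_region_counts(n_tasks):
    """Regions per latitude band, north to south, for an equal-area partition."""
    if n_tasks < 1:
        raise ValueError("n_tasks must be positive")
    if n_tasks <= 2:
        return [1] * n_tasks

    region_area = 4.0 * math.pi / n_tasks  # unit sphere split into equal areas

    def cap_area(colatitude):
        # Area of the spherical cap from the pole down to this colatitude.
        return 4.0 * math.pi * math.sin(colatitude / 2.0) ** 2

    # Each polar cap holds exactly one region.
    polar = 2.0 * math.asin(1.0 / math.sqrt(n_tasks))
    collar_span = math.pi - 2.0 * polar
    # Aim for collars about as tall as a square region is wide.
    n_collars = max(1, math.floor(collar_span / math.sqrt(region_area) + 0.5))
    collar_angle = collar_span / n_collars

    counts = [1]
    carry = 0.0  # rounding remainder passed on to the next collar
    for i in range(n_collars):
        top = polar + i * collar_angle
        ideal = (cap_area(top + collar_angle) - cap_area(top)) / region_area
        regions = math.floor(ideal + carry + 0.5)
        carry += ideal - regions
        counts.append(regions)
    counts.append(1)
    return counts


if __name__ == "__main__":
    counts = eq_region_counts(1024)
    print(len(counts), sum(counts))
    print(counts)
```

The printed list can be compared with the N_REGIONS values on the slide. Because the remainder is carried from collar to collar, the counts always sum to the requested number of tasks, whatever its factors.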

2D partitioning T799 251 tasks (NS=251 x EW=1; 251 is prime) 20th ORAP Forum Slide 32
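The prime task count shows why 2D partitioning depends on "nice" factors. The sketch below picks the most nearly square NS x EW factorisation of a task count. This is only an illustration: the helper name is hypothetical, and how NS and EW are actually chosen for an IFS run is not stated on the slides.

```python
import math


def most_square_factors(n_tasks):
    """Return (ns, ew) with ns * ew == n_tasks and ns >= ew as close as possible."""
    ew = math.isqrt(n_tasks)
    # Walk down from the square root to the nearest exact divisor.
    while n_tasks % ew != 0:
        ew -= 1
    return n_tasks // ew, ew


if __name__ == "__main__":
    for tasks in (1024, 512, 251):
        ns, ew = most_square_factors(tasks)
        print(f"{tasks} tasks: NS={ns} x EW={ew}")
```

For 1024 tasks this gives the 32 x 32 layout shown earlier, while for 251 tasks the only option is a single column of 251 long thin bands.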

eq_regions partitioning T799 251 tasks
N_REGIONS( 1)= 1
N_REGIONS( 2)= 7
N_REGIONS( 3)= 12
N_REGIONS( 4)= 17
N_REGIONS( 5)= 22
N_REGIONS( 6)= 25
N_REGIONS( 7)= 28
N_REGIONS( 8)= 27
N_REGIONS( 9)= 28
N_REGIONS(10)= 25
N_REGIONS(11)= 22
N_REGIONS(12)= 17
N_REGIONS(13)= 12
N_REGIONS(14)= 7
N_REGIONS(15)= 1
20th ORAP Forum Slide 33

T799 512 tasks, 2D, task 201 T799 model grid T399 radiation grid 20th ORAP Forum Slide 34

T799 512 tasks, eq_regions, task 220 T799 model grid T399 radiation grid 20th ORAP Forum Slide 35

Grid interpolation HALO area (512 tasks, T799=model grid, T399=radiation grid)
[Figure: HALO area (0 to 5000) versus task number (1 to 512). Series: 2D T799 to T399, eq_regions T799 to T399 (model to radiation grid); 2D T399 to T799, eq_regions T399 to T799 (radiation to model grid).]
Halo area includes each task's own grid points
20th ORAP Forum Slide 36

T799 performance (comparing 2D & eq_regions)
| Application | tasks x threads | 2D partitioning (secs) | eq_regions partitioning (secs) | 2D / eq_regions |
|---|---|---|---|---|
| model | 512 x 2 | 3648 | 3512 | 1.039 |
| 4D-Var | 96 x 8 | 3563 | 3468 | 1.027 |
Good:
- Reduced semi-Lagrangian comms
- Reduced memory requirements
Bad:
- Increased TRGTOL/TRLTOG comms (grid to Fourier space)
- Less of an issue for thin nodes, as relatively more comms is on switch
20th ORAP Forum Slide 37

Summary
- eq_regions partitioning implemented in IFS
  - Both 2D and eq_regions partitioning are supported
  - eq_regions is the default partitioning
  - Available in IFS cycle CY31R2
- eq_regions reduces semi-Lagrangian communication cost
  - Also for model / radiation grid interpolation
- eq_regions has a small performance advantage over 2D partitioning
20th ORAP Forum Slide 38

QUESTIONS? 20th ORAP Forum Slide 39