National e-infrastructure for Science
Jacko Koster
UNINETT Sigma

Norway: eVITA
eVITA = e-science, Theory and Applications (2006-2015)
- Research & innovation
- e-infrastructure

e-science
e-science (or Scientific Computing, or Computational Science) is about distributed global collaboration in key areas of science, using the next-generation research infrastructures now available.
e-infrastructure and e-science are inseparable components of research. The borderline between them is not always clear: sometimes the technological tool is also the scientific tool.
Until now, services for e-science have primarily been built on a national basis, e.g., through national e-science programmes or national e-infrastructure providers.
Globalization of e-infrastructure and e-science is being pushed by the European Commission (EC).

e-science Drives Basic Sciences

e-science Drives Applied Science
- Environment: Weather, Climatology, Pollution
- Ageing Society: Medicine, Biology
- Materials: Chemistry, Nano-science
- Energy: Plasma Physics, Fuel Cells

e-science
- Weather, Climatology, Earth Science: degree of warming and scenarios for future climate; understanding and predicting ocean properties and variations; weather and flood events
- Astrophysics, Particle physics, Plasma physics: structures that span a large range of length and time scales; quantum field theories
- Material Science, Chemistry, Nano-science: complex materials, complex chemistry; electronic and transport properties
- Life Science: system biology, large-scale protein dynamics, protein association and aggregation, supramolecular systems, medicine
- Engineering: complex flight simulation, biomedical flows, gas turbines and internal combustion engines, forest fires

e-science Drives Competitiveness
- Reducing costs by prototyping
- Shorter time to result / market
- Allowing investigations where economics or ethics prohibit experimentation
Imperative of scientific computing:
- Validate theory
- Allowing (virtual) investigations that are impossible in real life

Research Infrastructure
Research infrastructure refers to facilities, resources and related services of a unique nature that are used by the scientific community to conduct top-level research in their respective fields.
This covers:
- major scientific equipment or set(s) of instruments
- knowledge-based resources such as collections, archives or structured scientific information
- e-infrastructure
- human resources
- any other entity of a unique nature essential to achieve excellence in research
Research infrastructures may be single-sited, distributed (a network of resources) or virtual (the services being provided electronically).

ESFRI
The European Roadmap for Research Infrastructures, published by ESFRI (the European Strategy Forum on Research Infrastructures), is the first comprehensive definition at the European level.
As of today, 44 ESFRI initiatives have been proposed.
Research infrastructures are one of the crucial pillars of the future European Research Area.
The e-Infrastructure Reflection Group (e-IRG) was founded to define and recommend best practices for the pan-European e-infrastructure efforts.

Trends: Research Infrastructure
- Research efforts require massive computing resources to tackle grand-challenge and global problems in climate, medicine, ...
- Modern research is impossible without permanent access to high-quality computers, networks, on-line library resources and research data, software tools for collaborative research (middleware), services for finding and accessing data, and applications to process and present research activity.
- Easy access to (and good management of) scientific data is a growing requirement. Data must be saved for future shared use.

e-infrastructure
The term e-infrastructure is used for ICT-based technology, virtual organisations and associated services that support distributed global research collaborations in a fully integrated manner.
Technologies include:
- computer and storage facilities
- on-line content (research data, ...)
- high-performance, high-capacity networks
- software, grids, collaborative environments
- support for software development and life-cycle management
- tools to manage authentication, access and shared use of resources
Using e-infrastructure, researchers can, e.g., share data collections, compute resources, instruments, and environments for analysis.

e-infrastructure
NSF vision on e-infrastructure: a national-level integrated system of hardware, software, data resources and services to enable new paradigms of science.
e-infrastructure is the integrating mechanism, the glue between different scientific disciplines and regions.

e-infrastructure
EUROPE    NORWAY
PARADE    NorStore
EGI       NorGrid
PRACE     Notur
GEANT     UNINETT
e-infrastructure: ubiquitous research environments for accessing and sharing resources and tools

National e-infrastructure
Partners: University of Bergen (UiB), University of Oslo (UiO), University of Tromsø (UiT), NTNU, met.no, UNINETT
HPC sites, connected by 10G links:
- UiT: 5632 cores
- NTNU: 3040 cores
- UiB: 5552 cores
- UiO: >2528 cores
Projects:
- HPC: Notur, 2005-2014
- Data: NorStore, 2007-2013
- Grid: NorGrid, 2007-2010
The national e-infrastructure is a distributed infrastructure with resources for computation and scientific data, plus corresponding operations and support, for science and research in Norway.
The infrastructure provides resources and services for:
- education and research at all Norwegian universities, university colleges and research institutes
- operational forecasting by the Meteorological Institute
- research and engineering at research institutes and industry who wish to contribute

National e-infrastructure
- National Infrastructure for High Performance Computing: Notur II (2005-2014). Procurement, operations and support of HPC resources of various architectures (from clusters to MPP).
- National Infrastructure for Scientific Data: NorStore (6/2007-6/2013). Infrastructure for scientific data collections, open to all research environments. Preservation of data.
- National Grid Initiative: NorGrid (3/2007-12/2010). Environments for distributed (virtualized) computing. Coupling of resources in Notur and NorStore, improved utilization of resources, services for distributed data management.
- UNINETT operates the national hybrid network for research and education (10 Gbit/s backbone between the largest universities).

National e-infrastructure
[Diagram: layers of the national e-infrastructure. Labels: research projects; research infrastructure (Infr. A, Proj. B, SFF C); discipline-specific e-infrastructure; general services, support; hardware, operations. Notur, NorStore and NorGrid cover the HPC sites at UiT, NTNU, UiB and UiO, linked at 10G; UiA, Org B and Inst C are also shown.]

Hardware
njord.hpc.ntnu.no
- IBM p575+, 8-cpu dual-core nodes
- 3040 cores, Power5+ 1.9 GHz
- 120 TB storage
- IBM HPS interconnect
- AIX, LoadLeveler
- Installed 11/2006, upgrade 11/2009
hexagon.bccs.uib.no
- Cray XT4, quad-core nodes
- 5552 cores, Opteron 2.3 GHz
- 288 TB storage
- SeaStar2 interconnect
- Linux (Cray variant)
- Installed 01/2008
titan.uio.no
- Sun x2200, 2-cpu quad-core nodes
- 2528 cores, Opteron 2.3 GHz
- 10 TB storage
- InfiniBand
- Linux RedHat
- Upgraded 10/2007
stallo.uit.no
- HP, 2-cpu quad-core nodes
- 5632 cores, Xeon 2.66 GHz
- 128 TB storage
- InfiniBand (55%)
- Linux RedHat
- Installed 11/2007
Total 16700 cores, giving ca. 150 million CPU-hours per year
Total electricity usage > 1 MWh (excl. cooling)

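The quoted totals can be checked with a little arithmetic. The sketch below sums the per-system core counts from the slide and converts cores to CPU-hours per year. It assumes a non-leap year and full availability of every core; both are illustrative assumptions, not figures from the slides, and the function names are made up for this example.

```python
# Core counts per system, copied from the hardware slide.
CORES = {
    "njord.hpc.ntnu.no": 3040,
    "hexagon.bccs.uib.no": 5552,
    "titan.uio.no": 2528,
    "stallo.uit.no": 5632,
}

# Assumption: non-leap year, every core available around the clock.
HOURS_PER_YEAR = 365 * 24


def total_cores(cores):
    """Sum the core counts of all listed systems."""
    return sum(cores.values())


def cpu_hours_per_year(n_cores, hours=HOURS_PER_YEAR):
    """Upper bound on CPU-hours delivered per year by n_cores cores."""
    return n_cores * hours


exact_total = total_cores(CORES)
print("cores, exact sum:", exact_total)
print("million CPU-hours/yr (exact sum):", cpu_hours_per_year(exact_total) / 1e6)
print("million CPU-hours/yr (16700 cores):", cpu_hours_per_year(16700) / 1e6)
```

The exact sum is 16752 cores, which the slide rounds to 16700. Either figure gives roughly 146 to 147 million CPU-hours per year under these assumptions, consistent with the "ca. 150 million" on the slide.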
Grids
The grid is a category of services and approaches to the research process that use e-infrastructure in a particularly advanced way.
The grid provides services for seamless distributed computation, aggregation of resources, and distributed data management.
Norwegian Grid Initiative: NorGrid
Timeline:
- EGEE: 2004-2010
- EGI: 2010
- NDGF: 2006-2010
- NorGrid: 2007-2010

Access
Access to the national e-infrastructure is by application:
- Two calls per year
- Proposals are evaluated by a Resource Allocation Committee appointed by the Research Council of Norway
One can apply for access to:
- HPC resources (Notur)
- Storage resources and related services (NorStore)
- Advanced user support: porting of applications, complex application enabling, etc.
The e-infrastructure does not develop application software. This remains the responsibility of the researcher.

Nordic e-infrastructure
Nordic DataGrid Facility (2006-2010)
- Collaboration between Nordic countries to increase grid collaboration and to implement the Nordic contribution to the Worldwide LHC Computing Grid (WLCG).
- Builds on national projects: SweGrid/SNIC, M-Grid/CSC, Grid.dk/DCSC, NorGrid/UNINETT.
- Partner countries: Denmark, Finland, Norway, Sweden. Iceland joined in 2009.

Nordic DataGrid Facility (April 2008)

European HPC Infrastructure
EC ambition:
- Make Europe a leading player in HPC...
- Strengthen competitiveness; face global challenges
- Deploy top-class ecosystem of computational resources
- European countries to scale & pool investments
- EC to define and support ambitious, broad strategic agenda
Implementation requires reinforced and coordinated efforts of countries, EC and scientific communities.
[Images: Climate, Medicine, Energy]

European HPC Infrastructure
[Figure: performance pyramid]

PRACE
Partnership for Advanced Computing in Europe
PRACE is the only ESFRI initiative on e-infrastructure.
Ambition:
- Late 2009: 1 Petaflop system in the top 5
- Late 2010: 2 Petaflop systems in the top 5; 1 Petaflop system in the top 10
- 2011: over 10 Petaflop in the top 5
- 2020: the Exaflop in the top 5

PRACE roadmap
- Preparatory Phase: 10 M€ / yr (50% from EU)
- Implementation Phase: 100-150 M€ / yr (20 M€ / yr from EU)

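To compare the two phases, the EU contribution can be expressed both as an absolute amount and as a share of the yearly budget. The sketch below uses only the figures on the roadmap slide, in millions per year; the function name `eu_share` is hypothetical and the calculation is a plain ratio, not an official PRACE funding model.

```python
def eu_share(total_per_year, eu_per_year):
    """Fraction of the yearly budget that comes from the EU."""
    return eu_per_year / total_per_year


# Preparatory Phase: 10 M per year, 50% from the EU.
prep_total = 10
prep_eu = 0.50 * prep_total

# Implementation Phase: 100-150 M per year, 20 M per year from the EU.
impl_eu = 20
impl_share_high = eu_share(100, impl_eu)  # share at the low end of the budget
impl_share_low = eu_share(150, impl_eu)   # share at the high end of the budget

print(f"Preparatory Phase: EU pays {prep_eu} M per year")
print(f"Implementation Phase: EU share between "
      f"{impl_share_low:.0%} and {impl_share_high:.0%}")
```

On these numbers the EU share drops from half of the budget in the Preparatory Phase to roughly 13-20% in the Implementation Phase, so most of the implementation cost falls on the partner countries.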
European DCI
EGI: an effort to establish a sustainable grid infrastructure in Europe, starting 2010. It will succeed EGEE-III (2008-2010).
The foundations of EGI are the National Grid Initiatives (NGIs) that operate the grid infrastructures in each country.
EGI Design Study (2007-2009):
- identify processes for establishing EGI
- define functions, organization, funding models
- initiate the construction of EGI
Norwegian NGI = NorGrid, a consortium of UiO, UiT, UiB/Unifob, UNINETT/Sigma

European DCI
EGI is the Distributed Computing Infrastructure for Europe.
EGI = EGI.eu + NGIs + Europe's largest research organizations:
- CERN
- EMBL
- ...
Specialized Support Centres (SSCs) are defined on top of EGI, e.g., for HEP, chemistry, material science, fusion, ...

European Scientific Data
PARADE is a consortium of national e-infrastructure projects that aims to establish a European infrastructure for scientific data.
Large research infrastructures are interested, including ESFRI/CLARIN, EMBL, ...
The goal is that this effort will eventually lead to a new, permanent European-wide infrastructure for scientific data that co-exists with EGI (distributed computing) and PRACE (high-end computing).

For more information:
jacko.koster@uninett.no
www.notur.no