High-performance computing for soil moisture estimation

S. Elefante (1), W. Wagner (1), C. Briese (2), S. Cao (1), V. Naeimi (1)
(1) Department of Geodesy and Geoinformation, Vienna University of Technology, Vienna, Austria
(2) EODC Earth Observation Data Centre for Water Resources Monitoring, Vienna, c/o University of Technology, Vienna, Austria
BIDS Conference, Santa Cruz de Tenerife, 15-17 March 2016

Contents
  Introduction
  Case Studies
    Case 1: Envisat ASAR WS (150 m)
    Case 2: Sentinel-1 C-SAR IW HR (20 m)
  Lessons learned & Outlook

Surface Soil Moisture Estimation
  [Figure labels: Preprocessing, Parameters]

Earth Observation Satellites for Microwave Missions

The Earth Observation Data Centre (EODC)
  The EODC is a public-private partnership that aims to provide a collaborative IT infrastructure for archiving, processing, and distributing EO data (www.eodc.eu)
  Through multi-national partnerships across science and the public and private sectors, users get direct access to EO data storage and can run data-intensive geoscientific models
  TU Wien is an EODC partner

EODC environment
  Virtual Machines (VMs)
  Supercomputer
  24/7 Operations & Rolling Archive
  Petabyte-Scale Disk Storage
  Tape Storage

The Vienna Scientific Cluster (VSC-3)
  2020 nodes, each with 64 GBytes of RAM
  Each node is equipped with two 8-core CPUs (Intel Xeon Processor E5-2650 v2 from the Ivy Bridge-EP family)
  Middleware installed: Simple Linux Utility for Resource Management (SLURM)
  Distributed volume managed by BeeGFS
  Energy-efficient cooling is provided by the mineral-oil-based CarnotJet System
  VSC-3 was ranked 85th in the November 2014 worldwide TOP500 supercomputer list

Preprocessing chain on VSC-3
  Sentinel-1 Toolbox (S1TBX) version 1.1.1
  Python 2.7
  SAR Geophysical Retrieval Toolbox (SGRT) 2.2

Case 1: Envisat ASAR WS (150 m)
  Data flow: read from the S1 & ASA archive (testing NFS solution), process on VSC-3, write to the VSC-3 BeeGFS storage
  Input:
    Data type: Envisat ASAR WS (150 m)
    Number of files: 31,199
    Single input dataset size: 12-692 MB
    Total input data size: 5.401 TB
  This is 20% of the 8-year ASA WS archive!!

The processing
  Happy to start the processing, but...
  The jobs crashed when around 200 nodes were working simultaneously!!!

Why (possible reasons)?
  Input: 200 nodes simultaneously reading files from the S1 & ASA NFS archive
  Output: 200 nodes * 30 files = 6000 files to write to the BeeGFS storage
  The Sentinel & Envisat archive (NFS testing)
    93 NFS disks (83 x 4 TB, 10 x 8 TB)
    Connected to the VSC-3 system through high-speed InfiniBand (40 Gb/sec)
  The distributed BeeGFS VSC-3 volume
    360 spinning disks
    Connected through around 160 Gb/sec bandwidth (evaluated experimentally)
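
The figures on this slide allow a rough back-of-the-envelope estimate of how little bandwidth each node gets when 200 nodes hit the storage at once. The sketch below assumes that the quoted link bandwidth is an aggregate shared evenly by all nodes and ignores protocol overhead, disk seeks and metadata operations; the slides state none of this, so the results are at best optimistic upper bounds. The function name is illustrative.

```python
def per_node_bandwidth_mb_s(link_gbit_s, n_nodes):
    """Even share of an aggregate link in megabytes per second (assumption: fair sharing)."""
    bytes_per_s = link_gbit_s * 1e9 / 8
    return bytes_per_s / 1e6 / n_nodes


N_NODES = 200          # nodes active when the jobs crashed (from the slide)
FILES_PER_NODE = 30    # output files per node (from the slide)
LARGEST_FILE_MB = 692  # largest ASAR WS input file (from the Case 1 slide)

total_output_files = N_NODES * FILES_PER_NODE

# 40 Gb/s InfiniBand to the NFS archive, about 160 Gb/s to BeeGFS (from the slide)
nfs_share_mb_s = per_node_bandwidth_mb_s(40, N_NODES)
beegfs_share_mb_s = per_node_bandwidth_mb_s(160, N_NODES)

# Best-case time for one node to pull the largest file over its NFS share
best_case_read_s = LARGEST_FILE_MB / nfs_share_mb_s

print("output files to write:", total_output_files)
print("NFS share per node (MB/s):", nfs_share_mb_s)
print("BeeGFS share per node (MB/s):", beegfs_share_mb_s)
print("best-case read of largest file (s):", round(best_case_read_s, 1))
```

Under these assumptions each node would see roughly 25 MB/s from the NFS archive, which gives a feel for why many simultaneous readers and thousands of small writes can overwhelm the storage.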

What if I use the RAM disk/cache?
  Data can be temporarily stored in the RAM disk/cache
  Advantages
    Data access is much faster than with any physical storage
  Disadvantages
    Part of the RAM available on the node is used to store the data (the full 64 GBytes of RAM are no longer available to the processing)
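
The trade-off on this slide can be written as a simple memory budget. This is a minimal sketch: the 64 GB node size and the 1.7 GB image size come from the slides, while the output size and the memory needed by the processing software are hypothetical placeholders that would have to be measured.

```python
def ram_left_gb(node_ram_gb, cached_input_gb, cached_output_gb):
    """RAM remaining for the processing software once data are cached on the node."""
    return node_ram_gb - (cached_input_gb + cached_output_gb)


def caching_is_feasible(node_ram_gb, cached_input_gb, cached_output_gb, software_need_gb):
    """True if the software still fits in RAM next to the cached data."""
    return ram_left_gb(node_ram_gb, cached_input_gb, cached_output_gb) >= software_need_gb


NODE_RAM_GB = 64.0       # VSC-3 node memory (from the slides)
LARGEST_INPUT_GB = 1.7   # largest Sentinel-1 image size quoted in the slides
OUTPUT_GB = 5.0          # hypothetical size of the cached output
SOFTWARE_NEED_GB = 32.0  # hypothetical memory footprint of the toolbox

left_gb = ram_left_gb(NODE_RAM_GB, LARGEST_INPUT_GB, OUTPUT_GB)
feasible = caching_is_feasible(NODE_RAM_GB, LARGEST_INPUT_GB, OUTPUT_GB, SOFTWARE_NEED_GB)
print("RAM left for processing (GB):", left_gb, "feasible:", feasible)
```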

Case 1: Output data in the RAM disk (caching)
  Data flow: read from the S1 & ASA archive (testing NFS solution), process on VSC-3 while caching the output data, write to the VSC-3 BeeGFS storage
  Jobs:
    Number of independent jobs: 15,600
    Number of images for each job: 2
  The output data are temporarily stored in the RAM disk (caching)
  Only at the end of the processing are the data copied to BeeGFS
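
The job counts on the slides follow from splitting the file list into fixed-size groups: 31,199 files at 2 images per job give 15,600 jobs, the last one holding a single image. A minimal sketch of that grouping follows; the file names are hypothetical and the slides do not show how the authors built their job lists.

```python
def make_jobs(files, images_per_job):
    """Split a list of files into consecutive groups of at most images_per_job items."""
    return [files[i:i + images_per_job] for i in range(0, len(files), images_per_job)]


# Hypothetical file names standing in for the 31,199 ASAR WS scenes
ws_files = ["asar_ws_%05d" % i for i in range(31199)]
ws_jobs = make_jobs(ws_files, 2)

# The same rule reproduces the ASAR GM figures reported in the slides
# (189,621 files, 8 images per job)
gm_jobs = make_jobs(list(range(189621)), 8)

print(len(ws_jobs), len(ws_jobs[-1]))
print(len(gm_jobs))
```

Because every job is independent, each group can be submitted to the scheduler as its own task.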

Case 1: SLURM Processing Queue
  [Figure annotations: 450 nodes assigned; start of the processing; end of the processing]

Case 1: Processing speed caching input data
  Poor scalability!!
  [Figure: average processing time]
  Processing time changes significantly for a given data file size

Case 1: Input and output data in the RAM disk (caching)
  Data flow: read from the S1 & ASA archive (testing NFS solution), process on VSC-3 while caching the input and output data, write to the VSC-3 BeeGFS storage
  The S1 toolbox keeps the files open
  The input image and output data are temporarily stored in the RAM disk (caching)
  Only at the end of the processing are the data copied to BeeGFS
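
The caching strategy on this slide can be sketched as a stage-in, process, stage-out pattern. The code below is an illustration under assumptions, not the authors' implementation: it is written for Python 3 (the slides list Python 2.7), it assumes a RAM-backed directory such as /dev/shm on Linux, and process_with_cache and its process callback are hypothetical names.

```python
import shutil
import tempfile
from pathlib import Path


def process_with_cache(input_file, final_dir, process, cache_root="/dev/shm"):
    """Stage the input in a RAM-backed directory, process there, copy results out at the end.

    process is a caller-supplied function process(local_input, out_dir) that
    writes its results into out_dir. cache_root is assumed to be RAM-backed.
    """
    final_dir = Path(final_dir)
    final_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    with tempfile.TemporaryDirectory(dir=cache_root) as cache:
        cache = Path(cache)
        # One sequential read from the shared archive into the node-local cache
        local_input = cache / Path(input_file).name
        shutil.copy2(input_file, local_input)
        out_dir = cache / "out"
        out_dir.mkdir()
        # All reads and writes during processing stay on the node
        process(local_input, out_dir)
        # Only now is the shared storage touched again, once per result file
        for item in sorted(out_dir.iterdir()):
            if item.is_file():
                copied.append(Path(shutil.copy2(item, final_dir / item.name)))
    # The temporary directory, and with it the cached data, is removed here
    return copied
```

A call might look like process_with_cache("/archive/scene.N1", "/beegfs/results", run_toolbox), where both paths and run_toolbox are placeholders. Since the toolbox keeps its files open while it works, pointing it at a node-local copy keeps those long-lived file handles off the shared archive.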

Case 2: SLURM Processing Queue
  [Figure annotation: > 600 nodes assigned]

Case 2: Processing speed caching input and output data
  Great scalability!!

From the previous experiments we learned that
  We need to cache input and output data
  The storage volume can handle the massive amount of data when input and output data are stored in the RAM disk (caching)
  Can we evaluate the time needed to read/write Sentinel data on a given storage system?

Reading performance: NFS storage and BeeGFS
  [Figure, two panels: elapsed time to read an image (sec) versus number of nodes]
    S1 & ASA storage (NFS testing solution): 10 minutes with 170 nodes!!
    VSC-3 storage (BeeGFS): 1 minute with 500 nodes
  A standard sample of S1-A images was considered: size around 1 GByte, range 0.7-1.7 GByte
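
A read-time measurement of this kind can be approximated with a small timing helper run on each node. This is only a sketch: the slides do not describe the authors' benchmark, time_read is a hypothetical name, and the demonstration uses a small temporary file instead of a Sentinel-1 image on the storage under test.

```python
import os
import tempfile
import time


def time_read(path, chunk_size=16 * 1024 * 1024):
    """Read a file sequentially in chunks and return (elapsed seconds, bytes read)."""
    start = time.perf_counter()
    n_bytes = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            n_bytes += len(chunk)
    return time.perf_counter() - start, n_bytes


# Demonstration on a 1 MB temporary file; a real test would point at an
# image stored on the NFS archive or on BeeGFS.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 1_000_000)
elapsed_s, n_bytes = time_read(tmp.name)
os.remove(tmp.name)
print("read %d bytes in %.4f s" % (n_bytes, elapsed_s))
```

Repeated reads of the same file can be served from the operating system's page cache, so a fair comparison between storage systems would read each file only once per node.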

Case 2: Sentinel-1 IW HR preprocessing
  Input:
    Data type: Sentinel-1 IW GRD HR (20 m)
    Number of files: 1,075
    Single input dataset size: 0.8-1.7 GB
    Total input data size: 1.2 TB
  Jobs:
    Number of independent jobs: 1,075
    Number of images for each job: 1
  [Figure: time axis from 0 to 3.5 h]
  Caching the input image and output data; only at the end of the processing are the data copied to BeeGFS

Case 2: S1 processing speed caching input and output data
  Great scalability!!
  50 minutes to process S1 data (to improve)

Examples of Processing on VSC-3

Test                                              n. 1         n. 2         n. 3         n. 4
SAR product mode                                  ASAR GM      ASAR WS      ASAR WS      S-1 IW GRDH
Spatial resolution                                1 km         150 m        150 m        20 m
Total number of data files                        189,621      31,199       31,199       1,075
Number of images per job / total number of jobs   8 / 23,703   2 / 15,600   2 / 15,600   1 / 1,075
Input data file size range                        1-73 MB      12-692 MB    12-692 MB    0.8-1.7 GB
Total input data files size                       1.579 TB     5.401 TB     5.401 TB     1.2 TB
Max. number of simultaneously running nodes       417          454          612          396
Number of cores used by Sentinel-1 Toolbox        4            8            8            8
Input data caching on node                        False        False        True         True
Output data caching on node                       True         True         True         True
Averaged processing time (seconds/MB)             9.18         5.65         2.39         2.69
Elapsed time including SLURM queueing             3.5 days     4 days       8 hours      3.5 hours
Estimated elapsed time using only 1 node          167 days     353 days     353 days     37 days

  90% of the 8-year ASA GM archive in 3.5 days instead of 167 days!!
  20% of the 8-year ASA WS archive in 8 hours instead of 353 days!!
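
The single-node estimates for tests n. 1, n. 2 and n. 4 appear consistent with multiplying the averaged processing time (seconds/MB) by the total input size, assuming decimal units (1 TB = 10^6 MB). This is a reading of the reported numbers, not a formula given in the slides. For test n. 3 the same product gives about 149 days, so the reported 353 days may simply reuse the n. 2 baseline. The sketch below reproduces the arithmetic and the resulting speed-ups.

```python
SECONDS_PER_DAY = 86400.0


def single_node_days(total_tb, seconds_per_mb):
    """Estimated one-node run time in days, assuming 1 TB = 1e6 MB."""
    return total_tb * 1e6 * seconds_per_mb / SECONDS_PER_DAY


def speedup(single_node_days_reported, elapsed_hours):
    """Ratio of the reported one-node estimate to the measured elapsed time."""
    return single_node_days_reported * 24.0 / elapsed_hours


# (total input TB, seconds/MB, elapsed hours, reported one-node days), all from the table
tests = {
    "n. 1 ASAR GM": (1.579, 9.18, 3.5 * 24, 167),
    "n. 2 ASAR WS": (5.401, 5.65, 4 * 24, 353),
    "n. 3 ASAR WS": (5.401, 2.39, 8, 353),
    "n. 4 S-1 IW GRDH": (1.2, 2.69, 3.5, 37),
}

for name, (tb, s_per_mb, hours, reported_days) in tests.items():
    estimate = single_node_days(tb, s_per_mb)
    print("%-18s estimate %6.1f days, reported %3d days, speed-up %7.1fx"
          % (name, estimate, reported_days, speedup(reported_days, hours)))
```

Caching both input and output (test n. 3) cuts the elapsed time for the same ASAR WS data set from 4 days to 8 hours, which is the comparison the slides highlight.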

Lessons learned & Outlook
  Lessons learned
    Proved the capability to process EO Big Data within the EODC platform
      90% of the 8-year ASA GM archive in 3.5 days instead of 167 days
      20% of the 8-year ASA WS archive in 8 hours instead of 353 days
    The optimal processing strategy is to temporarily store the input and output data in the RAM disk (caching) during the processing
    The bottleneck for EO Big Data processing is the data I/O
  Outlook
    A new archive/storage volume will be put in place this year
    Reduce the processing time for each image (e.g. using multicore processing)
    Process data at global scale

Thank you very much for your attention!
  The authors would like to thank the following for their help and support: B. Bauer-Marschallinger, R. Brunnthaler, A. Dostalova, S. Hahn, S. Hasenauer, H. Thüminger
  This study was supported by the Austrian Research Promotion Agency (FFG) through P4EODC - Preparing for the Initial Services of the EODC (project number 36786) and WetMon - Enabling an operational Sentinel-1 wetland monitoring service for the EODC (project number 848001), and by EUMETSAT through the Satellite Application Facility on Support to Hydrology and Water Management (H-SAF).
  http://rs.geo.tuwien.ac.at/remote-sensing/

From Byte to PetaByte Processing Challenge
  1 Byte, 1 KiloByte, 1 MegaByte, 1 GigaByte, 1 TeraByte, 1 PetaByte

Currently employed satellite data
  Envisat
    Instrument: Advanced Synthetic Aperture Radar (ASAR)
    Period: 1st March 2002 to 8th April 2012
    Acquisition modes:
      Global Monitoring (GM) with 1 km resolution
      Wide Swath (WS) with 150 m resolution
  Sentinel-1
    Instrument: C-band SAR
    Period:
      Sentinel-1A: launched on 3rd April 2014
      Sentinel-1B: planned to be launched in 2016
    Acquisition mode: Interferometric Wide Swath (IW) with High Resolution (HR), 20 m

Case 1: Envisat ASAR GM (1 km)
  [Figure labels: VSC-3; NFS archive; internal BeeGFS storage]
  Input:
    Data type: Envisat ASAR GM (1 km)
    Number of files: 189,621
    Single input dataset size: 1-73 MB
    Total input data size: 1.579 TB
  Jobs:
    Number of independent jobs: 23,703
    Number of images for each job: 8

SLURM Processing Queue

Case 1: Processing speed
  Poor scalability!!

Directly reading vs caching
  [Figure: reading from cached data compared with reading directly from NFS]