
The Dawn of the Exascale Age: Using Integrated HPC and Connected Data
Dr Ben Evans

nci.org.au @NCInews

What is Exascale? – more than an ExaFLOP (10^18 operations per second)

• Exascale practically means addressing multiscale science problems 1000 times better than is achievable on current petaflop systems.
• US National Strategic Computing Initiative (NSCI) (July 29, 2015) to maximise the benefits of HPC for US economic competitiveness and scientific discovery. https://www.whitehouse.gov/the-press-office/2015/07/29/executive-order-creating-national-strategic-computing-initiative
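The 1000x figure is easy to make concrete with back-of-the-envelope arithmetic. The workload size below is a made-up illustration, not an NCI benchmark:

```python
# Illustrative arithmetic only: how long a fixed workload takes at
# petascale vs exascale. The workload size is hypothetical.

PETAFLOP = 1e15   # operations per second (10^15)
EXAFLOP = 1e18    # operations per second (10^18)

workload_ops = 3.6e21   # a hypothetical multiscale simulation: 3.6 x 10^21 operations

hours_at_petascale = workload_ops / PETAFLOP / 3600
hours_at_exascale = workload_ops / EXAFLOP / 3600

print(f"Petascale: {hours_at_petascale:.0f} h")   # 1000 h
print(f"Exascale:  {hours_at_exascale:.0f} h")    # 1 h
```

The same simulation that ties up a petaflop machine for six weeks completes in an hour at exascale, which is what makes previously impractical ensemble and resolution targets feasible.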





The NSCI is a whole-of-government effort designed to create a cohesive, multiagency strategic vision and Federal investment strategy, executed in collaboration with industry and academia, to maximize the benefits of HPC for the United States. There are three lead agencies for the NSCI: the Department of Energy (DOE), the Department of Defense (DOD), and the National Science Foundation (NSF).
– The DOE Office of Science and DOE National Nuclear Security Administration will execute a joint program focused on advanced simulation through a capable exascale computing program emphasizing sustained performance on relevant applications and analytic computing to support their missions.
– NSF will play a central role in scientific discovery advances, the broader HPC ecosystem for scientific discovery, and workforce development.
– DOD will focus on data analytic computing to support its mission.

© National Computational Infrastructure 2016

Ben Evans, eResearch Conference, Oct 2016


Principles of NSCI



Coordinated Federal strategy guided by four principles:
1. The United States must deploy and apply new HPC technologies broadly for economic competitiveness and scientific discovery.
2. The United States must foster public-private collaboration, relying on the respective strengths of government, industry, and academia to maximize the benefits of HPC.
3. The United States must adopt a whole-of-government approach that draws upon the strengths of and seeks cooperation among all executive departments and agencies with significant expertise or equities in HPC while also collaborating with industry and academia.
4. The United States must develop a comprehensive technical and scientific approach to transition HPC research on hardware, system software, development tools, and applications efficiently into development and, ultimately, operations.

Directed to capable scientific computing – science and mission applications, scalable software stack and data software, integrated engineering for supercomputer systems.


Engagement with Societal Application Agencies



Deployment Agencies. There are five deployment agencies for the NSCI:
• the National Aeronautics and Space Administration (NASA),
• the Federal Bureau of Investigation (FBI),
• the National Institutes of Health (NIH),
• the Department of Homeland Security (DHS), and
• the National Oceanic and Atmospheric Administration (NOAA).

– Agencies participate in the co-design process to integrate the special requirements of their respective missions and influence the early stages of design of new HPC systems, software, and applications.
– Agencies will also have the opportunity to participate in testing, supporting workforce development activities, and ensuring effective deployment within their mission contexts.


NCI High Performance Scaling activities 2014-16, supported by Fujitsu

Objectives:
• Upscale and increase performance of high-profile community codes – especially for the Government and research communities

• Year 1
  – Characterise and scale critical BoM weather and climate operational applications for higher resolution
  – Best-practice configuration for improved throughput
  – Establish analysis toolsets and methodology

• Year 2
  – Characterise, optimise and tune next-generation high-priority applications
  – Select high-priority earth systems and geophysics codes for scalability
  – Parallel algorithm review and I/O optimisation methods for next-gen scaling

• Year 3
  – Assess codes for scalability; encourage adoption from “best in discipline”
  – Evaluate new software and hardware technologies to better assess performance and energy efficiency


Exascale Earth Systems Science Research and Societal Impacts

Examples:
• Modelling Extreme and High Impact events – BoM
• NWP, Climate Coupled Systems and Data Assimilation – BoM, CSIRO, universities
• Hazards – Geoscience Australia, BoM, States
• Geophysics – Geoscience Australia, universities
• Monitoring the Environment and Ocean – ANU, BoM, CSIRO, GA, IMOS, TERN, States
• International research – universities

Tropical Cyclones

Cyclone Winston 20-21 Feb, 2016

Volcanic Ash

Manam Eruption 31 July, 2015


Bush Fires

Wye Valley and Lorne Fires 25-31 Dec, 2015

Flooding

St George, QLD February, 2011


ACCESS – Numerical Weather Prediction (NWP)

A prime requirement for the Bureau of Meteorology is to provide vastly improved weather prediction, including better resolution and severe events:
• High resolution: 1-1.5 km
• Data assimilation
• Rapid update cycles
• High-resolution ensembles

             APS-2 (Op: 2016)    APS-3 (Op: 2017-2018)    APS-4 (Op: 2019-2020)
ACCESS-G     25km {4dV}          12km {4dVH}              12km {4dVH/En}
ACCESS-R     12km {4dV}          8km {4dVH}               4.5km {4dVH/En}
ACCESS-TC    12km {4dV}          4.5km {4dVH}             4.5km {4dVH}
ACCESS-GE    60km (lim)          30km                     30km
ACCESS-C     1.5km {FC}          1.5km {4dVH}             1.5km {4dVH/En}
ACCESS-CE    -                   2.2km (lim)              1.5km
ACCESS-X     -                   1.5km {4dVH}             1.5km {4dVH/En}
ACCESS-XE    -                   -                        1.5km


International UK Unified Model Development Partnership

Shared science, model evaluation and technical development:
• Joint process evaluation groups
• Technical infrastructure teams
• User workshops & tutorials

A foundation for relationships with other organisations:
• Science & model development
• Weather & climate services
• Jointly growing with businesses

Operational users complemented by:
• research partners in national / international universities & organisations
• capacity-building consultancy projects with other partners


Fully coupled Earth System Model

Core Model:
• Atmosphere – UM 10.4
• Ocean – MOM 5.1
• Sea-Ice – CICE5
• Coupler – OASIS-MCT

Carbon cycle (ACCESS-ESM1):
• Terrestrial – CABLE
• Bio-geochemical
• Coupled to modified ACCESS1.3

Aerosols:
• UKCA
• Coupled to ACCESS-CM2

[Diagram: Atmosphere, Atmospheric chemistry, Terrestrial carbon and Ocean and sea-ice components linked through the Coupler]


IPCC Climate Reports: CMIP1 through to CMIP5 Data Volumes


Global infrastructure supporting reproducible scientific analysis

Earth System Grid Federation: an exemplar international collaboratory for large scientific data and analysis


What is the difference between 0.25 and 0.1 degree?

We now need to move to 0.03 degree and more coupled systems.


The Queensland storm surge - sea surface height using ROMS.


Himawari-8 Observations, Data Assimilation and Analysis
• Captured at JMA, processed after acquisition at BoM
• Made available at NCI
• Data products still to be generated, but the first stage was to make the image data available


Earth Observation Time Series Analysis
• Over 300,000 Landsat scenes (spatial/temporal) allowing flexible, efficient, large-scale in-situ analysis
• Spatially-regular, time-stamped, band-aggregated tiles presented as temporal stacks
• 27 years of data from LS5 & LS7 (1987-2014)
• 25 m nominal pixel resolution
• Approx. 300,000 individual source ARG-25 scenes in approx. 20,000 passes

Continental-Scale Water Observations from Space (WOfS water detection, spatially partitioned tiles)

Temporal Analysis:
• Entire 27 years of 1,312,087 ARG25 tiles => 93x10^12 pixels visited
• 0.75 PB of data
• 3 hrs at NCI (elapsed time) to compute
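The temporal-stack layout is what makes continental-scale statistics like this tractable: each tile is a (time, y, x) array, and per-pixel summaries are simple reductions along the time axis. A minimal sketch with synthetic data (the real WOfS classifier is far more involved; `water_flags` here is an invented stand-in for its per-observation output):

```python
import numpy as np

# Synthetic stand-in for one spatial tile: 50 time slices of 4x4 pixels,
# where True means "water detected in this observation".
rng = np.random.default_rng(42)
water_flags = rng.random((50, 4, 4)) < 0.2   # boolean (time, y, x) stack

# Per-pixel summary over the time axis -- the core of a WOfS-style product:
obs_count = water_flags.shape[0]            # observations per pixel
wet_count = water_flags.sum(axis=0)         # how often each pixel was wet
wet_frequency = wet_count / obs_count       # fraction of observations wet

print(wet_frequency.shape)   # (4, 4)
```

At scale, the same reduction runs independently per tile, which is why the full 1.3 million-tile analysis parallelises cleanly across a supercomputer.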


EU Copernicus Sentinel Earth Observation:
• Six families of satellites: Sentinels 1-6, progressively from 2014 – monitoring of land, ocean, vegetation, soil, altimetry, etc.
• Australia to provide the regional data access and analysis hub
• Consortium: GA, CSIRO, State Govt. agencies (WA, NSW, Qld)


Combining HPC & HPD: Prediction of hazards at local scales
• Modelling tropical cyclones to capture peak wind speed near the eye requires 1-2 km resolution calculations
• Impacts of hazard events vary at 30 metre scale — landscape variations (topography/land cover)
• Risk analysis requires large ensembles (10^6) to be modelled and impacts analysed at landscape scales
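The ensemble requirement can be illustrated with a toy Monte Carlo sketch: sample many synthetic peak-wind realisations and estimate an exceedance probability by counting. The distribution and threshold below are invented for illustration and bear no relation to any operational hazard model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ensemble: 10^6 synthetic peak-gust realisations (m/s) drawn from an
# invented Gumbel distribution -- purely illustrative parameters.
n_members = 10**6
peak_gusts = rng.gumbel(loc=40.0, scale=8.0, size=n_members)

# Probability that a hypothetical design wind speed is exceeded,
# estimated as the fraction of ensemble members above the threshold.
threshold = 70.0  # m/s
p_exceed = float(np.mean(peak_gusts > threshold))

print(f"P(gust > {threshold} m/s) ~= {p_exceed:.4f}")
```

Rare-event probabilities like this are exactly why ensemble sizes of 10^6 matter: a 1-in-50 event needs many thousands of members before its estimated frequency stabilises.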


Emerging Petascale Geophysics codes

Assess priority Geophysics areas:
- 3D/4D Geophysics: Magneto-tellurics, AEM
- Hydrology, Groundwater, Carbon Sequestration
- Forward and Inverse Seismic models and analysis (onshore and offshore)
- Natural Hazard and Risk models: Tsunami, Ash-cloud

Issues:
- Data across domains, data resolution (points, lines, grids), data coverage
- Model maturity for running at scale
- Ensemble, Uncertainty analysis and Inferencing


Multi-Year Strategic Science and Services Plan – Total Water Prediction and National Water Center (c/- David Maidment, CUAHSI and the National Flood Interoperability Experiment)

• FY 15-20 Core Capability – Centralized Water Forecasting: National Water Model (NWM) operational May 2016
• FY 16-21 Key Enhancement – Flash Flood and Urban Hydrology: enhance NWM with nested hyper-resolution zoom capability and urban hydrologic processes (heightened focus on regions of interest, e.g. follow storms)
• FY 17-22 Major Integration – Coastal Total Water Level: couple NWM with marine models to predict combined storm surge, tide, and riverine effects
• FY 18-23 Key Enhancement – Dry Side, Drought and Post-Fire: couple NWM with groundwater and transport models to predict low flows, drought and fire impacts
• FY 19-24 Major Integration – Water Quality: integrate enhanced NWM with key water quality data sets, models and tools to begin water quality prediction

Outcomes across the phases:
² Water forecasts for 2.7 million stream reaches in the U.S.
² 100 million people get a terrestrial water forecast for the first time
² New service delivery model implemented – increased stakeholder engagement and integrated information
² National Water Center (NWC) begins providing daily situational awareness and guidance to NWS field offices
² Street-level flood inundation forecasts for selected urban demonstration areas
² NWC increases guidance to NWS field offices to improve consistency and services for flash floods
² More complete picture of coastal storm impacts
² Water prediction information linked to geospatial risk and vulnerability
² NWC operations center opens and provides national decision support services and situational awareness
² Add NWM processes that affect subsurface water movement and storage during dry conditions
² New decision support services for water shortage situations and waterborne transport
² NWC operations center expands to include drought and post-fire decision support services
² Add NWM ability to track constituents (e.g. sediment, contaminants, nutrients) through stream network
² Incorporate water quality data from federal and State partners into NWM
² Link NWM output to NOAA ecological forecasting operations
² New decision support services for predicting water quality issues such as Harmful Algal Blooms
² New decision support services for emergencies such as chemical spills
² NWC operations center expands to include water quality decision support services


Building Genomics data analysis and sharing platforms

The arrival of the “$1,000” genome


NCI National Reference Earth Systems Datasets

NCI Proposal to NCRIS RDSI (RDS) for a High Performance Data Node to:
• Enable dramatic increases in the scale and reach of Australian research by providing nationwide access to enabling data collections;
• Specialise in nationally significant research collections requiring high-performance computational and data-intensive capabilities for their use in effective research methods;
• Realise synergies with related national research infrastructure programs.

As a result, researchers will be able to:
• share, use and reuse significant collections of data that were previously either unavailable to them or difficult to access;
• access the data in a consistent manner which will support a general interface as well as discipline-specific access;
• use the consistent interface established/funded by this project for access to data collections at participating institutions and other locations, as well as data held at the Nodes.


NCI National Earth Systems Research Data Collections
1. Climate/ESS Model Assets and Data Products
2. Earth and Marine Observations and Data Products
3. Geoscience Collections
4. Terrestrial Ecosystems Collections
5. Water Management and Hydrology Collections
http://geonetwork.nci.org.au

Data Collections                                                     Approx. Capacity
CMIP5, CORDEX, ACCESS Models                                         5 Pbytes
Satellite Earth Obs: LANDSAT, Himawari-8, Sentinel, MODIS, INSAR     2 Pbytes
Digital Elevation, Bathymetry, Onshore/Offshore Geophysics           1 Pbyte
Seasonal Climate                                                     700 Tbytes
Bureau of Meteorology Observations                                   350 Tbytes
Bureau of Meteorology Ocean-Marine                                   350 Tbytes
Terrestrial Ecosystem                                                290 Tbytes
Reanalysis products                                                  100 Tbytes


NCI’s Australian Geophysics Data Collection

[Diagram: hierarchy of the collection]
• Collection: Australian Geophysics Data Collection, including National Grids (Gravity, Mag, Radiometrics)
• Themes: Gravity, Mag, Radiometrics, AEM, Seismic, MT, Seismology
• Surveys: Survey 1, Survey 2, … Survey n
• Data per survey: Grids (cell sizes from 10 m to 80 m), Lines, Points

See Lesley Wyborn’s talk on Thursday

Data Classified Based On Processing Levels

Level*  Proposed Name                  Description*
0       Raw Data                       Instrumental data as received from sensor. Includes any and all artefacts.
1       Instrument Data                Instrument data that have been converted to sensor units but are otherwise unprocessed. Data includes appended time and platform georeferencing parameters (e.g., satellite ephemeris).
2       Calibrated Data                Data that has undergone corrections or calibrations necessary to convert instrument data into geophysical values. Data includes calculated position.
3       Gridded Data                   Data that has been gridded and undergone minor processing for completeness and consistency (i.e., replacing missing data).
4       “Value-added” Data Products    Analytical (modelled) data such as those derived from the application of algorithms to multiple measurements or sensors.
5       Model-derived Data Products    Data resulting from the simulation of physical processes and/or application of expert knowledge and interpretation.

*The level numbers and descriptions above follow definitions used in satellite data processing, as defined by NASA.


Enable global and continental scale, as well as scale-down to local/catchment/plot scale

• NWP and Forecasts – UM, APS3 (Global, Regional, City), ACCESS-TC
• Coupled Seasonal and Decadal Climate – ACCESS-GC2/3 (GloSea5)
• Data Assimilation – 3D-VAR, 4D-VAR (Atmosphere), EnKF (Ocean)
• Ocean Forecasting and Research – OceanMaps, BlueLink, MOM5, CICE/SIS, WW3, ROMS
• Fully-Coupled Earth System Model – ACCESS-CM, ACCESS-ESM, CMIP5/6

Scale-down applications:
• Water availability and usage over time
• Catchment zone
• Vegetation changes
• Data fusion with point-clouds and local or other measurements
• Statistical techniques on key variables
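For background on the variational data assimilation methods listed above: all of them minimise a cost function of the same general form. The textbook 3D-VAR objective (standard notation, not specific to the ACCESS implementations) is:

```latex
J(\mathbf{x}) =
  \tfrac{1}{2}\,(\mathbf{x}-\mathbf{x}_b)^{\mathsf{T}}\,\mathbf{B}^{-1}\,(\mathbf{x}-\mathbf{x}_b)
+ \tfrac{1}{2}\,\bigl(\mathbf{y}-H(\mathbf{x})\bigr)^{\mathsf{T}}\,\mathbf{R}^{-1}\,\bigl(\mathbf{y}-H(\mathbf{x})\bigr)
```

where x_b is the background (first-guess) state, B and R are the background- and observation-error covariance matrices, y is the observation vector and H is the observation operator. 4D-VAR extends the observation term over a time window using the forecast model, while the EnKF estimates B from an ensemble of model states.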


Transform data to become transdisciplinary and born-connected

• A call to action for a transdisciplinary approach, starting at the conception of data collections
• Researchers across the science disciplines, the social sciences and those beyond academia need to work together to enable horizontal interoperability for high-end researchers, students and the general public
• Once interoperability is achieved, information will be accessible to all sectors


National Earth Systems Research Data Interoperability Platform: a simplified view

[Diagram: the NERDIP Data Platform (server-side data functions and services) underpins Compute Intensive use, Virtual Laboratories, Fast/Deep Data Access, Portal views, and Machine Connected / Program access]


National Environmental Research Data Interoperability Platform (NERDIP)

Workflow Engines, Virtual Laboratories (VLs), Science Gateways:
• Biodiversity & Climate Change VL, Climate & Weather Science Lab, eMAST/Speddexes, eReefs, AGDC VL, All Sky Virtual Observatory, VGL, Globe Claritas, VHIRL, Open Nav Surface

Tools:
• Ferret, NCO, GDL, Fortran, C, C++, Python, R, Models, GDAL, GRASS, QGIS, MPI, OpenMP, MatLab, IDL
• Visualisation: Drishti

Data Portals:
• ANDS/RDA Portal, AODN/IMOS Portal, TERN Portal, AuScope Portal, Data.gov.au, Digital Bathymetry & Elevation Portal

Services Layer:
• OPeNDAP, OGC W*TS, OGC SWE, OGC W*PS, OGC WCS, OGC WFS, OGC WMS
• Fast “whole-of-library” catalogue (CS-W), Vocab Service, PROV Service, Direct Access

Data Conventions / API Layers:
• netCDF-CF; RDF, LD; ISO 19115, ACDD, RIF-CS, DCAT, etc.; GDAL

HP Data Library Layer:
• NetCDF4 (Climate, Weather; Ocean Bathy; EO), ASDF/HDF5, PH5/HDF5, HDF-EOS, HDF5 [Airborne Geophysics], [SEG-Y], [FITS], [LAS LiDAR], other legacy formats
• Lustre and Object Storage

Enabling transparency, reproducibility, informatics & deep learning techniques
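Much of the data library layer is self-describing NetCDF4/HDF5, and what makes it machine-connectable is the CF/ACDD attribute conventions carried on every variable. A hand-rolled sketch of what that metadata looks like and how a tool might check it (attribute names follow the CF conventions, but the variable, values and toy checker are illustrative, not an NCI tool):

```python
# CF/ACDD-style metadata as it would appear on a NetCDF variable,
# modelled here as a plain dict for illustration.
cf_variable = {
    "name": "sea_surface_temperature",
    "dims": ("time", "lat", "lon"),
    "attributes": {
        "standard_name": "sea_surface_temperature",  # from the CF standard-name table
        "units": "K",
        "long_name": "Sea surface temperature",
        "_FillValue": -999.0,
    },
}

def missing_cf_attributes(var, required=("standard_name", "units")):
    """Return the required CF attributes a variable lacks (toy checker)."""
    return [a for a in required if a not in var["attributes"]]

print(missing_cf_attributes(cf_variable))   # []
del cf_variable["attributes"]["units"]
print(missing_cf_attributes(cf_variable))   # ['units']
```

Conventions like these are what let generic services (OPeNDAP subsetting, OGC WMS rendering, catalogue harvesting) operate across collections without per-dataset code.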


Climate and Weather Science Laboratory

The CWSLab provides an integrated national facility for research in climate and weather simulation and analysis:
• To reduce the technical barriers to using state-of-the-art tools;
• To facilitate the sharing of experiments, data and results;
• To reduce the time to conduct scientific research studies; and
• To elevate the collaboration and contributions to the development of the Australian Community Climate and Earth-System Simulator (ACCESS).

[Components: ACCESS Modelling, Data Services, Computational Infrastructure, Climate Analysis]


Working in the era of Exascale – key messages for raising a data centre in a Big Data world

• Scientific computing at today’s scales has to be built across collaborations of national priorities and national institutions, and needs to scale up and scale down
• Data needs to be born-connected and transdisciplinary: interoperable international standards for data collections, applied at birth, are critical for allowing complex interactions in HP environments both within and between HPD collections
• Expertise around usability and performance tuning is needed to ensure getting the most out of the data
• Collaborative efforts across disciplines and collaboration across nations
