NCI enabling Earth Systems Science using High Performance Computing & Data
Dr Ben Evans, Associate Director, NCI Research Engagement and Initiatives
nci.org.au | @NCInews
Co-locating HPC and Data Collections: High-Performance Data (HPD) (Evans, ISESS 2015, Springer)
[Chart: Top500 performance development, http://www.top500.org/statistics/perfdevel/, with "Current NCI" and "Next NCI" marked]
• HPC – turning compute-bound problems into I/O-bound problems
• HPD – turning I/O-bound problems into ontology + semantic problems
• Hybrid HPC systems with HPD and "Big Data" technologies
Colocation of HPD and HPC is now vital for the future of deep discovery
NCI's integrated high-performance infrastructure (diagram):
• Internet, link to Huxley DC, NCI data movers, cloud
• Raijin login + data mover nodes; Raijin HPC compute (56 Gb FDR IB fabric)
• Persistent global parallel filesystems (10 GigE and 56 Gb FDR IB to /g/data): /g/data1 ~7.4 PB, /g/data2 ~6.75 PB, /g/data3 ~9 PB
• Raijin high-speed filesystems: /short 7.6 PB; /home, /system, /images, /apps
• Massdata archive (tape): 1.0 PB cache, 20 PB tape
NCI ACCESS Optimisation Project
Fujitsu–NCI Collaboration Agreement. The 3-year partnership supports two projects:
A) Optimisation of the Australian Community Climate and Earth System Simulator (ACCESS) model (Yr 1-3)
B) Advanced computational scaling tools and methods (Yr 2-3), plus non-ACCESS codes and next-generation hardware
NCI: Evans, Cheeseman, Roberts, Ward, Yang; BoM: Bowen, Pugh, Bermous, Freeman, Naughton, Wedd; CSIRO: Dix, Yan; Fujitsu: Nobes, T. Yamada
Model configurations by domain and year:
• Atmosphere – Yr 1: UM 8.2-4 (APS2), Global 25 km L70, Regional 12 km L70, City 1.5 km; Yr 2: UM 10.x (PS36), Global N768 L70 / 25 km L85 / 17 km L85, Regional 5 km L85; Yr 3: 33-year simulation (1979-), Global 60 km L85 / 25 km L85 / 17 km L85
• Data assimilation – Yr 1: 4D-Var v30, N216 L70 and N320 L70; later years: BODAS, PECDAS, EnKF-C, …
• Ocean – MOM5 (inc. OFAM3) and MOM5 + CICE at 1/10 degree and 1/4 degree; Yr 3: MOM + CICE 33-year simulation (1979-)
• Coupled system – Yr 1: ACCESS-CM (UM 8.3 + MOM + OASIS-MCT), Global N96 L85, 1/4-degree ocean; Yr 2: ACCESS-CM + UKCA and carbon cycle; Yr 3: ACCESS-CM2
• Seasonal – GC2, then GC3
Additional domains (Yr 2-3):
• Wave – WaveWatch3; ROMS
• Land – CABLE (stand-alone? DA? UKMO coupled JULES interface)
• ACCESS-TC – ??
Evaluate:
• Software stack configuration: OpenMPI vs Intel MPI
• Intel compilers
• I/O
Also:
• Record configurations with Rose/Cylc
• Upgrade all ACCESS model codes on Raijin to the latest release: UM 10.x
And:
• Joint leadership with UKMO on collaborative profiling and future optimisations (Evans + Selwood)
UKMO PS36 UM 10.x N768 L70 on NCI
• Latest release of the Atmosphere model (ENDGame dynamics)
• Intel 15.0.3.187 compilers
• Intel MPI 5.0.2.044 with DAPL 2.1.5 vs OpenMPI 1.8.5+
• Careful code profiling and domain decompositions (see the sketch below)
• Review of I/O
• Various little tricks:
  – MPI tuning
  – Hyperthreading and affinity settings
  – Pre-processor branch patch reported from an IBM version
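The decomposition comparisons above can be pre-screened before submitting jobs. Below is a minimal Python sketch, not part of the UM itself; the grid-point counts used for the N768-like grid and the aspect-ratio filter are illustrative assumptions.

```python
# Hypothetical helper: list balanced (nx, ny) MPI decompositions for a core count.
def candidate_decompositions(ntasks, nx_points, ny_points):
    """Yield (nx, ny) process grids with nx * ny == ntasks and the local tile size."""
    for nx in range(1, ntasks + 1):
        if ntasks % nx:
            continue
        ny = ntasks // nx
        # Approximate local tile size; halo-exchange cost grows as tiles get thin.
        yield nx, ny, (nx_points // nx, ny_points // ny)

# Example: screen decompositions for a 1008-task run on an N768-like grid.
for nx, ny, tile in candidate_decompositions(1008, 1536, 1152):
    if 0.5 <= nx / ny <= 2.0:          # keep reasonably square process grids
        print(f"{nx:4d} x {ny:<4d} -> local tile {tile}")
```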
Outcomes:
• Dramatically improved scalability for an anticipated APS3 benchmark case (a 10-day forecast job), using a baseline resource configuration of 3552 cores
• Performance gain of 40%
• NCI redesign of the I/O server to use an MPI-IO approach, accepted by UKMO as the better approach; general release by UKMO anticipated in UM 10.4 (a minimal illustration of the idea follows)
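As a rough illustration of the MPI-IO idea (collective writes to a shared file rather than funnelling data through dedicated writer ranks), here is a hedged mpi4py sketch; it is not the UM I/O server code, and the array size and filename are placeholders.

```python
# Minimal MPI-IO sketch: every rank writes its slice of a field into one
# shared file with a collective call. Run with: mpirun -n <N> python write_field.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

nlocal = 1_000_000                         # points owned by this rank (illustrative)
field = np.full(nlocal, rank, dtype='f8')  # stand-in for a model field slice

fh = MPI.File.Open(comm, "field.dat", MPI.MODE_WRONLY | MPI.MODE_CREATE)
offset = rank * nlocal * field.dtype.itemsize
fh.Write_at_all(offset, field)             # collective write at this rank's offset
fh.Close()
```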
APS2-G on the Bureau system
• UM 8.2 N512 L70 (New Dynamics)
• Upgraded to the benchmark software stack: Intel 12 + OpenMPI 1.6.5 -> Intel 14.0.1.106 + OpenMPI 1.7.4
• Horizontal model decomposition of compute of (28 x 36) = 1008 cores
• The UM asynchronous I/O feature was enabled
• OpenMP was enabled on the main program file and the io_services library (including the C98_1A pre-processor setting)
• Used partially committed nodes
Resulted in approximately 40% improvement in performance
APS2-R regional 12 km forecast on the Bureau system
• UM 8.4 (New Dynamics)
• Operational constraint of 1200 cores
• Horizontal model decomposition of compute cores, with the UM asynchronous I/O feature enabled:
  – 2 OpenMP threads, 2 I/O server groups and 9 I/O servers
  – Horizontal decomposition of 16 x 36
• Intel Hyperthreading enabled
• Improved handling of the 2 lateral boundary conditions, allowing nested regional models to run earlier
• Similar changes for APS2-C
4DVar v30.0.0, N216 L70 and N320 L70
• Evaluate performance for known runs
Outcomes
• N216 – 30-50% improved performance and memory bandwidth; good scaling to 384 MPI tasks with an appropriate X vs Y decomposition
• N320 – 100% improved performance; scaling to 3072 cores
• Hyperthreading enabled and nodes under-committed
• OpenMP broken, poor cache use, poor MPI collectives
• Wait for the new 4DVar release...
MOM5-SIS1 0.25-degree changes and outcomes
• Flux exchange changed to use MPI Alltoall collectives (see the sketch below)
• Now runs on both OpenMPI and Intel MPI
• Evaluate MOM vs SIS concurrency performance gains – e.g. 3840 MOM + 640 SIS cores
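The Alltoall change can be pictured with a small mpi4py sketch: each rank exchanges one block of flux values with every other rank in a single collective, instead of many point-to-point messages. This is an illustrative stand-in, not the MOM5 code; the block size is arbitrary.

```python
# Minimal MPI_Alltoall sketch of an all-ranks flux exchange.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()

block = 4                                          # flux values sent to each rank
send = np.arange(size * block, dtype='f8') + rank * 1000.0
recv = np.empty_like(send)

comm.Alltoall(send, recv)                          # one collective replaces many sends/recvs
print(rank, recv[:block])                          # block received from rank 0
```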
Outcomes
• MOM-SIS coupled model at 25 years per day at high efficiency, and 45 years per day at an acceptable level
• 3x performance improvement for NCI
• GFDL adopted our changes – a 12,228-core job on Gaea shows a 3x performance improvement
MOM-SIS 0.1-degree global
• Identify MPI communication issues
• Remove redundant code for water and tracer fluxes into land cells
• Fix code crashes
Outcome
• Now scaling to at least 20,000 cores of Raijin
  – Raijin is too small to make this routinely effective
  – 10,000 cores are effective
Coupled Climate CM2 – configurations evaluated:
• A96 – high-resolution (0.25°) ocean and sea-ice models with a low-resolution (N96) atmosphere
• A216 – the same configuration, with a high-resolution (N216) atmosphere
Outcomes
• 3x performance improvement for ACCESS CMIP5 runs, to 6.5 years/day
• Improved memory utilisation; fixed several bugs that caused crashes
• Addressed a CICE bottleneck: 2x CICE performance improvement
• Further UM bottlenecks identified in A216, but not yet deeply explored
Seasonal Climate GC2
• Implemented a comprehensive Rose/Cylc suite
• Intel MPI vs OpenMPI
• Automated NCI profiling enabled
• Enabled easy decomposition modification
• Rectified a buffer overflow
• OASIS3 coupler library change to allow the UM to run in threaded mode
• I/O server now enabled
• Address the balance of cores (see the sketch below):
  – UM (1 task per 2 cores) + OASIS3 + NEMO (1 task per core)
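The core-balance bookkeeping can be sketched as a small calculation; the split of cores between the UM and the ocean side below is an illustrative assumption, not the tuned GC2 configuration.

```python
# Hypothetical helper: count MPI tasks for a GC2-style core split where the UM
# runs 1 task per 2 cores (room for 2 OpenMP threads) and OASIS3/NEMO run
# 1 task per core.
def gc2_layout(total_cores, um_cores):
    um_tasks = um_cores // 2               # UM: 1 MPI task per 2 cores
    ocean_tasks = total_cores - um_cores   # OASIS3 + NEMO: 1 task per core
    return {"UM tasks": um_tasks, "UM threads per task": 2,
            "OASIS3+NEMO tasks": ocean_tasks, "total cores": total_cores}

print(gc2_layout(total_cores=848, um_cores=512))   # numbers are illustrative
```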
[Figure: GC2 model scaling for a 5-model-day run – speedup (0-9) vs number of cores (0-3000), actual vs ideal scaling]
• Scaling improved from 320 cores to 848 cores with reasonable efficiency
• The code does now scale, but efficiency remains a question (a helper for reading such curves is sketched below)
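Speedup and parallel efficiency relative to the smallest run are what the curve above summarises. The sketch below shows that calculation; the timings are placeholders, not measured GC2 numbers.

```python
# Compute speedup and parallel efficiency relative to the smallest core count.
def scaling(core_counts, runtimes):
    base_cores, base_time = core_counts[0], runtimes[0]
    for n, t in zip(core_counts, runtimes):
        speedup = base_time / t
        efficiency = speedup / (n / base_cores)
        print(f"{n:5d} cores  speedup {speedup:5.2f}  efficiency {efficiency:6.1%}")

scaling([320, 640, 848, 1696], [100.0, 55.0, 44.0, 30.0])   # placeholder timings
```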
NCI National Environmental Research Data Collections
1. Climate/ESS model assets and data products
2. Earth and marine observations and data products
3. Geoscience collections
4. Terrestrial ecosystems collections
5. Water management and hydrology collections

Data collections and approximate capacity:
• CMIP5, CORDEX – ~3 Pbytes
• ACCESS products – 2.4 Pbytes
• LANDSAT, MODIS, VIIRS, AVHRR, INSAR, MERIS – 1.5 Pbytes
• Digital elevation, bathymetry, onshore geophysics – 700 Tbytes
• Seasonal climate – 700 Tbytes
• Bureau of Meteorology observations – 350 Tbytes
• Bureau of Meteorology ocean-marine – 350 Tbytes
• Terrestrial ecosystem – 290 Tbytes
• Reanalysis products – 100 Tbytes
NERDIP: enabling multiple ways to interact with data
Tools, Virtual Laboratories (VLs) and portals: Open Nav Surface, Globe Claritas, AGDC VL, Climate & Weather Systems Lab, Digital Bathymetry & Elevation Portal, data.gov.au, All Sky Virtual Observatory, eReefs, Biodiversity & Climate Change VL, eMAST/Speddexes, VGL, VHIRL, visualisation (Drishti, Voluminous), ANDS RDA Portal, AODN/IMOS Portal, TERN Portal, AuScope Portal
Languages and tools: Fortran, C, C++, MPI, OpenMP, Python, R, MatLab, IDL; Ferret, CDO, NCL, NCO, GDL, GDAL, GrADS, GRASS, QGIS
NCI National Environmental Research Data Interoperability Platform (NERDIP) layers:
• Services layer (exposes data model & semantics): OpenDAP, OGC WMS, WFS, WCS, WPS, SOS, RDF/Linked Data, direct access, fast "whole-of-library" catalogue
• Metadata layer: ISO 19115, RIF-CS, DCAT, etc.
• Data library layer 1: netCDF-CF
• HP data library layer 2: NetCDF-4 Climate/Weather/Ocean, NetCDF-4 EO, HDF-EOS, libgdal EO, [FITS], [SEG-Y], BAG, LAS LiDAR, airborne geophysics line data
• Storage: HDF5 MPI-enabled on Lustre, HDF5 serial, other storage options
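From a user's point of view, the services layer means data can be reached programmatically without copying files. A minimal sketch, assuming netCDF4-python built with DAP support; the URL and variable name are placeholders, not a real NCI endpoint.

```python
# Open a dataset over OPeNDAP and read a small subset without downloading the file.
from netCDF4 import Dataset

url = "https://example-thredds.nci.org.au/thredds/dodsC/some/dataset.nc"  # placeholder
with Dataset(url) as ds:
    var = ds.variables["tas"]      # variable name is illustrative
    subset = var[0, :10, :10]      # only this slice is transferred
    print(subset.shape)
```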
[Diagram: Australian Government and national research data portals and access. Research & development and government operational portals, with standards compliance (NEII, ISO/OGC, AGIMO, Gov 2.0, CSSDP, NAMF, NSS, AGLS). Contributors include the Bureau of Meteorology, Geoscience Australia, CSIRO, state and territory agencies, Dept. of Defence, AAD, AIMS and NZ. Portals include the Aust. Govt. Online Service Point, Geoscience Portal (Govt Geoscience Info. Committee, GGIC), ANZLIC Spatial Information Council, Aust Water Resources Information System, Australian Spatial Data Directory, Aust. Ocean Data Centre Joint Facility (AODCJF) / Australian Ocean Data Network, Atlas of Living Australia, TERN, AuScope Portal and the Australian Research Data Commons. NCRIS components: data management (ANDS, NCI, RDSI); data integration (Atlas of Living Australia, eMAST, BCCVL, Aust Phenomics Network, CWSLab, AuScope Grid/SISS/ARSDC, IMOS eMII/MACDDAP); other components (AAF, AARNet); data generation (Aust. Plant Phenomics Facility; AuScope VCL, Geospatial, SAM, Earth Imaging, Earth Composition, Groundwater; IMOS ARGO, SOOP, SOTS, ANFOG, AUV, ANMN, AATAMS, FAIMMS, SRS); CRC for Spatial Information (Australian Spatial Consortium, ASIBA, SSI, PSMA, 43 Pty Ltd)]
Earth System Grid Federation: Exemplar of an International Collaboratory for large scientific data and analysis
E.g. Australian Geophysics Data Collection (hierarchy):
• Collection – national grids: gravity, magnetics, radiometrics
• Theme – gravity, magnetics, radiometrics, AEM, seismic, MT, seismology
• Survey – Survey 1 … Survey n
• Grids – resolutions from 10 m to 80 m (10, 20, 30, 40, 50, 80 m)
• Lines and points
Magnetics map of Australia, 2015
Variable Reduction to the Pole (produced using GA codes at NCI), 80 x 80 m resolution. Courtesy of A. Nakamura and P. Milligan.
Inversion model – Gawler Craton, South Australia: magnetic inversion
• 1500 km x 1700 km domain (4 km cell size), ~8 million cells
• Inversion result took 9 hours to run using 128 CPUs at NCI
• Horizontal section at Z = -10000 mRL; magnetic susceptibility (SI) colour scale from 0 to 0.08
Courtesy of the Geological Survey of South Australia and GA
Oil and gas examples driving HPC (c/- Peter Breunig, Chevron)
1. Seismic data processing – current imaging/modelling drives compute cycles:
  – 2002 – 1,000 Gflop/s – Kirchhoff migration
  – 2004 – 10,000 Gflop/s – wave-equation migration
  – 2010 – 150,000 Gflop/s – reverse time migration
  – 2014 – 1,500,000 Gflop/s (1.5 Pflop/s) – acoustic full-wavefield inversion
  Future:
  – 3D elastic anisotropic modelling, and reverse time migration & imaging with multiples
  – 3D full-wavefield (constrained) inversion – normal; elastic 5x; visco-elastic 50x...
  – Iterative wavefield modelling for stochastic inversion
2. Sensor integration – there is a long-term unsatisfied desire to model integrated facilities and reservoirs in near real time, leveraging those sensors
3. Reservoir simulation – improved resolution within the reservoir is critical because:
• Deepwater wells are costly
• Fully exploiting existing assets is essential
(c/- Majdi Baddourah, Saudi Aramco)
Move from current structured models of up to a billion cells to unstructured grids of multi-billion cells
• Over 30,000 cores
• Specialised visualisation