HPC, Big Data, and AI Convergence Towards Exascale: Challenge and Vision [1 ed.] 9781032009841, 9781032009919, 9781003176664, 1032009845

HPC, Big Data, AI Convergence Towards Exascale provides an updated vision on the most advanced computing, storage, and i

Language: English; Pages: 294 [323]; Year: 2022

Table of contents:
Cover
Half Title
Title Page
Copyright Page
Table of Contents
Foreword
Foreword
Preface
Preface
Acknowledgments
Editors
Contributors
1 Toward the Convergence of High-Performance Computing, Cloud, and Big Data Domains
1.1 Introduction
1.1.1 History of Cloud Computing
1.1.2 History of HPC
1.1.3 Evolution of Big Data
1.1.4 Evolution of Big Data Storage and Tools
1.2 Exploiting Convergence
1.2.1 CYBELE Project
1.2.2 DeepHealth Project
1.2.3 EVOLVE Project
1.2.4 LEXIS Project
Acknowledgment
References
2 The LEXIS Platform for Distributed Workflow Execution and Data Management
2.1 Motivation
2.2 Architecture (Codesign) and Interfaces
2.3 Security
2.4 Accounting and Billing
2.5 Easy Access to HPC/Cloud Through a Specialized Web Portal
2.6 Market Analysis
2.6.1 LEXIS Project Impact
Acknowledgment
References
3 Enabling the HPC and Artificial Intelligence Cross-Stack Convergence at the Exascale Level
3.1 Introduction
3.2 The Rise of Convergent Infrastructures
3.3 The ACROSS Approach to the HPC, Big Data, and AI Convergence
3.3.1 Heterogeneous Infrastructural Support
3.3.2 The Management of the Convergent Platform
3.4 Related Works
3.5 Conclusions
Acknowledgment
Notes
Bibliography
4 Data System and Data Management in a Federation of HPC/Cloud Centers
4.1 Introduction: Data Federation of European HPC/Cloud Centers
4.2 Requirements On the LEXIS DDI
4.2.1 Unified Data Access
4.2.2 Usage and Federation of Diverse Data Backend Systems
4.2.3 Reliability and Redundancy
4.2.4 AAI Support
4.2.5 APIs
4.2.6 State-Of-The-Art Research Data Management
4.3 Federation Via a DDI Based On iRODS
4.3.1 Relevant Basic Properties of iRODS
4.3.2 iRODS HA Setup
4.3.3 iRODS Zones Federation Across Centers and Data Movement
4.3.4 Storage Tiering and Underlying Data Storage
4.3.5 Logical Structure of the DDI
4.4 Hardware
4.4.1 Storage Systems for HPC and Infrastructure-as-a-Service Cloud Clusters
4.4.2 Storage Systems Dedicated to LEXIS
4.4.3 HPC–Cloud-Storage Interconnect and Data Node/Burst Buffer Concept
4.4.3.1 SBF (Smart Bunch of Flash)
4.4.3.2 SBB
4.5 Unified Access to the Platform Based On an AAI
4.5.1 LEXIS Identity and Access Management (IAM) Solution, SSO, and AAI
4.5.2 Platform Services Vs. AAI: Separation of Concerns
4.5.3 LEXIS DDI and IAM/AAI System
4.6 Data Management Via APIs
4.6.1 Data Search, Upload, and Download APIs
4.6.2 Staging API
4.6.3 Replication and PID Assignment API
4.6.4 Helper APIs
4.6.5 Compression/Decompression/Encryption/Decryption API
4.7 Integration With EUDAT Services
4.7.1 EUDAT B2HANDLE
4.7.2 EUDAT B2SAFE
4.7.3 EUDAT B2STAGE
4.8 Conclusion
Acknowledgment
References
5 Distributed HPC Resources Orchestration for Supporting Large-Scale Workflow Execution
5.1 Introduction
5.2 Federated Execution Platforms
5.3 WMSs and Implementation in LEXIS
5.3.1 Dynamic Workflow Orchestration
5.3.2 Resource Management Metrics
5.4 Workflow Data Management
5.5 LEXIS Pilot Use Cases and Orchestration
5.6 Related Works
5.7 Conclusion
Acknowledgment
Notes
Bibliography
6 Advanced Engineering Platform Supporting CFD Simulations of Aeronautical Engine Critical Parts
6.1 Introduction: Background and LEXIS Aeronautics Pilot
6.2 Engineering Case Studies in the LEXIS Aeronautics Pilot
6.3 The Turbomachinery Case Study
6.3.1 Engineering Context
6.3.2 Digital Technology Deployment
6.3.2.1 Application Workflow
6.3.2.2 Main Application Software and HW Resources
6.3.3 First Results
6.3.4 Benefit–Cost Analysis of HW Acceleration
6.3.5 Next Steps
6.4 The Rotating Parts Case Study
6.4.1 Engineering Context
6.4.2 Digital Technology Deployment
6.4.2.1 Application Workflow
6.4.2.2 Main Application Software and HW Resources
6.4.3 First Results
6.4.3.1 SPH Liquid-Phase Simulation
6.4.3.2 SPH Gas-Phase Simulation
6.4.4 Next Steps
6.5 Final Remarks
Acknowledgment
Notes
References
7 Event-Driven, Time-Constrained Workflows: An Earthquake and Tsunami Pilot
7.1 Introduction
7.2 Event-Driven, Time-Constrained Workflows
7.2.1 Requirements
7.2.2 Background
7.2.3 Overall View of the Workflow
7.3 Workflow Components
7.3.1 Shakemap and Exposure Dataset
7.3.2 Tsunami Simulations
7.3.3 SEM
7.4 Technological Layers
7.4.1 Technology Layer 1: Orchestration
7.4.2 Technology Layer 2: Heterogeneous Compute
7.4.3 Technology Layer 3: Data
7.5 Conclusion
Note
References
8 Exploitation of Multiple Model Layers Within LEXIS Weather and Climate Pilot: An HPC-Based Approach
8.1 Introduction: Background and Driving Forces
8.2 The Weather and Climate Pilot
8.3 Observational Data
8.4 LEXIS DDI and Weather and Climate Data API
8.5 LEXIS Orchestration System
8.6 Weather and Climate Pilot Workflows
8.6.1 WRF–ERDS Workflow Examples
8.7 Conclusion
Acknowledgment
References
9 Data Convergence for High-Performance Cloud
9.1 Introduction
9.2 Motivations
9.3 Design and Implementation
9.4 Karvdash
9.5 DataShim
9.5.1 Overview
9.5.2 Dataset Custom Resource Definition
9.5.3 DatasetInternal Custom Resource Definition
9.5.4 DataShim Operator and Admission Controller
9.5.5 Caching Plugin
9.5.6 Objects Caching On Ceph
9.5.7 Ceph-Based Caching Plugin Implementation
9.5.8 Evaluation of the Ceph-Based Caching Plugin
9.6 H3
9.6.1 Overview
9.6.2 Data and Metadata Organization
9.6.3 The H3 Ecosystem
9.7 Integration
9.8 Related Work
9.9 Conclusions
Note
References
10 The DeepHealth HPC Infrastructure: Leveraging Heterogeneous HPC and Cloud-Computing Infrastructures for AI-Based Medical Solutions
10.1 Introduction
10.2 The Parallel Execution of EDDL Operations
10.2.1 COMPSs
10.2.2 StreamFlow
10.3 Cloud Infrastructures
10.3.1 Hybrid Cloud
10.3.2 Parallel Execution On Cloud Environments
10.3.2.1 Parallel Cloud Execution Based On COMPSs
10.3.2.2 Parallel Cloud Execution Based On StreamFlow
10.4 Acceleration Devices: GPU and FPGAs
10.4.1 FPGA Acceleration
10.4.1.1 The DeepHealth FPGA Infrastructure
10.4.1.2 An Optimized FPGA Board Design for DL
10.4.1.3 FPGA-Based Algorithms
10.4.2 Many Core and GPU Acceleration
10.5 Conclusions
Notes
11 Applications of AI and HPC in the Health Domain
11.1 Introduction
11.2 AI and HPC in the Health Domain in 2020
11.3 DeepHealth Concept
11.4 DeepHealth Use Cases
11.5 Use of HPC and Cloud in Medical Pilots
11.5.1 UC2 – UNITOPatho
11.5.2 UC3 – UNITOBrain
11.5.3 UC4 – Chest
11.5.4 UC5 – UNITO Deep Image Annotation
11.5.5 UC12 – Skin Cancer Melanoma Detection
11.6 DeepHealth Value Proposition
11.7 Conclusions
Notes
12 CYBELE: On the Convergence of HPC, Big Data Services, and AI Technologies
12.1 Introduction: Background and Driving Forces
12.2 Identified Gaps: Motivating the CYBELE Vision
12.3 Materializing the Solution: Convergence of HPC, Big Data, and AI
12.3.1 Data and Infrastructure Access Security Layer
12.3.2 Embedded Experiments Composition Layer
12.3.3 Parallel and Distributed Execution Management Layer
12.3.4 Data Services Layer
12.3.5 Visualization and Reporting Layer
12.4 Key Takeaways and Conclusions
Note
References
13 CYBELE: A Hybrid Architecture of HPC and Big Data for AI Applications in Agriculture
13.1 Introduction: Vision and Challenges
13.2 Background
13.2.1 AI in Big Data Analytics On Cloud
13.2.2 AI On HPC Systems
13.3 Hybrid Big Data and HPC Resource for AI Applications in CYBELE
13.4 Parallelization and Deployment of AI Applications On HPC Systems
13.4.1 Pilot Soybean Farming
13.4.1.1 Pilot Description
13.4.1.2 Application Parallelization for HPC Systems
13.4.2 Pilot Wheat Ear
13.4.2.1 Pilot Description
13.4.2.2 Application Parallelization for HPC Systems
13.5 Performance Evaluation for Pilot Soybean Farming and Pilot Wheat Ear
13.6 Discussion
13.7 Concluding Remarks and Future Work
Acknowledgments
Notes
References
14 European Processor Initiative: Europe’s Approach to Exascale Computing
14.1 Introduction
14.2 European Processor Initiative
14.2.1 Global Technical Panstream
14.2.2 GPP Stream
14.2.3 Accelerator Stream
14.2.4 Automotive Stream
14.3 Conclusion
Acknowledgment
This Work Has Received Funding From the European Union’s Horizon 2020 Research and Innovation Programme “European Processor Initiative (EPI)” Under Grant Agreement No 826647.
Bibliography
Index
