D. Lezzi, BSC (ES)
Abstract: Today, developers lack tools that enable the development of complex workflows combining HPC simulations and modeling with data analytics (DA) and machine learning (ML) in a single programming framework. Current environments are fragmented into multiple components that use different programming models and different processes for computation and data management. This talk presented the recent activities of the Workflows and Distributed Computing group at BSC to develop a workflow software stack and an additional set of services that enable the convergence of HPC with big data analytics (HPDA) and machine learning in scientific and industrial applications. This framework allows the development of innovative, adaptive and dynamic workflows that make efficient use of heterogeneous computing resources and also leverage novel storage solutions. An application of the framework to a biomolecular dynamics use case was also presented, showing how the tools allow developers to run very large executions that make efficient use of large supercomputers such as the pre-exascale machines expected to become available in the coming years.
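To make the idea of expressing HPC simulation, data analytics and ML in one programming framework concrete, a minimal sketch follows. The abstract does not name the specific tools, so the sketch assumes the task-based PyCOMPSs programming model developed by the same BSC group; the run_simulation, extract_features and train_model functions are purely illustrative placeholders, not taken from the talk.

    # Hypothetical converged workflow: HPC simulation + DA + ML in one
    # Python script, written against the PyCOMPSs task-based model.
    from pycompss.api.task import task
    from pycompss.api.api import compss_wait_on

    @task(returns=1)
    def run_simulation(seed):
        # Placeholder for an HPC simulation step (e.g. one molecular
        # dynamics run); returns a toy "trajectory" of values.
        return [(seed + i) * 0.1 for i in range(1000)]

    @task(returns=1)
    def extract_features(trajectory):
        # Data-analytics step: reduce raw simulation output to a feature.
        return sum(trajectory) / len(trajectory)

    @task(returns=1)
    def train_model(features):
        # Machine-learning step: fit a (toy) model on gathered features.
        return sum(features) / len(features)

    if __name__ == "__main__":
        # Tasks are spawned asynchronously; the runtime builds the task
        # dependency graph and schedules work on the available resources.
        partial = [extract_features(run_simulation(s)) for s in range(10)]
        features = compss_wait_on(partial)   # gather DA results
        model = compss_wait_on(train_model(features))
        print("model:", model)

The point of the sketch is that the three stages are plain Python functions in one script: the runtime, not the developer, handles task scheduling, data transfers and the mapping onto heterogeneous resources.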