The explosion in ESM workflow technology that has characterized the late 20th and early 21st centuries comes with a commensurate explosion in complexity and potential points of failure. Nevertheless, the modern practice of earth system science needs the models to run around the clock, and needs model output continuously transformed into the products scientists use in their day-to-day research. To meet this need, GFDL and other research labs the world over have pursued ever-increasing levels of robust, resilient workflow automation. But even after decades of work, there is much left to do and more complexity on the way.

The lab’s pursuit of increasing model resolutions will drive vast increases in data volumes. Already, some of our high-resolution models with heavy diagnostic output can choke the post-processing workflow. In the near future, new levels of the storage hierarchy, such as non-volatile memory and flash-based file systems, offer the potential for in-flight post-processing. But all of this comes at the cost of even more complexity and more potential points of failure.

To meet the demands of data processing in the exascale era, GFDL is exploring both hardware and software technologies. This talk will focus on the technology of a flash-based file system by Vast Data and on our initial experiences with it. Further, we will present some initial results from a software infrastructure that gathers detailed job performance data into a database, supporting per-job, per-experiment, and cluster-wide analytics.
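
As a rough illustration of the kind of per-job, per-experiment, and cluster-wide analytics described above, the following is a minimal sketch using Python's standard sqlite3 module. It is not the GFDL infrastructure: the table name job_perf, its columns, and the sample records are all hypothetical and chosen only for the example.

```python
import sqlite3

# Hypothetical sample records: (job_id, experiment, wall_seconds, bytes_written).
# These values are invented for illustration and are not real measurements.
SAMPLE = [
    ("job-1", "exp-a", 120.0, 1000),
    ("job-2", "exp-a", 80.0, 3000),
    ("job-3", "exp-b", 50.0, 500),
]


def build_job_db(records):
    """Create an in-memory database and load one row per job."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE job_perf (
               job_id        TEXT PRIMARY KEY,
               experiment    TEXT NOT NULL,
               wall_seconds  REAL NOT NULL,
               bytes_written INTEGER NOT NULL
           )"""
    )
    conn.executemany("INSERT INTO job_perf VALUES (?, ?, ?, ?)", records)
    conn.commit()
    return conn


def per_job(conn, job_id):
    """Return (experiment, wall_seconds, bytes_written) for a single job."""
    return conn.execute(
        "SELECT experiment, wall_seconds, bytes_written "
        "FROM job_perf WHERE job_id = ?",
        (job_id,),
    ).fetchone()


def per_experiment(conn):
    """Return (experiment, job count, total seconds, total bytes) per experiment."""
    return conn.execute(
        "SELECT experiment, COUNT(*), SUM(wall_seconds), SUM(bytes_written) "
        "FROM job_perf GROUP BY experiment ORDER BY experiment"
    ).fetchall()


def cluster_wide(conn):
    """Return (job count, total seconds, total bytes) across all jobs."""
    return conn.execute(
        "SELECT COUNT(*), SUM(wall_seconds), SUM(bytes_written) FROM job_perf"
    ).fetchone()


if __name__ == "__main__":
    conn = build_job_db(SAMPLE)
    print(per_job(conn, "job-3"))
    print(per_experiment(conn))
    print(cluster_wide(conn))
```

Keeping one row per job means the three levels of analysis differ only in how rows are grouped, so the same table can serve all of them.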