Portland State University. Department of Computer Science
Karen L. Karavanic
Master of Science (M.S.) in Computer Science
1 online resource (vii, 90 pages)
Optimizing the performance of scientific applications in HPC environments is a complicated task that has motivated the development of many performance analysis tools over the past decades. These tools were designed to analyze the performance of a single parallel code using common approaches such as message passing (MPI), multithreading (OpenMP), acceleration (CUDA), or a hybrid approach. However, current trends in HPC, such as the push to exascale, convergence with Big Data, and the growing complexity of HPC applications and scientific workflows, have created gaps that these performance tools do not cover, particularly in tracking end-to-end data movement through an HPC workflow comprising multiple codes, paradigms, or platforms.
To address this performance monitoring gap, we define a new metric called Workflow Critical Path (WCP), a data-oriented critical path metric for Holistic HPC Workflows. Using cloud-based technologies, we implement a prototype called Crux, a distributed analysis tool for calculating and visualizing WCP. Crux takes a novel, data-oriented approach by constructing program activity graphs (PAGs) using data states as vertices and data mutations as edges. Our experiments with a workflow simulator on Amazon Web Services show that Crux is scalable and capable of calculating WCP for common Holistic HPC workflow patterns. We discuss how Crux and WCP could be used with production HPC applications.
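The data-oriented graph described above can be illustrated with a minimal sketch. It is not Crux's implementation. It assumes that a PAG is a directed acyclic graph whose vertices are data states and whose edges are data mutations weighted by their duration, and that the critical path is the longest weighted path through that graph. The function name `workflow_critical_path`, the state names, and the durations are all hypothetical.

```python
from collections import defaultdict


def workflow_critical_path(mutations):
    """Return (total_duration, path) for the longest path in a DAG.

    `mutations` is a list of (source_state, target_state, duration)
    tuples: data states are vertices, data mutations are edges.
    Assumption: critical path = longest duration-weighted path.
    """
    graph = defaultdict(list)
    indegree = defaultdict(int)
    states = set()
    for src, dst, duration in mutations:
        graph[src].append((dst, duration))
        indegree[dst] += 1
        states.update((src, dst))
    if not states:
        return 0.0, []

    # Visit states in topological order (Kahn's algorithm) so that the
    # longest distance to a state is final before its edges are relaxed.
    ready = sorted(s for s in states if indegree[s] == 0)
    longest = {s: 0.0 for s in states}
    predecessor = {}
    visited = 0
    while ready:
        state = ready.pop()
        visited += 1
        for dst, duration in graph[state]:
            if longest[state] + duration > longest[dst]:
                longest[dst] = longest[state] + duration
                predecessor[dst] = state
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)
    if visited != len(states):
        raise ValueError("mutation graph contains a cycle")

    # Walk back from the state with the greatest accumulated duration.
    end = max(states, key=lambda s: longest[s])
    path = [end]
    while path[-1] in predecessor:
        path.append(predecessor[path[-1]])
    path.reverse()
    return longest[end], path


# Hypothetical workflow: two branches transform "raw" data and rejoin.
example = [
    ("raw", "filtered", 4.0),
    ("raw", "indexed", 1.0),
    ("filtered", "merged", 2.0),
    ("indexed", "merged", 3.0),
    ("merged", "plot", 1.5),
]
total, path = workflow_critical_path(example)
print(total, path)
```

In this toy workflow the branch through "filtered" takes longer than the branch through "indexed", so it determines the end-to-end time. A distributed tool would additionally have to collect mutation events from multiple codes and platforms, which this sketch does not attempt.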
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/ This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
Nguyen, Daniel D., "Workflow Critical Path: A Data-Oriented Critical Path Metric for Holistic HPC Workflows" (2020). Dissertations and Theses. Paper 5495.