Scheduling Large-Scale Scientific Workflow on Virtual Machines with Different Numbers of vCPUs
This document is the result of a research project funded by the Natural Science Foundation of Liaoning (No. 2019-MS-170), the Guangxi Key Laboratory of Hybrid Computation and IC Design Analysis Open Fund (HCIC201605), the Guangxi Youth Teacher Project (2018KY0976), and the Supercomputing Center of Dalian University of Technology.
The Journal of Supercomputing
With the wide deployment of cloud computing in scientific computing, cost minimization is increasingly critical for large-scale scientific workflows. Unfortunately, because directed acyclic graph (DAG)-based workflows are highly intricate and virtual machines (VMs) can be used flexibly on cloud platforms, existing workflow scheduling approaches struggle to strike a balance between the parallelism and the topology of a DAG-based workflow when using VMs, which leads to low VM utilization and higher cost. To address these issues, this paper presents a novel task scheduling framework, named cost minimization approach with the DAG splitting method (COMSE), for minimizing the cost of running a deadline-constrained large-scale scientific workflow. First, we provide a comprehensive theoretical analysis of how to improve the utilization of a resource-balanced multi-vCPU VM that runs multiple tasks simultaneously. Second, considering the balance between the parallelism and the topology of a workflow, we simplify the DAG-based workflow and, based on the simplified DAG, devise a DAG splitting method to preprocess the workflow. Third, since cloud instances are billed by the hour, we design an exact algorithm, named instance hours minimization by Dijkstra (TOID), that finds the optimal operation pattern for a given schedule so that the consumed instance hours are minimized. Finally, by employing the DAG splitting method and TOID, COMSE schedules a deadline-constrained large-scale scientific workflow on multi-vCPU VMs with two objectives: minimizing the computation cost and minimizing the communication cost. We evaluate the approach through a rigorous performance study using real-world workflows, and the results show that COMSE outperforms existing algorithms in terms of both computation cost and communication cost.
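To illustrate why hourly billing turns the choice of operation pattern into a shortest-path problem, the following is a minimal sketch under simplifying assumptions; it is not the paper's TOID algorithm, whose details are not given in this abstract. It assumes a single VM with a fixed, sorted list of non-overlapping busy periods (in minutes), that the VM may either stay on through an idle gap or be stopped and relaunched at the next busy period, and that each running session is billed in whole started hours. The function name `min_instance_hours` and the graph model (node i means "the first i busy periods are covered") are hypothetical.

```python
import heapq
import math


def min_instance_hours(busy_periods):
    """Return the fewest billed hours needed to cover all busy periods.

    busy_periods: sorted, non-overlapping (start_min, end_min) tuples.
    Assumption: one running session covering periods i..j-1 is billed
    ceil((end of period j-1 - start of period i) / 60) hours.
    """
    n = len(busy_periods)
    dist = [math.inf] * (n + 1)
    dist[0] = 0
    heap = [(0, 0)]  # (billed hours so far, number of periods covered)
    while heap:
        hours, i = heapq.heappop(heap)
        if hours > dist[i]:
            continue  # stale heap entry
        if i == n:
            break  # all busy periods are covered
        session_start = busy_periods[i][0]
        # Edge i -> j: keep the VM on from period i through period j-1.
        for j in range(i + 1, n + 1):
            session_end = busy_periods[j - 1][1]
            billed = (session_end - session_start + 59) // 60  # integer ceiling
            if hours + billed < dist[j]:
                dist[j] = hours + billed
                heapq.heappush(heap, (dist[j], j))
    return dist[n]


# Two short periods close together share one billed hour; the distant
# third period is cheaper to serve with a separate session.
print(min_instance_hours([(0, 15), (30, 45), (180, 210)]))
```

Because all edge weights are non-negative, Dijkstra's algorithm applies directly; since this particular graph is also acyclic, a simple dynamic program would give the same answer.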
Locate the Document
Wu, H., Chen, X., Song, X., Zhang, C., & Guo, H. (2020). Scheduling large-scale scientific workflow on virtual machines with different numbers of vCPUs. The Journal of Supercomputing, 1-32.