Sponsor
Portland State University. Department of Computer Science
First Advisor
Su-Hui Chiang
Term of Graduation
Fall 2007
Date of Publication
12-4-2007
Document Type
Dissertation
Degree Name
Doctor of Philosophy (Ph.D.) in Computer Science
Department
Computer Science
Language
English
Subjects
Parallel processing (Electronic computers)
DOI
10.15760/etd.7938
Physical Description
1 online resource (2, ix, 140 pages)
Abstract
System administrators for parallel computers face many difficulties when managing job scheduling systems. First, current production job schedulers use many parameters, which seem flexible but it is highly challenging to configure and tune these parameters. Second, fair share is an important scheduling goal, but it is not clear what kind of fair share can be expected under current schedulers and how fair share impacts scheduling performance. Third, several job runtime prediction methods were proposed to improve inaccurate user-estimated runtimes, but these methods could under-estimate runtimes by a large amount and it is not clear whether they are practical for use on real systems. To address these issues, we study existing scheduling policies and design new policies. We evaluate policy performance by event-driven simulation, using real job traces.
To simplify the system administration task, we propose a new scheduling framework, which allows the system administrators to specify only high-level objectives, while the scheduler automatically decides the schedules according to the given objectives and adapts to workload changes. We investigate several design and implementation choices of the goal-oriented policies. We show that by optimizing performance for objectives, goal-oriented policies have the potential to considerably improve the performance.
To provide a better understanding of fair share policies supported by current production schedulers and their impact on scheduling performance, we evaluate two classes of fair share policies using a wide range of performance measures and several fair share measures proposed in this thesis. Our evaluation results show that fair share indeed reduces heavy-demand users from dominating system resources. However, our detailed per-user performance results show that some types of users may suffer unfairness under fair share, possibly due to priority mechanisms used by the current schedulers.
As for runtime predictions, we find that using previous methods results in poor performance and unfairness problems, because of under-estimated runtimes induced by predictions. To reduce the problems, we investigate several alternative methods, including inflated each initial prediction by half of the requested runtime and two-class runtime estimates. We find that these alternative methods can outperform previous methods in most cases.
Rights
In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/
Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
Persistent Identifier
https://archives.pdx.edu/ds/psu/38127
Recommended Citation
Vasupongayya, Sangsuree, "Goal-Oriented Job Scheduling for Parallel Computer Systems" (2007). Dissertations and Theses. Paper 6068.
https://doi.org/10.15760/etd.7938
Comments
If you are the rightful copyright holder of this dissertation or thesis and wish to have it removed from the Open Access Collection, please submit a request to pdxscholar@pdx.edu and include clear identification of the work, preferably with URL.