First Advisor

Su-Hui Chiang

Term of Graduation

Fall 2007

Date of Publication

12-4-2007

Document Type

Dissertation

Degree Name

Doctor of Philosophy (Ph.D.) in Computer Science

Department

Computer Science

Language

English

Subjects

Parallel processing (Electronic computers)

DOI

10.15760/etd.7938

Physical Description

1 online resource (2, ix, 140 pages)

Abstract

System administrators for parallel computers face many difficulties when managing job scheduling systems. First, current production job schedulers use many parameters, which appear to offer flexibility but are highly challenging to configure and tune. Second, fair share is an important scheduling goal, but it is not clear what kind of fair share can be expected under current schedulers and how fair share impacts scheduling performance. Third, several job runtime prediction methods have been proposed to improve on inaccurate user-estimated runtimes, but these methods can under-estimate runtimes by a large amount, and it is not clear whether they are practical for use on real systems. To address these issues, we study existing scheduling policies and design new policies. We evaluate policy performance by event-driven simulation, using real job traces.

To simplify the system administration task, we propose a new scheduling framework, which allows the system administrators to specify only high-level objectives, while the scheduler automatically decides the schedules according to the given objectives and adapts to workload changes. We investigate several design and implementation choices of the goal-oriented policies. We show that by optimizing performance for objectives, goal-oriented policies have the potential to considerably improve the performance.

To provide a better understanding of fair share policies supported by current production schedulers and their impact on scheduling performance, we evaluate two classes of fair share policies using a wide range of performance measures and several fair share measures proposed in this thesis. Our evaluation results show that fair share indeed reduces the extent to which heavy-demand users dominate system resources. However, our detailed per-user performance results show that some types of users may suffer unfairness under fair share, possibly due to priority mechanisms used by the current schedulers.

As for runtime predictions, we find that using previous methods results in poor performance and unfairness problems, because of under-estimated runtimes induced by predictions. To reduce these problems, we investigate several alternative methods, including inflating each initial prediction by half of the requested runtime and using two-class runtime estimates. We find that these alternative methods can outperform previous methods in most cases.

Rights

In Copyright. URI: http://rightsstatements.org/vocab/InC/1.0/

This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

Comments

If you are the rightful copyright holder of this dissertation or thesis and wish to have it removed from the Open Access Collection, please submit a request to pdxscholar@pdx.edu and include clear identification of the work, preferably with URL.

Persistent Identifier

https://archives.pdx.edu/ds/psu/38127
