Author

Jinhui Qin

Date of Award

2008

Degree Type

Thesis

Degree Name

Doctor of Philosophy

Program

Computer Science

Supervisor

Dr. Michael Bauer

Abstract

To more effectively use a network of high performance computing clusters, allocating multi-process jobs across multiple connected clusters becomes an attractive possibility. This allocation process entails dividing the processes of a job among several clusters which we refer to as co-allocation. Co-allocation offers the possibility of more efficient use of computer resources, reduced turn-around time and computations using numbers of processors larger than processors on any single cluster. In order to realize these possibilities, effective co-allocation, ultimately, depends on the inter-cluster communication cost. In this thesis, we introduce a scalable co-allocation strategy called the Maximum Bandwidth Adjacent cluster Set (MBAS) strategy. The strategy makes use of two thresholds to control allocation: one to control the bandwidth levels on inter-cluster communication links and another to control how jobs are split. To evaluate the performance of the proposed strategy, a simulator that can simulate the dynamic behavior of jobs running across multiple clusters has also been developed and validated in this research. The simulation results indicate that by adjusting the thresholds for link saturation level control and chunk size control in splitting jobs, the MBAS co-allocation strategy can significantly improve both users’ satisfaction and system utilization. However, the situation is more complicated in reality as the mix of communication patterns can vary. Being able to dynamically adjust the thresholds may provide a more effective approach to co-allocation. In the thesis we introduce the Adaptive Threshold Control System (ATCS). Based on fuzzy logic, ATCS can adjust the thresholds dynamically according to system states and jobs’ characteristics. The simulation results suggest that using ATCS during MBAS job co-allocation the overall performance can be improved further than by just using static thresholds. Moreover, this improvement is much more tolerant to the changes of job communication requirements; while this is a problem for using static thresholds. In addition, ATCS provides the flexibility to enable a system to be tuned to achieve a more expressive co-allocation control in practice.

