An Auction Mechanism For Grid Scheduling And Resource Allocation In The Context Of Atlas

Date

2009-09-16T18:19:31Z

Authors

Journal Title

Journal ISSN

Volume Title

Publisher

Computer Science & Engineering

Abstract

The technological advancements in the areas of computing and networking over the recent years have led to an emerging infrastructure known as Computational Grids, which provides users with the flexibility of pervasive access to enormous computational resources hosted at remote locations. Effective resource management and job scheduling poses a challenge when constraints such as resource utilization, response time, global and local policies need to be taken into account, while dealing with potentially independent sources of jobs, computational, and storage resources. It must be ensured that scheduling decisions made are still valid by the time a job is to be executed, with all the necessary resources remaining available.In order to provide a more accurate scheduling and to obtain a better balance between micro and macro goals some status information about the resources needs to be obtained. However, this brings up another controversial issue which has plagued all dynamic scheduling communities: at what resolution monitoring should be performed. Since jobs are constantly submitted throughout the grid and resources are used for processing such jobs, acquired monitoring information should be updated frequently. On the other hand, monitoring too frequently takes up valuable resources and bandwidth which could otherwise be used for job execution. Thus, another objective is to reach a balance between the risk of having outdated resource status information (which leads to incorrect scheduling decisions) and performing too much monitoring (wasting limited resources).In a conventional grid environment, such as the ATLAS project [‎40], system administrators are often required to monitor the activities of a selected group of preferred sites and submit jobs to those sites if deemed capable of processing such tasks. The sites chosen, however, may not necessarily be the best sites for processing the jobs. This results in underutilized resources and stagnant efficiency of sites as there is no global incentive for improving efficiency as well as remaining competitive. One of the main reasons for sub-optimal resource allocation is that when taking factors such as system resource utilization, response time, global and local policies into account while dealing with potentially independent sources of jobs, computational and storage resources, the job of managing resources and job scheduling becomes too tedious for a centralized entity to perform. On top of that, with the implementation of varying local policies, a centralized scheduling entity may not have access or control over such policies, hence it could only perform scheduling based on a best effort basis. It is often up to the job-receiving host to perform the final leg of scheduling, based on its locally defined policies. As such, scheduling within a grid is often a multi-tiered process where the job-submitting host performs its job scheduling to the best of its knowledge of the current state of the grid environment, while the receiving host takes over the final phase of the scheduling process. Dividing the task of scheduling amongst several sites would add the advantage of easing the load and complexity of performing scheduling at a single location. However, in order to motivate individual local domains in competing to become more efficient, in addition to being more aggressive in competing for accepting more jobs, some form of incentive mechanism could be applied. In this work, we explore a decentralized combinatorial exchange scheme, as well as pull-based grid scheduling methodology which adopts the use of brokers with job advertisement and propagation within a grid environment. The main motivation for this scheme is to create an automated two tiered scheduling methodology to perform the tedious task of performing service discovery, and task scheduling at the global level, while performing resource monitoring, utilization and efficiency control at the local level. To achieve the best attainable optimization at any point in time, participating sites are to remain motivated to offer their best services based on the job submitting host's preferred optimization settings. Global scheduling of jobs will be done at the broker level via a bidding process. The submitting host will have the privilege to choose the best available offer to suit its requirements. A pricing scheme is implemented as a trading mechanism in exchange for the services provided. This pricing mechanism will hence serve as a motivation for participating sites to compete for jobs so as to increase its overall wealth. As such, competing sites will be required to constantly monitor and improve their own resources, its utilization and efficiency so as to remain competitive.

Description

Keywords

Citation