Allocation Algorithms Problems In Mesh


Processor allocation, mesh structure, algorithm, experimentation system.

Grzegorz CHMAJ* Dawid ZYDEK* Leszek KOSZAŁKA**

ALLOCATION ALGORITHMS PROBLEMS IN MESH-CONNECTED SYSTEMS

The processor allocation methods described in this paper concentrate on speed and allocation efficiency. The Frame Sliding algorithm is used as a baseline against which the speed of the Improved Stack Based Allocation Algorithm is compared. The improvements to the Stack Based Algorithm include the Rotation Optimisation and Task Separation techniques. The experimentation system and the efficiency coefficients are also described. The results of the investigations are presented and discussed.

1. INTRODUCTION

Many computer systems have processors connected to each other in a mesh. This kind of interconnection is simple to extend and modify, and its structure is uncomplicated, so it can be implemented easily. To improve the execution of jobs (tasks) in such a system, an efficient allocation algorithm must be designed. Many allocation algorithms are available [4] and new ones are being researched; however, an ideal allocation algorithm is still sought [3]. The units in a mesh co-operate, so an algorithm is needed to manage their work. The system is equipped with a task queue, in which all tasks waiting for allocation are placed. The allocation algorithm takes tasks from the queue and searches for a free submesh in which to allocate them. We assume that tasks are taken from a given queue in order of arrival. In this paper, two allocation algorithms are compared: the Improved Stack Based Algorithm (ISBA), proposed by the authors, and the Frame Sliding Algorithm (FS). Moreover, the original Stack Based Algorithm (SBA) is compared with ISBA.

* Koło Naukowe SISK, ** Katedra Systemów i Sieci Komputerowych, Wydział Elektroniki, Politechnika Wrocławska

The remainder of this paper is organized as follows: Section 2 describes the allocation algorithms. Section 3 presents the experimentation system. Section 4 contains the research results. Final remarks appear in Section 5.

2. NOMENCLATURE AND ALGORITHMS

Definition 1. Mesh M(w,h) is a rectangular mesh with w columns and h rows, containing w × h nodes.
Definition 2. Submesh S(z,v) is busy when all nodes in that submesh are allocated to a job (or to several jobs).
Definition 3. A base of a free submesh (BFS) is a node N(x,y) that can be used as a base to allocate an incoming job.
Definition 4. The reject area (RA) with respect to job J is denoted by R_J. Using any node from the reject area as a base would make the allocated job J cross the boundary of the mesh.
Definition 5. The coverage of a busy submesh (BS) B with respect to job J is denoted by E_{B,J}. Using any node from the coverage E_{B,J} as a base would make job J overlap B. The set of the coverages of all busy submeshes with respect to J is the coverage set C_J.
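As an illustration of Definitions 4 and 5, the following Python sketch enumerates the reject area and the coverage of a single busy submesh directly from the boundary and overlap conditions. It assumes that a job J(p,q) placed at base N(x,y) occupies columns x to x+p-1 and rows y to y+q-1, with 1-based coordinates. The function names and the sample busy submesh are hypothetical and are not taken from Fig. 1.

```python
from typing import Set, Tuple

Node = Tuple[int, int]            # (x, y), 1-based
Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2), inclusive corners


def reject_area(w: int, h: int, p: int, q: int) -> Set[Node]:
    """Bases from which a p x q job would cross the mesh boundary."""
    return {(x, y)
            for x in range(1, w + 1)
            for y in range(1, h + 1)
            if x + p - 1 > w or y + q - 1 > h}


def coverage(w: int, h: int, busy: Rect, p: int, q: int) -> Set[Node]:
    """Bases from which a p x q job would overlap the busy submesh."""
    bx1, by1, bx2, by2 = busy
    return {(x, y)
            for x in range(1, w + 1)
            for y in range(1, h + 1)
            if x <= bx2 and x + p - 1 >= bx1
            and y <= by2 and y + q - 1 >= by1}


def coverage_rect(busy: Rect, p: int, q: int) -> Rect:
    """Closed form of the coverage: the busy submesh grown by p-1
    columns towards x = 1 and by q-1 rows towards y = 1."""
    bx1, by1, bx2, by2 = busy
    return (max(1, bx1 - p + 1), max(1, by1 - q + 1), bx2, by2)


if __name__ == "__main__":
    busy = (4, 4, 6, 5)  # hypothetical busy submesh in M(9,9)
    print(len(reject_area(9, 9, 3, 2)))
    print(coverage_rect(busy, 3, 2))
```

For J(3,2) in M(9,9), the enumerated coverage coincides with the rectangle returned by `coverage_rect`, which is why the algorithms below can treat coverages as submeshes rather than as sets of individual nodes.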

[Figure: mesh with axes w and h, each numbered 1 to 9, showing job J(3,2). Legend: busy node (includes coverage area), reject area, coverage area, free node.]

Fig. 1. Mesh M(9,9) with two allocated jobs

Definition 6. The base block with respect to job J is denoted by B_J; it is a submesh in which every node can be used as the base of a free submesh for the allocation of J.
Definition 7. The busy list contains all busy submeshes in the basic mesh M.
Definition 8. The sink of the reject area for a job J(p,q) is the node at position (w-p+1, h-q+1); it determines the maximum position of a node that can be used as the base of a free submesh.

We make the following assumptions:
(i) There are no restrictions on the mesh size or on the size of the task queue.
(ii) The task queue acts as the input data set and is filled once, before the allocation process starts.
(iii) The entire mesh can be released after the allocation process has been performed for all tasks from the queue.
(iv) Jobs must be allocated in such a way that they do not cross the boundaries of the mesh and do not intersect each other.
(v) If a task cannot be allocated, the algorithm returns failure and drops this job. The algorithm then continues with the remaining tasks in the queue.
(vi) Job shape rotation is possible: e.g. J(4,2) may be allocated as J(2,4).
(vii) Once allocated, a task constitutes a busy submesh (BS).
(viii) The algorithm should possess complete recognition ability, i.e. it should always find a free submesh if one exists in the mesh.

2.1. FRAME SLIDING ALGORITHM (FS)

This algorithm utilises a so-called Busy List, which records whether each node in M is currently free or busy. FS performs two steps. In the first step, it searches for a node marked as free. The search starts at the beginning of the mesh, i.e. at node N(1,1), and checks every successive node, i.e. N(2,1),…,N(w,1), N(1,2),… When a free node N(c,r) is found, the algorithm proceeds to the second step, in which FS checks whether the requested submesh can be allocated using the found free node as the BFS. If N(c,r) belongs to RA, it cannot, and FS searches for the next free node, starting from N(c,r). If N(c,r) lies outside RA, FS checks whether N(c,r) belongs to C_J. If it does, the algorithm searches for another free node, starting from this node. If N(c,r) does not belong to C_J, it can be used as the base for job J(p,q), and the algorithm marks the nodes assigned to J(p,q) as busy on the Busy List. To obtain complete recognition ability, FS is additionally equipped with a job rotation mechanism.

2.2. IMPROVED STACK BASED ALGORITHM (ISBA)

ISBA manipulates job orientation to obtain complete submesh recognition ability. However, when job J(p,q) is square (p=q), there is no need to change its orientation. Using a stack as storage for candidate blocks, the algorithm returns the first base block found as the result; the rest of the stack is dropped. The proposed ISBA algorithm is organized as follows:

1. Find the sink (reject area R_J) with respect to job J.
2. Compute the coverage set C_J with respect to job J.
3. Perform spatial subtraction of R_J and C_J to specify the initial candidate block, called init.
4. init.cov ← next(0)
5. Push init onto the stack.
6. While the stack is not empty do:
     if stacktop.cov = null, return stacktop
     else
       k ← stacktop.cov
       if stacktop intersects with C_J[k] then
         pop stacktop from the top of the stack
         subtract C_J[k] from stacktop
         for each candidate block b resulting from the spatial subtraction:
           b.cov ← next(k)
           push b onto the stack
       else
         stacktop.cov ← next(stacktop.cov)
7. If p=q then return fail.
8. If the orientation of job J was already changed, return fail; otherwise (job J is not square) change the orientation of job J and go to step 1.
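The stack-based search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: submeshes are represented as rectangles (x1, y1, x2, y2) with 1-based inclusive corners, the index stored with each candidate block plays the role of its cov field, the left-upper node of the base block is taken as the base, and Task Separation is not modelled. All function names are hypothetical.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2), 1-based, inclusive


def intersect(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def subtract(a: Rect, b: Rect) -> List[Rect]:
    """Spatial subtraction a minus b as at most four disjoint rectangles."""
    if not intersect(a, b):
        return [a]
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    parts = []
    if bx1 > ax1:                       # strip to the left of b
        parts.append((ax1, ay1, bx1 - 1, ay2))
    if bx2 < ax2:                       # strip to the right of b
        parts.append((bx2 + 1, ay1, ax2, ay2))
    lo, hi = max(ax1, bx1), min(ax2, bx2)
    if by1 > ay1:                       # strip above b (smaller y)
        parts.append((lo, ay1, hi, by1 - 1))
    if by2 < ay2:                       # strip below b (larger y)
        parts.append((lo, by2 + 1, hi, ay2))
    return parts


def coverage(busy: Rect, p: int, q: int) -> Rect:
    """Bases from which a p x q job would overlap the busy submesh."""
    x1, y1, x2, y2 = busy
    return (max(1, x1 - p + 1), max(1, y1 - q + 1), x2, y2)


def find_base_block(w: int, h: int, busy_list: List[Rect],
                    p: int, q: int) -> Optional[Rect]:
    """Return the first base block found for a p x q job, or None."""
    if p > w or q > h:
        return None
    init = (1, 1, w - p + 1, h - q + 1)   # bounded by the sink
    cov = [coverage(b, p, q) for b in busy_list]
    stack = [(init, 0)]                   # (candidate block, next coverage)
    while stack:
        block, k = stack.pop()
        while k < len(cov) and not intersect(block, cov[k]):
            k += 1                        # advance to the next coverage
        if k == len(cov):
            return block                  # clear of every coverage
        for part in subtract(block, cov[k]):
            stack.append((part, k + 1))
    return None


def allocate(w: int, h: int, busy_list: List[Rect],
             p: int, q: int) -> Optional[Rect]:
    """Try the given orientation, then the rotated one if not square."""
    orientations = [(p, q)] if p == q else [(p, q), (q, p)]
    for pp, qq in orientations:
        block = find_base_block(w, h, busy_list, pp, qq)
        if block is not None:
            x, y = block[0], block[1]     # left-upper node as base
            job = (x, y, x + pp - 1, y + qq - 1)
            busy_list.append(job)
            return job
    return None
```

A call such as `allocate(9, 9, busy, 3, 2)` with an initially empty `busy` list places the job at the mesh origin and appends it to the busy list, so that subsequent calls see it as a busy submesh.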

3. EXPERIMENTATION SYSTEM

A functional block diagram of the experimentation system designed for the evaluation of the allocation algorithms is shown in Fig. 2.

Fig. 2. System diagram (blocks: Allocation, Analysis of results, Efficiency analysis; signals: A, c, T, s, t, N, f, E1 to E4, E)

The following variables are distinguished:
Inputs: c – category of tasks in the queue (sizes of tasks), T – the total number of tasks.
System parameter: s – mesh size; s = h×w.
Controlled input: A – the allocation algorithm used.
Allocation process outputs: t – the total time for all successfully allocated tasks, N – the total number of non-allocated tasks, f – fragmentation.
Intermediate efficiency coefficients: E1 – efficiency of mesh fill, E2 – efficiency of allocation, E3 – efficiency of allocation time, E4 – the ratio of mesh size to the total allocation time.
Index of performance: E – the total efficiency.
The introduced measures of efficiency are expressed by (1) and (2):

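E = E1 · E2 · E3 · E4    (1)

where

E1 = ((T − N) / s) · 100%,    E2 = (T − N) / T,    (2)

and E3 and E4 are the time-dependent coefficients defined above: E3 decreases as the total allocation time t grows, and E4 relates the mesh size s to t.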
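The first two coefficients and the product (1) can be computed as in the short Python sketch below. It reads E1 as (T − N)/s · 100% and E2 as (T − N)/T. The time-dependent coefficients E3 and E4 are passed in by the caller, because no particular closed form is assumed for them here. The function names are hypothetical.

```python
def mesh_fill_efficiency(T: int, N: int, s: int) -> float:
    """E1: allocated tasks relative to mesh size, in percent."""
    if s <= 0 or not 0 <= N <= T:
        raise ValueError("need s > 0 and 0 <= N <= T")
    return (T - N) / s * 100.0


def allocation_efficiency(T: int, N: int) -> float:
    """E2: fraction of the queue that was successfully allocated."""
    if T <= 0 or not 0 <= N <= T:
        raise ValueError("need T > 0 and 0 <= N <= T")
    return (T - N) / T


def total_efficiency(e1: float, e2: float, e3: float, e4: float) -> float:
    """E: product of the four intermediate coefficients, as in (1)."""
    return e1 * e2 * e3 * e4


if __name__ == "__main__":
    s = 230 * 230          # mesh size used in the experiments
    T, N = 35, 3           # T from Section 4.2; N is a hypothetical value
    print(mesh_fill_efficiency(T, N, s), allocation_efficiency(T, N))
```

Because E is a product, a single poor coefficient (for example, many rejected tasks lowering E2) pulls the total efficiency down regardless of the others.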

The computer experimentation system consists of program modules (Fig. 3). The programs are written in C++ and run in the MS DOS® environment. As a result, the allocation time (a very important variable in the efficiency analysis) can be measured more precisely than in other environments.

Fig. 3. Experimentation system structure (modules: Prepare tasks for allocation; Move queue to experimentation system; Begin allocation process; Write results to file; Visualize allocation process; Analyze and process results; Calculate total efficiency).

4. INVESTIGATIONS

Efficiency and performance studies were carried out as a comparison between the ISBA and FS algorithms. In this section we present only selected examples of the research. All presented experiments were carried out for a mesh of size 230x230.

4.1. SELECTING BASE FROM BASE BLOCK

SBA searches for a base block suitable for allocating job J. Every node of the base block can be used as the base of a submesh for job J. The node selection strategy determines which node of the base block is returned as the result, i.e. the base for the submesh. The possible node selection methods are: left-upper node (a), left-down node (b), right-upper node (c), right-down node (d), randomly selected node (e). Method (e), random selection of the base node, is not efficient: it causes higher fragmentation and inefficient usage of the mesh. The results of allocation using method (e) and method (a) are shown in Fig. 4. As can be seen, method (a) causes less fragmentation.
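The five selection methods can be sketched in Python as follows. The sketch assumes that a base block is given as a rectangle (x1, y1, x2, y2) with 1-based inclusive corners and that the row index grows downwards, so that "upper" means the smallest y. The function name and the method letters used as arguments are illustrative only.

```python
import random
from typing import Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2), inclusive corners


def select_base(block: Rect, method: str = "a",
                rng: Optional[random.Random] = None) -> Tuple[int, int]:
    """Pick the base node of a submesh from a base block."""
    x1, y1, x2, y2 = block
    if method == "a":                 # left-upper node
        return (x1, y1)
    if method == "b":                 # left-down node
        return (x1, y2)
    if method == "c":                 # right-upper node
        return (x2, y1)
    if method == "d":                 # right-down node
        return (x2, y2)
    if method == "e":                 # randomly selected node
        rng = rng or random.Random()
        return (rng.randint(x1, x2), rng.randint(y1, y2))
    raise ValueError(f"unknown selection method: {method!r}")
```

The corner methods keep allocated jobs packed against the existing busy submeshes or the mesh boundary, whereas method (e) can place a job in the middle of a free region, which is consistent with the higher fragmentation reported for it.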

Fig. 4. Fragmentation for method (a), on the left, and method (e), on the right.

In this paper, the implementation of ISBA uses the left-upper selection method, or the right-down and left-upper methods when Task Separation is used.

4.2. INFLUENCE OF TASK QUEUE SORTING

To perform this test, we used random sets of jobs. In the first experiment, jobs were sorted in ascending order of size. The results are shown in Fig. 5. It may be observed that the ISBA algorithm allocates the first nine jobs in varied times.

Fig. 5. Allocation times of tasks sorted ascending by size (axes: allocation time, logarithmic scale, versus quantity of allocated jobs; series: Algorithm FS, Algorithm SBA; mesh utilization 87%, EISBA = 20.86812, EFS = 7.279365)

This caused minimal overhead in the first phase of allocation. When subsequent jobs are allocated, the allocation time increases slightly. This growth is small, so we can confirm that increasing job sizes do not affect the efficiency of ISBA. The FS algorithm is faster up to a certain job (the threshold job), but for every subsequent job its allocation time grows exponentially. As a result, FS needed much more time to allocate all jobs from the input queue. It is worth noting that the time needed to allocate the whole queue (T=35) is comparable with the allocation time for the same set of jobs sorted in descending order of size (Fig. 6).

Fig. 6. Allocation times of tasks sorted descending by size (axes: allocation time, logarithmic scale, versus quantity of allocated jobs; series: Algorithm FS, Algorithm SBA; mesh utilization 87%, EISBA = 21.49416, EFS = 7.577819)

We can see that in both cases (ascending and descending) the E-coefficient (1) for FS is almost three times smaller than for ISBA, and the total allocation time of the considered task queue was much longer for FS than for ISBA (Fig. 7).

Fig. 7. The time t for different types of algorithms (vertical axis: total time of allocation, logarithmic scale; series: FS and SBA, each for growing and for decreasing size of tasks).

4.3. INFLUENCE OF JOBS SIZE ON ALLOCATION EFFICIENCY

Here the jobs were not sorted. Two categories of tasks were taken into consideration: category C1 contained small jobs (p, q < 10), and category C2 contained big jobs (p, q >> 10). The resulting values of E (table below) show that for C1 the efficiency is low for both algorithms. For ISBA this is caused by the necessity of computing a large number of coverages (one for every job allocated in the mesh). For FS it is caused by the necessity of reviewing a large area of the mesh for every job. For C2 the difference in E-efficiency is remarkable: ISBA is quick, whereas FS is still reviewing large areas of the mesh.

Task category   EISBA      EFS
C1              8.022789   3.255952
C2              23.75041   4.641667

4.4. FILL OF MESH AND BEHAVIOUR OF ALGORITHMS

Fig. 8 presents the allocation times of jobs in relation to mesh fill and job size. Experiments were performed for square jobs (q = p) with p = 25, p = 30 and p = 35.

Fig. 8. Allocation time used to fill mesh with jobs of size p=30 and p=35 (panels: Size 30, Size 35; vertical axis: total time of allocation, logarithmic scale; series: FS and SBA at mesh utilization of 21%, 51% and 76%).

The obtained values of E for these cases are presented in the table below.

p    Mesh utilization   EISBA      EFS
25   21%                8.20032    5.156472
25   51%                12.33646   7.671771
25   76%                15.15031   9.256746
30   21%                9.21213    5.392068
30   51%                14.90603   8.305605
30   76%                18.32131   9.996480
35   21%                10.34166   5.670867
35   51%                17.50706   8.914066
35   76%                21.53782   10.830980
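The relative advantage of ISBA over FS can be read from the values above by taking the ratio EISBA / EFS for each case, as in the following Python sketch. It assumes that in each pair of values the first is EISBA and the second is EFS, and that the three pairs for each p correspond to mesh utilization of 21%, 51% and 76% in that order. The dictionary layout and the function name are illustrative only.

```python
from typing import Dict, Tuple

# (p, mesh utilization in %) -> (E_ISBA, E_FS), values as reported above
E_VALUES: Dict[Tuple[int, int], Tuple[float, float]] = {
    (25, 21): (8.20032, 5.156472),
    (25, 51): (12.33646, 7.671771),
    (25, 76): (15.15031, 9.256746),
    (30, 21): (9.21213, 5.392068),
    (30, 51): (14.90603, 8.305605),
    (30, 76): (18.32131, 9.996480),
    (35, 21): (10.34166, 5.670867),
    (35, 51): (17.50706, 8.914066),
    (35, 76): (21.53782, 10.830980),
}


def advantage(p: int, utilization: int) -> float:
    """Ratio E_ISBA / E_FS for a given job size and mesh utilization."""
    e_isba, e_fs = E_VALUES[(p, utilization)]
    return e_isba / e_fs


if __name__ == "__main__":
    for (p, u) in sorted(E_VALUES):
        print(p, u, round(advantage(p, u), 2))
```

Under this reading, the ratio stays between 1.5 and 2 in all nine cases and, at every utilization level, grows with the job size p.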

For jobs of smaller sizes, the advantage of ISBA over FS is not remarkable. The greatest gain in speed occurs when ISBA is used to allocate large jobs. Moreover, it is worth noting that the total allocation time grows exponentially as the fill of the mesh increases (for both the ISBA and FS algorithms).

5. FINAL REMARKS

The effectiveness of the presented algorithms differs between situations. ISBA is more efficient in most cases, but its disadvantage is fragmentation: sometimes ISBA allocates fewer jobs than FS. The solution to this problem is to use algorithms that reduce fragmentation; the Task Separation (TS) algorithm was proposed for this purpose. FS can be used in small systems where speed is not required. Future work in this area will concentrate on (i) performing more experiments to analyse different aspects of efficiency, (ii) adapting the ISBA algorithm to 3D meshes, (iii) preparing new program modules of the experimentation system, e.g. a database module for storing the results of experiments.

REFERENCES

[1] B. S. YOO, C. R. DAS, A Fast and Efficient Processor Allocation Scheme for Mesh-Connected Multicomputers, IEEE Trans. on Computers, vol. 51, no. 1, January 2002.
[2] Blue Gene Project, http://research.ibm.com/bluegene/index.html, 2001.
[3] C. CHANG, P. MOHAPATRA, An Integrated Processor Management Scheme for the Mesh-Connected Multicomputer Systems, Proc. Int. Conf. on Parallel Processing, August 1997.
[4] P. BABBAR, P. KRUEGER, A Performance Comparison of Processor Allocation and Job Scheduling Algorithms for Mesh-Connected Multiprocessors, Proc. of Sixth IEEE Symposium on Parallel and Distributed Processing, October 1994.
[5] T. LIU, W.K. HUANG, F. LOMBARDI, L.N. BHUYAN, A Submesh Allocation Scheme for Mesh-Connected Multiprocessor Systems, Proc. 1995 Int'l Conf. on Parallel Processing, vol. II, August 1995.