Data Scheduling for Multi-item and Transactional Requests in On-demand Broadcast

Vijay Kumar
SCE, Computer Networking
University of Missouri-Kansas City
Kansas City, MO 64110
[email protected]

Nitin Prabhu
SCE, Computer Networking
University of Missouri-Kansas City
Kansas City, MO 64110
npp21c@umkc.edu

ABSTRACT
Recent advances in mobile computing and wireless communication have enabled the deployment of broadcast-based information systems such as the wireless Internet, traffic information services, etc. Users and the research community have recognized their potential for meeting the growing information demands of the future. At present, existing systems are mainly pull-based (on-demand) and their performance depends heavily on the broadcast schedule they use. Previous studies in on-demand scheduling have focused mainly on single-item requests to keep the investigation simple. However, scheduling algorithms for single-item requests are unable to manage multi-item requests efficiently, and such requests are becoming more common. In addition, more and more requests are becoming transactional in nature. In this paper we take these requirements into consideration, study the scheduling problems arising in an on-demand broadcast environment, and propose an efficient algorithm. We report its performance and demonstrate that our algorithm successfully manages multi-item simple and transactional requests, significantly reduces the wait time and tuning time, and avoids transaction aborts.

Categories and Subject Descriptors
H.3.5 [Online Information Services]; H.3.4 [Systems and Software]

General Terms Algorithms, Design, Performance


Keywords Mobile, On-demand, Scheduling algorithm, Transaction.

1. INTRODUCTION

Data broadcasting through wireless channels is becoming a common way to reach information seekers and satisfy their demands. Unlike the conventional unicast approach, it offers high scalability and can satisfy multiple requests for the same data item at once, which leads to efficient bandwidth utilization. One of the main components of a broadcasting system is the broadcast scheduling algorithm, which significantly affects data latency and access time and is the topic of this paper. In order to develop the motivation, we first categorize existing broadcast approaches and identify their limitations.

Broadcast systems can be categorized into (a) push-based, (b) pull-based, and (c) hybrid. In push-based systems servers periodically broadcast a schedule which is computed offline using clients' access history, which can be found in their profiles. This approach is also referred to as static broadcast because it does not take into account the current data access pattern of users when composing the next broadcast. Its performance is therefore significantly affected when the user access pattern deviates from the one that was used to construct the broadcast schedule. In pull-based systems, which are commonly referred to as on-demand broadcast systems, clients explicitly request data items from the server. The server compiles the requests in a service queue and broadcasts the data based on the number of pending requests for each data item. Pull-based systems perform better than push-based systems mainly because they make decisions based on the current user access pattern. Hybrid systems apply a mixture of the push and pull approaches, where rarely accessed data items are classified as pull items and commonly accessed data items are periodically pushed.

[Figure: a server broadcasts data items over the broadcast (downlink) channel to mobile clients (laptops); clients send their requests to the server over the uplink channel.]

Figure 1. On-demand Broadcast System

In this paper we focus on an on-demand multi-item broadcast system because (a) it makes efficient use of the available bandwidth, (b) it allows timely access to data, and (c) the majority of users frequently seek specific information. Earlier works on scheduling algorithms have considered the case of single-item requests only, but in reality clients invariably attempt to download multiple data items to process user requests. For example, database clients often access multiple data items to complete a read transaction; similarly, web clients access an HTML document along with all its embedded objects. Motivated by such trends, we propose an on-demand scheduling algorithm for multi-item and transactional requests. Note that transaction-oriented requests have to handle issues such as consistency, application recovery, etc., which were not considered in previous on-demand systems. Our contributions in this paper for transactional requests are (a) the development of an efficient on-demand scheduling algorithm which guarantees a consistent view to transactions, and (b) the identification of new metrics for performance comparison. We study its behavior with a detailed simulation study and show that it offers superior performance compared to existing algorithms.

The remainder of this paper is structured as follows: Section 2 reviews related work; Section 3 discusses new issues in transactional requests and motivates the need for a new scheduling algorithm. In Section 4 we define new performance metrics for transactional requests and present our algorithm. In Section 5 we discuss our simulation results.

2. REVIEW OF PREVIOUS WORK
In this section we review a number of push-based [1-5, 7, 11-13, 15-17] and pull-based [1, 7, 15, 16] scheduling algorithms. In [2, 12, 17] push-based scheduling algorithms for broadcast are proposed. In these approaches the server delivers data using a periodic broadcast program, which is based on an estimate of the access probability of each data item. Their usefulness is limited to static environments where access probabilities do not change often. In [3] a broadcast scheduling algorithm for a hybrid (push-pull) environment is proposed, where the server periodically broadcasts using a broadcast program and a part of the channel bandwidth is reserved for data items which are to be pulled by clients. A client issues a request to the server only after it has waited for a predefined duration for the data to become available in the periodic push-based broadcast.

Pull-based scheduling algorithms FCFS (First Come First Serve), MRF (Most Request First), MRFL (Most Request First Lowest) and LWF (Longest Wait First) were studied in [15, 16]. In FCFS, pages are broadcasted in the order in which they are requested. In MRF, the page with the maximum number of pending requests is broadcasted. The MRFL scheme is similar to MRF, but breaks ties in favor of the page with the lowest request access probability. In the LWF scheme the waiting time of a page, which is the sum of the waiting times of all pending requests for the page, is calculated for every page, and the page with the longest waiting time is broadcasted. It is shown there that FCFS performs poorly compared to the other algorithms in a broadcast environment when the access pattern is non-uniform. In [7] the authors studied on-demand systems where requests are associated with deadlines and reported that on-demand broadcast with the EDF (Earliest Deadline First) policy performs better. In [1] the authors proposed a scheduling algorithm, referred to as R×W (R stands for the number of pending requests for a data item and W for the waiting time of the first of those requests), that combines the FCFS and MRF heuristics. The algorithm computes the product of R and W and selects the data item with the maximum R×W value for the next broadcast. Our work has some relationship with the R×W approach; however, we consider multi-item and transactional requests, which introduces a number of new issues. In [5] the author proposed a scheduling scheme for a push-based system with multi-item requests. It uses the access pattern dependencies of the requests (for example, data items di and dj are always accessed together) and their access probabilities, and presents a heuristic to compute a static offline broadcast schedule. In [19] the authors studied the problem of consistent data retrieval at the mobile client; however, the paper does not address the problem of scheduling data for multi-item requests, and the consistency issue is studied for a push-based system. To the best of our knowledge, no existing work has reported an investigation of on-demand scheduling for multi-item and transactional requests.

3. BACKGROUND AND MOTIVATION
In this section we explain the system model and show with an example that on-demand scheduling schemes for single-item requests are not suitable for transactional requests. Figure 1 shows the architecture of a typical on-demand broadcast system [14]. There is a single server, which supports a large client population; a client sends queries or update transactions to the server through the uplink channel. The server broadcasts the relevant information in response to queries over the satellite downlink, from where the user retrieves the result. Update transactions are executed at the server and the new values are broadcasted if there are requests for them. Similar to previous works on broadcast scheduling, we assume that all data items are locally available at the server and that they are of equal size and hence have equal service time. The broadcast duration of one data item is referred to as a broadcast tick.

One of the limitations of current broadcast scheduling schemes is that they make broadcast decisions at the data item level, that is, the server composes a broadcast by considering only the requested set of data items; it does not consider the set of transactions that requested them. We illustrate this with the data set given in Table 1, which lists a number of transactions (ti) and the data items (di) they want from the server. For example, transaction t1 wants data items d1, d2 and d7. The last row of the table, titled "Total", records the total number of pending requests for each data item; for example, data item d1 is requested by all five transactions, so its total is 5. The server broadcasts the data items requested by the clients using a scheduling algorithm.

Table 1. Transaction data request table


       d1  d2  d3  d4  d5  d6  d7
t1      1   1                   1
t2      1   1   1   1
t3      1       1       1
t4      1   1   1
t5      1   1       1       1
Total   5   4   3   2   1   1   1

Scheduling decisions made at the data item level cause consistency problems and also increase the access time of transactional requests. We show this using the following example, in which we use the MRF (Most Request First) [15] scheme because, like FCFS, R×W [1], etc., it takes broadcast decisions at the data item level. In MRF the data item with the maximum number of pending requests is scheduled next; when there is a tie, the data item with the earliest first request time is selected. Figure 2 illustrates the resulting broadcast sequence: data item d1 is scheduled in the first broadcast tick, d2 in the next broadcast tick, and so on.

d1 | d2 | d3 | d4 | d5 | d6 | d7

Figure 2. MRF Schedule
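For concreteness, the following minimal Python sketch (not part of the paper; the item names and the first-request ticks used to break ties among d5, d6 and d7 are assumptions) reproduces the MRF ordering of Figure 2 from the pending-request counts in the Total row of Table 1.

```python
from dataclasses import dataclass

@dataclass
class ItemQueue:
    pending: int                  # number of pending requests for the item
    first_request_tick: float     # arrival time of the oldest pending request

def mrf_schedule(queues: dict) -> list:
    """MRF rule: most pending requests first, ties broken in favour of the
    earliest first-request time."""
    return sorted(
        queues,
        key=lambda d: (-queues[d].pending, queues[d].first_request_tick),
    )

# Pending-request counts taken from the 'Total' row of Table 1
# (the first-request ticks are made up for this illustration).
queues = {
    "d1": ItemQueue(5, 0), "d2": ItemQueue(4, 0), "d3": ItemQueue(3, 1),
    "d4": ItemQueue(2, 1), "d5": ItemQueue(1, 2), "d6": ItemQueue(1, 3),
    "d7": ItemQueue(1, 4),
}
print(mrf_schedule(queues))   # ['d1', 'd2', ..., 'd7'], matching Figure 2
```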

Consider transaction t1, which needs d1, d2 and d7. Transaction t1 gets d1 and d2 in the first two broadcast ticks and then has to wait for d7, which arrives in the 7th broadcast tick. This happens because the scheduling algorithm makes its decision at the data item level, and it has two important consequences. First, for a single data item transaction t1 had to wait until the 7th broadcast tick. Second, if there are updates to d1 or d2 (or both) between the 2nd and the 7th broadcast tick, transaction t1 gets two different views of the database. Data item d7, which the client downloads, may be related to the new updated values of d1 or d2, but the client holds old (stale) values of d1 and d2; consequently the transaction has to be aborted. The consistency problem is relevant not only to database clients but also to web clients. Web pages usually contain related data items, and if a web page is updated after a client has downloaded only a few of its objects, the web client gets an inconsistent view of the web page. We overcome this problem in our scheduling scheme.

In existing scheduling algorithms for on-demand broadcast, clients have to continuously monitor the broadcast to download their requested data items. There is no provision for broadcasting a data index, for the following reasons. First, the scheduling decision is made at every broadcast tick; consequently the server cannot predict the data items to be broadcasted in the immediately following broadcast ticks. Second, there is no notion of periodicity or of a broadcast cycle. Mobile clients, because of power constraints, may not be able to afford to continually monitor the broadcast. With the use of an index, a mobile client can check whether its required data is present in the current broadcast cycle; if it is not, the client can go to sleep and tune in for the index of the next broadcast cycle. We propose the use of indexing in our on-demand scheduling algorithm.

4. OUR SCHEDULING ALGORITHM
We first define a set of new measures and use them for the development of our algorithm and for performance comparison, because the traditional performance metrics for single-item requests cannot be used for evaluating multi-item requests. Our new performance metrics are related to two main issues: transaction consistency and response time.

Consistency: The server broadcasts items from the database. A database state is typically defined as a mapping of every data item to a value from its domain. In a database, data items are related through a number of integrity constraints, and the database must satisfy them to be consistent. A client transaction should read all its required data items from a consistent database state. Transaction aborts are comparatively more costly in a mobile environment due to resource constraints; hence the number of aborts becomes an important performance metric for transactional requests. There should be a mechanism to apply updates at periodic intervals so that client transactions get a consistent view of the database.

[Figure: timeline of a transaction ti with three events: a = the request of ti arrives at the server, b = the first data item of ti is broadcasted, c = the last data item of ti is broadcasted; the interval a-b is the transaction seek time, b-c is the transaction span, and a-c is the transaction wait time.]
Figure 3. Response time of transaction

Response time: Response time is one of the most common measures of any scheduling algorithm; for transactional requests it comprises (a) transaction wait time, (b) transaction seek time and (c) transaction span. Figure 3 illustrates these parameters and their relationship. We define them as follows.

Transaction seek time: the time elapsed from the moment a request is sent to the server until the first data item of the transaction is broadcasted. Thus transaction seek time = b − a (Figure 3).

Transaction span: the time difference between the broadcast of the first and the last data items of the transaction. Thus transaction span = c − b. The consistency issue is related to the transaction span: as the transaction span increases, the chance of the transaction having an inconsistent view of the database increases. Hence the scheduling algorithm should aim to reduce the transaction span.

Transaction wait time: the difference between the time when the transaction sends its data request and the time when the last data item of the request is broadcasted. It is expressed in terms of the transaction seek time and transaction span as follows: transaction wait time = transaction seek time + transaction span.

Tuning time: the total time spent by the client listening to the broadcast channel to access its data items. The tuning time determines the amount of time spent by the mobile unit in active mode and is therefore a measure of the power consumed by the client to retrieve the required data. In existing on-demand algorithms there is no provision for an index; consequently the tuning time is equal to the total wait time of the transaction. Our scheduling algorithm broadcasts an index and thereby reduces the client tuning time.
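As a worked example of these metrics, the short sketch below (illustrative only; function and variable names are ours) computes them for transaction t1 under the MRF schedule of Figure 2, where the request is sent at tick 0, d1 arrives at tick 1 and d7 at tick 7, and the client listens continuously because no index is available.

```python
def transaction_metrics(request_sent, first_item_tick, last_item_tick, listen_intervals):
    """Per-transaction metrics, in broadcast ticks.
    a = request_sent, b = first_item_tick, c = last_item_tick (as in Figure 3)."""
    seek = first_item_tick - request_sent          # transaction seek time = b - a
    span = last_item_tick - first_item_tick        # transaction span      = c - b
    wait = seek + span                             # transaction wait time = c - a
    tuning = sum(end - start for start, end in listen_intervals)  # time spent listening
    return {"seek": seek, "span": span, "wait": wait, "tuning": tuning}

# Transaction t1 under the MRF schedule of Figure 2: without an index the
# client listens from tick 0 until d7 arrives at tick 7.
print(transaction_metrics(0, 1, 7, [(0, 7)]))
# {'seek': 1, 'span': 6, 'wait': 7, 'tuning': 7}
```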


We define: n = the database size in terms of the total number of data items in the database D = {d1, d2, d3, …, dn}; Rdi = the number of pending requests for data item di; TDi = the set of data items accessed by transaction ti, where TDi ⊆ D; numi = the number of data items accessed by transaction ti, numi = |TDi|; Tavg = the average transaction size in terms of data items. We propose a new heuristic for defining the priority of a transactional request, called the temperature of a transaction.

Temperature of a transaction: The temperature of a transaction ti, denoted Tempi, measures the number of hot data items (frequently accessed data items) that the transaction accesses. It is defined as the average number of pending requests per data item of the transaction; a higher temperature indicates that the transaction accesses more hot data items:

Tempi = (∑ Rdi over all di ∈ TDi) / numi

Example: For transaction t1 with TD1 = {d1, d2, d7} (Table 1), Rd1 = 5, Rd2 = 4, Rd7 = 1 and num1 = 3, so Temp1 = (5 + 4 + 1)/3 ≈ 3.33.
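The temperature heuristic can be sketched in a few lines of Python (illustrative only; the function name and data layout are ours, and the pending-request counts come from the Total row of Table 1).

```python
def temperature(td, pending):
    """Temp_i = (sum of pending request counts R_d over d in TD_i) / |TD_i|."""
    return sum(pending[d] for d in td) / len(td)

# Pending request counts per data item, from the 'Total' row of Table 1.
pending = {"d1": 5, "d2": 4, "d3": 3, "d4": 2, "d5": 1, "d6": 1, "d7": 1}

td1 = {"d1", "d2", "d7"}                      # data set of transaction t1
print(round(temperature(td1, pending), 2))    # 3.33
```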

We use the measure (Tempi × Wi), where Wi is the wait time of transaction ti. In previous algorithms [1, 7, 15, 16] for on-demand systems the scheduling decisions were taken at every broadcast tick. In our algorithm we take scheduling decisions at periodic intervals which we refer to as broadcast cycles. We use the notion of a broadcast cycle to introduce periodicity into the broadcast. The broadcast cycle concept has been used in scheduling algorithms for push-based systems [2, 3, 5, 12] but has not been used in on-demand systems. In a push-based system the content and organization of a broadcast cycle (referred to as the schedule) is the same in every cycle, and the same schedule is repeatedly broadcasted. Unlike a push-based system, in our scheme the content and organization of each broadcast cycle may vary depending on the current workload at the server. We use the broadcast cycle for a dual purpose: first, to introduce periodicity into the broadcast so that indexing can be used; second, as the interval after which updates at the server are applied to the database.

Each transaction requires a varying number of data items; hence we cannot fix the exact number of data items in a broadcast cycle as in the case of single-item requests. A broadcast cycle is an interval of broadcast ticks whose length varies between K and (K + Tavg) broadcast ticks. At the beginning of the broadcast cycle the schedule for the current cycle is formed based on the current request volume at the server. The server calculates the temperature of all transactions requested by the client population, and the transactions are sorted by their (Tempi × Wi) values. The first N transactions are selected from the sorted list such that their total data item requirement does not exceed (K + Tavg). The arrangement of data items within the broadcast affects the transaction span and tuning time; the data items are arranged in a broadcast cycle such that transactions whose data sets TDi overlap are broadcast together, which helps reduce both. The data is broadcasted in the determined order along with an index; the index describes the content and order of the data in the broadcast cycle. Update transactions that are received during a broadcast cycle are queued. They are executed at the end of the broadcast cycle and serialized in the order of their arrival; as a result, the state of the database changes only at the end of a broadcast cycle. Data broadcasted within a broadcast cycle therefore always belongs to the same database state and hence is consistent, whereas a client that downloads data from two different broadcast cycles may get an inconsistent view of the database. The algorithm at the client ensures, using the index broadcasted by the server, that the client downloads the entire data set TDi of its transaction from a single broadcast cycle; this guarantees a consistent view for client transactions and avoids aborts. After sending its request to the server, the client monitors the broadcast for the next index and downloads it. If the client's required data is not in the current broadcast cycle, it can sleep until the end of the current cycle and tune in for the index of the next broadcast cycle. The details of the procedure are explained in the following sections.

4.1 Protocol at the Server
In this section we explain the steps of our scheme which the server executes. We define Request as the set of transactions requested by clients, Bset as the set of data items selected for broadcasting in the current cycle, and Tlist as the set of transactions used to fill the current Bset.

Steps

i. Calculating the temperature of transactions: At the beginning of the broadcast cycle the schedule for the cycle is formed based on the current request volume at the server, and the server calculates the temperature of all transactions in the Request set.

ii. Sorting the Request list: Transactions in the Request set are sorted in descending order of their (Tempi × Wi) values at the beginning of every broadcast cycle, where Wi is the wait time of transaction ti since its arrival at the server.

iii. Selection of transactions for the current broadcast cycle: Transactions are selected sequentially from the top of the sorted Request set until their total data requirement exceeds the length (K) of the broadcast cycle. The contents of Tlist and Bset are identified here: the selected transactions are added to Tlist and the data items belonging to them are added to Bset.


Procedure
a. Bset = ∅, Tlist = ∅.
b. While (|Bset| < K):
   i. Select the next transaction ti from the sorted Request set; Tlist = Tlist ∪ {ti}; Request = Request − {ti}.
   ii. Bset = Bset ∪ (TDi − (Bset ∩ TDi)).
At the end of this step |Bset| lies in the range K ≤ |Bset| ≤ (K + Tavg).

iv. Arrangement of data items within the broadcast cycle: The arrangement of data items within the broadcast cycle affects the transaction span and transaction waiting time. Broadcast denotes the ordered set of data items to be broadcasted in the current cycle. Initially the transaction with the highest value of (Tempi × Wi) is selected and its data items are added to the Broadcast set. The next transaction ti ∈ Tlist selected is the one that has the maximum overlap with the Broadcast set compared to the other transactions tj ∈ Tlist. We use the measure (overlapi/remi), where overlapi is the number of data items of ti already in the Broadcast set and remi is the number of data items of ti not yet in the Broadcast set. The transaction ti with the maximum value of (overlapi/remi) is selected; if there is a tie, the transaction with the higher value of (Tempi × Wi) is selected. The data items are broadcasted in the same order as they appear in the Broadcast set. The algorithm is presented below.

Procedure
a) Broadcast = ∅.
b) Select the transaction ti ∈ Tlist with the highest (Tempi × Wi) value.
c) Add the data items of the selected transaction that are not yet in the Broadcast set and remove the transaction from Tlist:
   i. Broadcast = Broadcast ∪ (TDi − (Broadcast ∩ TDi)).
   ii. Tlist = Tlist − {ti}.
   iii. If (Tlist = ∅) then exit.
d) Calculate for every transaction ti ∈ Tlist:
   i. the overlap of ti with the Broadcast set: overlapi = |Broadcast ∩ TDi|;
   ii. the number of data items of ti remaining to be broadcasted: remi = |TDi| − overlapi.
e) Select the transaction ti with the highest value of (overlapi/remi). If there is a tie in (overlapi/remi), select among the tied transactions the one with the highest (Tempi × Wi) value.
f) If |Broadcast| ≤ K then go to step c.
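The server-side selection and arrangement steps can be summarized in the following Python sketch. It is an illustration under our own assumptions about data structures (transactions as sets of item names, waits and pending counts as dictionaries), not the paper's implementation.

```python
def build_broadcast_cycle(request, pending, wait, K):
    """Sketch of steps i-iv: compute temperatures, sort by Temp_i * W_i, fill
    Bset up to at least K items, then order items so that transactions whose
    data sets overlap are broadcast together (overlap/rem heuristic).
    `request` maps transaction id -> set of data items (TD_i)."""
    temp = {t: sum(pending[d] for d in td) / len(td) for t, td in request.items()}
    order = sorted(request, key=lambda t: temp[t] * wait[t], reverse=True)

    # Step iii: select transactions until the cycle holds at least K items.
    bset, tlist = set(), []
    for t in order:
        if len(bset) >= K:
            break
        tlist.append(t)
        bset |= request[t]
    if not tlist:
        return []

    # Step iv: start with the hottest transaction, then repeatedly pick the
    # transaction with the largest overlap/rem ratio (ties broken by Temp*W).
    broadcast, remaining = [], list(tlist)

    def ratio(t):
        overlap = len(request[t] & set(broadcast))
        rem = len(request[t]) - overlap
        return (overlap / rem if rem else float("inf"), temp[t] * wait[t])

    current = remaining.pop(0)
    while True:
        broadcast += [d for d in sorted(request[current]) if d not in broadcast]
        if not remaining or len(broadcast) > K:
            break
        current = max(remaining, key=ratio)
        remaining.remove(current)
    return broadcast

# Example call (hypothetical workload):
# build_broadcast_cycle({"t1": {"d1", "d2", "d7"}, "t2": {"d1", "d2", "d3"}},
#                       {"d1": 5, "d2": 4, "d3": 3, "d7": 1},
#                       {"t1": 12.0, "t2": 7.0}, K=5)
```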

v. Indexing: The index is a directory of the data items to be broadcasted in the broadcast cycle. We adapt the (1, m) indexing [6] mechanism for our algorithm: the entire index is broadcasted at the beginning of every broadcast cycle, and an index is then broadcasted after every (K/m) broadcast slots, where K is the length of the current broadcast cycle. We refer to the index broadcasted at the beginning of every broadcast cycle as the major index and an index broadcasted inside the broadcast cycle after every (K/m) broadcast slots as a minor index. The indexing structure is shown in Figure 4. Every index identifies its own type, that is, whether it is a major or a minor index, and contains a pointer to the location of the next major index. A minor index contains the list of data items that have not yet been broadcasted in the current broadcast cycle. In a push-based system there is no concept of a minor index: all indexes are of the same size and contain the list of the next K elements to be broadcasted. In our algorithm, the ith minor index within the cycle contains the list of the (K − i×(K/m)) data items yet to be broadcasted in the cycle. The major index and all minor indexes that are broadcasted are stored at the server until the end of the current broadcast cycle; the minor indexes are also used for filtering out transactions that were not selected in Tlist but were nevertheless satisfied in the current broadcast cycle.

[Figure: structure of a broadcast cycle — major index, data items, minor index 1, data items, minor index 2, …, minor index m, data items, followed by the next major index; each index carries its type, a pointer to the next major index, and pointers to the data items.]
Figure 4. Indexing Structure
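A minimal sketch of this (1, m)-style index construction is shown below; it is illustrative only (the dictionary layout and the decision to emit m − 1 minor indexes inside the cycle are our assumptions).

```python
def build_indexes(broadcast, m):
    """Build a major index for the whole cycle plus minor indexes every K/m
    slots; each minor index lists only the items still to come in the cycle."""
    K = len(broadcast)
    step = max(1, K // m)
    major = {"type": "major", "items": list(broadcast)}
    minors = []
    for i in range(1, m):                      # m - 1 minor indexes in this sketch
        pos = i * step                         # slot after which the i-th minor index goes out
        minors.append({"type": "minor",
                       "position": pos,
                       "items": list(broadcast[pos:])})   # roughly K - i*(K/m) items remain
    return major, minors

major, minors = build_indexes(["d1", "d2", "d3", "d4", "d5", "d6"], m=3)
# major lists all 6 items; minor 1 (after slot 2) lists d3..d6, minor 2 lists d5..d6.
```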

vi. Broadcast data: Data is broadcasted in the order determined in the Broadcast set (step iv). The major index is broadcasted first, followed by the data; thereafter a minor index is broadcasted after every (K/m) broadcast ticks.

vii. Filtering transactions: At the end of the current broadcast cycle, transactions are removed from the Request set which were not selected in Tlist during the formation of the current broadcast cycle but are nevertheless completely satisfied by it. The transactions that are filtered out belong to two categories: first, transactions that arrived before the broadcast of the current cycle began; second, transactions that arrived after the broadcast of the current cycle began. To filter transactions of the first type, we check whether all the data required by the transaction is present in the Bset of the cycle: if for transaction ti, TDi ⊆ Bset, then ti is removed from the Request set. To filter out transactions that arrived during the current broadcast cycle, we use the first minor index that was broadcasted after the transaction arrived: if that minor index contains all the data required by transaction ti, then ti is removed from the Request set. If no minor index was broadcasted after the transaction arrived, then ti is retained in the Request set.

Procedure
Let Tbegin be the time at which the current broadcast cycle started and Ti the arrival time of transaction ti. For every transaction ti that arrived before the end of the current broadcast cycle:
If (Ti < Tbegin) then
  If (TDi ∩ Bset = TDi) then Request = Request − {ti}.
Else // the transaction arrived after the broadcast cycle began, i.e., Ti > Tbegin
  i. Select the first minor index (MI) broadcasted after time Ti.
  ii. If (TDi ∩ MI = TDi) then Request = Request − {ti}.
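The filtering step can be sketched as follows (again a hedged illustration; the representation of minor indexes as (broadcast_time, remaining_items) pairs is an assumption of ours).

```python
def filter_satisfied(request, arrival, bset, minor_indexes, t_begin):
    """Sketch of step vii: drop transactions that were fully satisfied by the
    cycle that just ended even though they were never placed in Tlist.
    `minor_indexes` is a list of (broadcast_time, set_of_remaining_items) pairs."""
    still_pending = {}
    for t, td in request.items():
        if arrival[t] < t_begin:
            satisfied = td <= bset                 # whole data set was in the cycle
        else:
            # first minor index broadcast after the transaction arrived, if any
            later = [items for when, items in minor_indexes if when >= arrival[t]]
            satisfied = bool(later) and td <= later[0]
        if not satisfied:
            still_pending[t] = td                  # transaction stays in the Request set
    return still_pending
```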

4.2 Protocol at the Client
A client sends the request for transaction ti with data set TDi to the server and, after sending the request, tunes in to the broadcast channel to download the index. There are two possible cases: (a) the index is a major index, or (b) it is a minor index. The client checks whether all the data required by the transaction is present in the current index; if it is, the client tunes in during the current cycle to download the required data items. Otherwise, if the index downloaded was the major index, the client sleeps for the entire current broadcast cycle and tunes in for the major index of the next cycle; if the index was a minor index, it sleeps for the remaining part of the current cycle and then downloads the major index of the next broadcast cycle.

Procedure

• Case 1: The transaction arrived in the current broadcast cycle.
  a. Tune in and download the next minor index.
  b. If (TDi ∩ Minor Index = TDi) then tune in and download the required data;
     else sleep and tune in for the next major index.

• Case 2: The transaction arrived in the previous broadcast cycle and no minor index has since been broadcasted.
  a. If (TDi ∩ Major Index = TDi) then tune in and download the required data;
     else sleep and tune in for the next major index.
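The client-side decision can be illustrated with a small Python sketch (not from the paper; the return strings simply describe the action the client would take).

```python
def client_decide(td, index_items, index_type):
    """After sending its request, the client downloads the next index and
    decides whether to stay tuned or sleep. `index_items` is the set of data
    items announced by that index; `index_type` is 'major' or 'minor'."""
    if td <= index_items:
        return "tune in and download the required data from this cycle"
    if index_type == "major":
        return "sleep for the whole cycle, wake up for the next major index"
    return "sleep for the rest of this cycle, wake up for the next major index"

# Transaction t1 = {d1, d2, d7}: a minor index announcing {d4, d5, d6, d7} cannot
# satisfy it, so the client sleeps until the next major index.
print(client_decide({"d1", "d2", "d7"}, {"d4", "d5", "d6", "d7"}, "minor"))
```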

5. EXPERIMENTAL RESULTS
5.1 Simulation Environment

Table 2. Simulation parameters and settings
Symbol   Default
λ        10 transactions/tick [5-25]
DBSize   10000
Tavg     6
Tmax     12
Period   5000 [3000-7000]
Pu       0.1
Uavg     6
Umax     12

We used a simulation model written in CSIM [8] to compare the performance of our scheme with other algorithms. The model represents an environment similar to that described in Section 3. The broadcast channel is modeled as a server with a fixed broadcast rate. We do not specify an absolute value for this rate but use the broadcast tick to measure simulated time; this emphasizes that the results are not tied to any particular bandwidth or data item size but describe tradeoffs among the algorithms. Scheduling overhead is not included in the results; in other words, we assume that all the algorithms are able to make scheduling decisions fast enough to keep the broadcast bandwidth fully utilized. In the model, the client population is represented by a transaction request stream. We use an open system model since our work is aimed at supporting large dynamic client populations; such populations cannot be modeled with a closed simulation system. The cost of using the back channel for sending transaction requests to the server is small and hence is not modeled. The main parameters and settings for the workloads used in the experiments are shown in Table 2. The client population model generates requests with exponential inter-arrival times with mean λ. The access pattern is shaped with the 80:20 [9] distribution rule, that is, 80 percent of the transactions access 20 percent of the data items. The requests are distributed over a database containing DBSize fixed-size pages. The average size of a transaction is 6. The update transactions have the same access pattern as the read-only transactions.

5.2 Comparison with FCFS, MRF and R×W Scheduling Schemes
We compare the performance of our scheme with the MRF (Most Request First), FCFS (First Come First Serve) and R×W scheduling schemes. MRF broadcasts the page with the maximum number of pending requests. FCFS broadcasts the data of the transactions in the order in which they arrived; if a data item required by a transaction has already been broadcasted after the transaction request arrived, that data item is not broadcasted again. R×W is explained in Section 2.

Figure 5a shows the comparison of average transaction wait time for the different scheduling algorithms. The transaction wait time for all the schemes does not vary much with the transaction arrival rate: for all schemes the wait time increases initially as λ goes from 5 to 10 transactions/tick, and thereafter an increase in λ does not affect it. Our scheme results in a 19% reduction in average transaction wait time compared to the FCFS scheme and a 17% reduction compared to MRF and R×W. Figure 5b shows the comparison of average tuning time for the different scheduling algorithms. Similar to the transaction wait time, the tuning time per transaction request for all the schemes does not vary much with the transaction arrival rate. Our scheme results in approximately 87% reduction in average tuning time compared to the FCFS, MRF and R×W schemes.

Figure 5. (a) Transaction arrival rate vs. average wait time; (b) transaction arrival rate vs. average tuning time.

Figure 6a shows the number of transactions aborted in each scheme due to an inconsistent view of the database. In our simulation of the FCFS, R×W and MRF scheduling schemes we use the immediate propagation [3] method for communicating updates to the clients: the server broadcasts an invalidate message for a data item as soon as that data item is updated at the server. If a client receives an invalidate message for a data item it has already downloaded, and the entire data set of the transaction to which that data item belongs has not yet been received, then that transaction is aborted. In FCFS nearly 65% of the transactions are aborted; in MRF and R×W nearly 80% of the transactions are aborted. Since in our scheme updates are executed at the end of the broadcast cycle, no client transaction is aborted.

Figure 6. (a) Transaction arrival rate vs. number of transactions aborted; (b) broadcast period vs. tuning time and waiting time.

5.3 Effect of Broadcast Period on Transaction Wait and Tuning Time
Figure 6b shows the effect of the broadcast period on our scheduling scheme. In this experiment we vary the broadcast period from 3000 to 7000 broadcast ticks. We observed that the average transaction wait time decreases as the broadcast period increases up to 5000, where it is minimum. This is because a client transaction has to download all of its data within a single broadcast cycle; as the size of the broadcast cycle decreases, fewer client transactions can be satisfied within one cycle. At broadcast periods between 3000 and 4000 the average waiting time is about 30% higher than at 5000, because at lower broadcast periods the chance of a transaction getting all of its data within one broadcast period is smaller, so client transactions have to wait for more broadcast cycles. For periods greater than 5000 the average wait time increases by about 6%. The following factors contribute to this: (a) a data item is broadcasted only once in a broadcast period; (b) for transactions that arrive during the period, the probability that some of their data items have already been broadcasted increases with the broadcast period, so such a transaction cannot get its entire data set in the current cycle and has to wait for the next cycle.

We observed that the average client tuning time per transaction increases with the broadcast period: it increases from 846 ticks to 1050 ticks as the broadcast period is increased from 3000 to 7000. As the size of the broadcast cycle increases, the transaction span increases, which in turn increases the transaction tuning time. The broadcast period has one more effect: as it increases, the execution of update transactions at the server is delayed further.

5.4 Effect of Shift in Hot Spot on Transaction Waiting Time
Figure 7 shows the effect of a shift in the hot spot on the average waiting time of transactions. In our simulation setup 20% of the data items form the hot spot and the rest of the data items are considered cold. In this experiment we shift the hot spot of the database by 1000 data items (10% of DBSize) after a periodic interval referred to as the shift interval. We observed that at higher shift intervals (15000 and above) the average waiting time of transactions was the same as with no shift in the hot spot. However, at shift intervals of less than 2×(broadcast period) the average waiting time of transactions increases. With broadcast periods of 4000 and 5000 and a shift interval of 5000, the average waiting time increased by approximately 18% compared to the case without a shift in the hot spot. With broadcast periods of 6000 and 7000 and a shift interval of 10000, the average waiting time increased by approximately 20% and 38%, respectively, compared to the case without a shift. We observed that the higher the broadcast period, the higher the performance degradation. This is because at a higher broadcast period the scheduling decisions are made after a longer interval of time; as a result, the current broadcast schedule does not completely represent the current request workload at the server in case of rapid changes in the hot spot.

Figure 7. Hot spot shift interval vs. average waiting time of transactions.

6. CONCLUSIONS AND FUTURE WORK
In this paper we have studied the problem of scheduling multi-item and transactional requests in an on-demand broadcast environment and proposed a scheduling algorithm for managing such requests. Previous works in this context on single-item requests assumed prior knowledge of a static client access pattern, which may not always be the case in reality. Hence our work is not only complementary to the previous works in this area but also provides new insight into information dissemination. We showed through simulation that our algorithm performs better than the common single-item scheduling algorithms. At present we are working on the optimization problem of dynamically determining the optimal broadcast cycle length.

7. REFERENCES
[1] D. Aksoy and M. Franklin. "Scheduling for Large-Scale On-Demand Data Broadcasting". In Proceedings of IEEE INFOCOM, CA, 1998.
[2] M. Franklin and S. Zdonik. "Dissemination-Based Information Systems". IEEE Data Engineering Bulletin, 19(3), September 1996.
[3] S. Acharya, R. Alonso, M. Franklin, and S. Zdonik. "Broadcast Disks: Data Management for Asymmetric Communication Environments". In Proceedings of the ACM SIGMOD Conference, CA, 1995.
[4] S. Acharya and S. Muthukrishnan. "Scheduling On-demand Data Broadcasts: New Metrics and Algorithms". In Proceedings of the Fourth Annual ACM/IEEE International Conference on Mobile Computing and Networking, 1998.
[5] V. Liberatore. "Multicast Scheduling for List Requests". In Proceedings of IEEE INFOCOM, CA, 2002.
[6] T. Imielinski, S. Viswanathan, and B. R. Badrinath. "Energy Efficient Indexing on Air". In Proceedings of the ACM SIGMOD Conference, 1994.
[7] Xuan, Sen, Gonzalez, Fernandez, and Ramamritham. "Broadcast on Demand: Efficient and Timely Dissemination of Data in Mobile Environments". IEEE RTAS'97.
[8] H. Schwetman. "CSIM: A C-Based, Process-Oriented Simulation Language". In Proceedings of the Winter Simulation Conference, 1986.
[9] J. Gray, P. Sundaresan, S. Englert, K. Baclawski, and P. Weinberger. "Quickly Generating Billion-Record Synthetic Databases". In Proceedings of the ACM SIGMOD Conference, Minneapolis, MN, May 1994.
[10] G. Herman, G. Gopal, K. Lee, and A. Weinrib. "The Datacycle Architecture for Very High Throughput Database Systems". In Proceedings of ACM SIGMOD, CA, 1987.
[11] K. Stathatos, N. Roussopoulos, and J. S. Baras. "Adaptive Data Broadcast in Hybrid Networks". In Proceedings of the 23rd VLDB Conference, Athens, Greece, 1997.
[12] S. Su and L. Tassiulas. "Broadcast Scheduling for Information Distribution". INFOCOM, 1997.
[13] M. Karakaya. "Evaluation of a Broadcast Scheduling Algorithm". Lecture Notes in Computer Science, 2151, 2001.
[14] Hughes Network Systems. DirecPC Home Page. http://www.direcpc.com, Jan. 2001.
[15] J. Wong. "Broadcast Delivery". Proceedings of the IEEE, 76(12), 1988.
[16] H. D. Dykeman, M. H. Ammar, and J. W. Wong. "Scheduling Algorithms for Videotex Systems under Broadcast Delivery". In Proceedings of ICC'86, 1986, pp. 1847-1851.
[17] N. Vaidya and S. Hameed. "Data Broadcast in Asymmetric Wireless Environments". In First International Workshop on Satellite-based Information Services (WOSBIS), 1996.
[18] N. Prabhu, D. Acharya, and V. Kumar. "Discovering and Using Web Services in M-commerce". In Proceedings of TES'04.
[19] E. Pitoura, P. K. Chrysanthis, and K. Ramamritham. "Characterizing the Temporal and Semantic Coherency of Broadcast-based Data Dissemination". In Proceedings of the International Conference on Database Theory, January 2003.
