Energy Efficient Mobile Computation Offloading via Online Prefetching

Seung-Woo Ko∗, Kaibin Huang∗, Seong-Lyun Kim† and Hyukjin Chae‡
∗Dept. of EEE, The University of Hong Kong, Hong Kong
†School of EEE, Yonsei University, S. Korea
‡LG Electronics, S. Korea
Email: [email protected]
Abstract—Conventional mobile computation offloading relies on offline prefetching that fetches user-specific data to the cloud prior to computing. For computing that depends on real-time inputs, the offline operation can result in fetching large volumes of redundant data over wireless channels and unnecessarily consume mobile-transmission energy. To address this issue, we propose the novel technique of online prefetching for a large-scale program with numerous tasks, which seamlessly integrates task-level computation prediction and real-time prefetching within the program runtime. The technique not only reduces mobile-energy consumption by avoiding excessive fetching but also shortens the program runtime by parallel fetching and computing enabled by prediction. By modeling the sequential task transitions in an offloaded program as a Markov chain, stochastic optimization is applied to design online-fetching policies that minimize mobile-energy consumption for transmitting fetched data over fading channels under a deadline constraint. The optimal policies for slow and fast fading are shown to have a similar threshold-based structure that selects candidates for the next task by applying a threshold on their likelihoods and furthermore uses the likelihoods to control the corresponding sizes of prefetched data. In addition, computation prediction for online prefetching is shown theoretically to always achieve energy reduction.

Index Terms—Online prefetching, prefetching gain, task-level prediction, threshold-based structure, stochastic optimization.

I. INTRODUCTION

Mobile computation offloading (MCO) refers to the offloading of computation-intensive tasks from mobile devices to the cloud, which lengthens their battery lives and enhances their computation capabilities [1]. The conventional MCO approach fetches the required data to the cloud prior to computing, called offline prefetching. However, this approach is no longer efficient for offloading emerging next-generation mobile applications (e.g., virtual reality and intelligent content delivery) that are highly sophisticated and capable of adapting to dynamic environments and human behaviors. The user-specific data they need depends on real-time inputs, and offline prefetching can result in mobiles transmitting large volumes of redundant data, thereby significantly shortening their battery lives. To address this issue, we propose a novel technique called online prefetching for large-scale mobile programs comprising numerous tasks, which prefetches the needed data on a task-by-task basis rather than for the whole program and thus minimizes redundant mobile transmissions. With cloud computing becoming a key feature of wireless networks, recent research on MCO adopts an interdisciplinary

approach integrating techniques from mobile computing and wireless communications. Decision policies have been derived for optimal MCO control building on underpinning algorithms for energy-efficient transmission and CPU-frequency control [2]. The design has been extended to cope with energy randomness for mobiles powered by either wireless power transfer [3] or energy harvesting [4]. Optimal MCO has also been studied for different implementations and architectures [5]–[7]. For cellular MCO systems with multiple users or cells, the allocation of computation and radio resources needs to be jointly designed, and the optimal policies for centralized allocation have been studied in [5], [6]. Alternatively, MCO can be based on distributed mobile control, which has been designed in [7] using game theory. The architecture of multiple heterogeneous clouds is considered in [8], where the MCO algorithm is designed to enable cooperative and cost-efficient resource sharing between the clouds to ensure quality-of-service for mobiles. The prior work all assumes offline prefetching based on a simplified model of wirelessly transmitting a controllable amount of data. Explicitly considering offline prefetching for specific applications has led to the development of more sophisticated techniques such as prefetching for low-latency MCO downlink [9] and opportunistic fetching of multimedia data [10]. However, as mentioned, offline prefetching is impractical for next-generation highly adaptive mobile applications with real-time inputs due to the difficulty of long-range computation prediction. This motivates the design of the online prefetching technique in this paper, which leverages short-range (task-by-task) prediction to reduce mobile transmission-energy consumption by avoiding the fetching of redundant data, lengthening prefetching durations, and exploiting opportunistic transmissions.
For exposition, we consider a single-user MCO system where an access point connected to the cloud fetches user-specific data from a mobile to run an offloaded program comprising a large set of potential tasks, called the task space. The tasks are assumed to be unknown to the cloud until their execution and are modeled as a sample path of a Markov chain over the task space. During the execution of a particular task, an online prefetcher in the cloud dynamically identifies a set of candidates for the subsequent task, with given likelihoods and sizes of required user-specific data, and fetches parts of their input data. The remaining input data of the subsequent

Fig. 1. A single-user MCO system.

task is fetched after the task is determined, called demand-fetching. Based on this model, we aim at optimizing the prefetching policies that select tasks for prefetching and control the corresponding prefetched data sizes under the criterion of minimum mobile energy consumption over wireless fading channels. First, the optimal prefetching policy is derived over slow fading channels where the channel gain stays constant throughout the program runtime. The policy has a threshold-based structure where the user-specific data for a candidate for the next task is prefetched only if its likelihood exceeds a given threshold, and the prefetched data size is a monotone increasing function of the likelihood. Next, we consider fast fading where the channel gain is independent and identically distributed (i.i.d.) over slots dividing each fetching duration. For the non-causal channel state information (CSI) case where the channels in the runtime are predictable, the optimal policy is derived in closed form and has a similar threshold-based structure as its slow-fading counterpart. We also consider the case of causal CSI in the full paper [11]. Last, the prefetching gain, defined as the energy ratio between the cases without and with prefetching, is shown to be larger than one.

II. SYSTEM MODEL

Consider the MCO system shown in Fig. 1 comprising a mobile and an access point connected to a cloud. The mobile attempts to execute a program comprising multiple tasks. These tasks have stringent deadlines such that they cannot be executed at the mobile due to its limited computation capabilities. The mobile thus needs to offload the tasks to the cloud, where the program is remotely executed. The mobile transmits data to and receives computation output from the cloud via the access point. The channel is modeled as follows. The channel gain in slot n is denoted as g_n with g_n > 0. We consider both slow and fast fading for the wireless channel.
For slow fading, the channel gain is a constant denoted as g: g_n = g for all n. Fast fading is modeled as block fading where channel gains are constant over one time slot and i.i.d. over different slots. Perfect CSI is assumed at both the mobile and the access point. For fast fading, perfect non-causal CSI over the prefetching duration is assumed to be acquired by channel prediction. Following the models in [2], the energy consumption for transmitting b_n bits to the cloud in slot n is modeled using a convex monomial function, denoted as E:

  E(b_n, g_n) = λ (b_n)^m / g_n,   (1)

where the constants m and λ represent the monomial order and the energy coefficient, respectively. The monomial order

Fig. 2. MCO operation: (a) MCO with online prefetching. (b) Conventional MCO with only demand-fetching.

m is a positive integer depending on the specific modulation-and-coding scheme.

III. ONLINE PREFETCHING

A. Architecture

The computation at the cloud requires fetching user-specific data (e.g., images, videos, and measured sensor values) from the mobile to the cloud, collectively called task data (TD). Depending on the instant when TD is delivered to the cloud, two fetching schemes are considered. One is prefetching, in which the mobile offloads part of a task's TD during the computation period of the previous task. The other is demand-fetching, in which the mobile offloads the un-prefetched TD of the task after the completion of the preceding task. The proposed online-prefetching architecture and the conventional fetching architecture are compared in Fig. 2. Let (K − 1) and K denote the indices of the current and next tasks, respectively. Time is divided into slots. The considered duration is divided into three sequential phases: N_P slots for prefetching of Task K in parallel with computing Task (K − 1), N_I slots for downloading computation results to the mobile, and N_D slots for demand-fetching of Task K. In contrast, in the conventional architecture without prefetching, fetching relies only on demand-fetching as shown in Fig. 2(b). Comparing Fig. 2(a) and Fig. 2(b), it is observed that prefetching lengthens the fetching duration from N_D slots to (N_P + N_D) slots, thereby reducing the transmission-energy consumption. Last, a latency constraint is applied such that the considered duration is no longer than N slots, i.e., N_P + N_D + N_I ≤ N. We assume that the access point can deliver the execution result within an extremely short interval (N_I ≈ 0), so that the demand-fetching period of N_D slots is approximately equal to (N − N_P) slots. The optimal prefetching policy depends on the specific sequence of executed tasks, which is unknown in advance and modeled as a stochastic sequence using the task-transition graph shown in Fig. 3.
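As a small illustration of this stochastic task model, the next task can be drawn from the candidate set according to the transition probabilities attached to the current state; the sketch below uses assumed probability values for illustration only.

```python
import numpy as np

# Illustrative sketch of the task-transition model: upon completing the
# current task, the next task is one of L candidate tasks (CTs), drawn
# according to the transition probabilities p(1), ..., p(L).
# The probability values below are assumed for illustration.
rng = np.random.default_rng(1)
p = np.array([0.5, 0.3, 0.15, 0.05])   # transition probabilities, summing to 1
next_task = rng.choice(len(p), p=p)    # index of the realized Task K among the CTs
```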
The states of the graph correspond to potential tasks, and the arrows are feasible sequential transitions between tasks, whose weights give the transition probabilities. Consider the current state being the execution of Task (K − 1). There exist L candidate tasks (CTs) for Task K with corresponding transition probabilities denoted as p(1), p(2), ..., p(L), where Σ_{ℓ=1}^L p(ℓ) = 1. Among the CTs, one is selected as Task K upon the completion of Task (K − 1). At the current state, however, the realization of Task K is not yet known. The online prefetching policy is designed in the sequel to maximize the utility of prefetched data.

Fig. 3. Task execution graph.

Given the current state being Task (K − 1), let γ(ℓ) denote the size of TD (in bits) if CT ℓ is selected as Task K. We assume that TD is for exclusive use, namely that the TD of CT ℓ is useless for the other CTs. The total TD satisfies the constraint Σ_{ℓ=1}^L γ(ℓ) = Γ, where the constant Γ represents the sum of TDs over all CTs. The TD for CT ℓ is divided into prefetching and demand-fetching parts, denoted as α(ℓ) and β(ℓ), respectively:

  γ(ℓ) = α(ℓ) + β(ℓ).   (2)

During the first N_P slots, the mobile performs prefetching by transmitting Σ_{ℓ=1}^L α(ℓ) bits. During the remaining N_D slots, the mobile performs demand-fetching by transmitting β(ℓ) bits, given that CT ℓ is chosen as Task K.

B. Problem Formulation

Let α = [α(1), ..., α(L)] and β = [β(1), ..., β(L)] respectively denote the prefetching and demand-fetching vectors, where α + β = γ = [γ(1), ..., γ(L)]. Recall that b_n indicates the number of bits the mobile transmits to the cloud in slot n. Given α and Σ_{n=1}^{N_P} b_n = Σ_{ℓ=1}^L α(ℓ), the expected energy consumption of prefetching, denoted by E_P, is given as

  E_P(α) = E_g[ Σ_{n=1}^{N_P} E(b_n, g_n) ]   (3)

with the energy function E given in (1). For demand-fetching, one of the CTs, say CT ℓ, is chosen as Task K. This results in the energy consumption of demand-fetching, denoted as E_D, given as

  E_D(β(ℓ)) = E_g[ Σ_{n=N_P+1}^{N} E(b_n, g_n) ]   (4)

with Σ_{n=N_P+1}^{N} b_n = β(ℓ).

The online-prefetching design is formulated as a two-stage stochastic optimization problem under the criterion of maximum energy efficiency. In the first stage, consider the design of the optimal demand-fetcher conditioned on the prefetching decisions. Given the prefetching vector α and the realization of Task K as CT ℓ, the total number of bits for demand-fetching is obtained as β(ℓ) = γ(ℓ) − α(ℓ). The design is then translated into the problem of optimal transmission over the (N − N_P) demand-fetching slots:

(P1)  min_{b_{N_P+1}, ..., b_N}  E_D(β(ℓ))
      s.t.  Σ_{n=N_P+1}^{N} b_n = β(ℓ).

Let E*_D(β(ℓ)) represent the solution of Problem P1. In the second stage, the prefetching policy is optimized to minimize the overall energy consumption for fetching. The corresponding optimization problem is formulated as follows:

(P2)  min_{α, β}  E_P(α) + Σ_{ℓ=1}^L E*_D(β(ℓ)) p(ℓ)
      s.t.  α + β = γ,   α, β ≥ 0.
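To make the formulation concrete, the monomial energy model in (1) and the structure of Problem P1 under slow fading can be explored numerically. The sketch below uses assumed illustrative values (m = 2, λ = 1, and arbitrary choices of β(ℓ), gain, and slot count) and checks that an equal split of the demand-fetched bits never costs more than a random split, a consequence of the convexity of (1).

```python
import numpy as np

# Monomial transmission-energy model of Eq. (1): E(b, g) = lam * b**m / g.
# m = 2 and lam = 1 are assumed illustrative values (the simulations in
# Sec. VI also use m = 2, lam = 1).
M, LAM = 2, 1.0

def slot_energy(b, g):
    return LAM * b**M / g

def demand_energy(bits_per_slot, g):
    # Total energy of a demand-fetching phase under slow fading (constant gain g).
    return sum(slot_energy(b, g) for b in bits_per_slot)

# Numerical check of Problem P1 under slow fading: splitting beta bits
# equally over the (N - N_P) demand-fetching slots never costs more than
# a random nonnegative split, by convexity of (1).
rng = np.random.default_rng(0)
beta, slots, g = 20.0, 6, 0.8     # illustrative beta(l), N - N_P, and gain
equal = demand_energy([beta / slots] * slots, g)
for _ in range(1000):
    w = rng.dirichlet(np.ones(slots))        # random split of beta over the slots
    assert demand_energy(beta * w, g) >= equal - 1e-9
```

The equal split also matches the closed-form energy of Lemma 1 for these parameters.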

The optimal online prefetching policy is designed in the sequel by solving Problems P1 and P2.

IV. ONLINE PREFETCHING OVER SLOW FADING CHANNELS

This section aims at designing the optimal online prefetching policy over slow fading channels, where the channel gain remains constant over the fetching period of N slots. We derive the mathematical expression of the optimal prefetching vector α* and quantify the prefetching gain.

A. Optimal Prefetching Policy for Slow Fading

To facilitate the policy derivation for slow fading, two lemmas are provided as follows.

Lemma 1. Given slow fading, equal-bit allocation over the demand-fetching slots, namely b_{N_P+1} = ··· = b_N = β(ℓ)/(N − N_P), is optimal and solves Problem P1. The resultant energy consumption for demand-fetching is

  E*_D(β(ℓ)) = (λ/g) · β(ℓ)^m / (N − N_P)^{m−1}.   (5)

Sketch of Proof: A lower bound on E_D in (4) is derived using the inequality of arithmetic and geometric means, which is shown to be achievable when {b_n} are equal.

Lemma 2. Given slow fading, Problem P2 is equivalent to:

(P3)  min_α  ( Σ_{ℓ=1}^L α(ℓ) )^m / (N_P)^{m−1} + Σ_{ℓ=1}^L p(ℓ)(γ(ℓ) − α(ℓ))^m / (N − N_P)^{m−1}
      s.t.  0 ⪯ α ⪯ γ.

Sketch of Proof: Substituting (5) of Lemma 1 into the objective function of P2 and using the inequality of arithmetic and geometric means yield the desired result.

Lemma 2 shows that the offloaded bits b_n are evenly distributed over multiple slots, removing the variable β from P2 and thereby allowing tractable policy analysis. The main result is given in the following proposition.

Proposition 1. (Optimal Prefetching for Slow Fading) Given slow fading, the optimal prefetching vector α* = [α*(1), ..., α*(L)] is

  α* = [ γ − p^{−1/(m−1)} ((N − N_P)/N_P) α*_Σ ]^+,   (6)

where p = [p(1), p(2), ..., p(L)] is the task-transition probability vector and α*_Σ is the optimal total number of prefetched bits, α*_Σ = Σ_{ℓ=1}^L α*(ℓ).

Sketch of Proof: Consider Problem P3 and define a Lagrangian function with Lagrange multipliers {μ(ℓ)} associated with constraint (2). Using the slackness condition, the optimal multiplier μ(ℓ) is proved to be zero for all ℓ, yielding the desired result.

The optimal prefetching vector α* in (6) determines the prefetching-task set S defined in Definition 1, and vice versa. Exploiting this relation, α* and S can be computed using the simple iterative procedure presented in Algorithm 1, where the needed prefetching-priority function is defined in Definition 2. The existence of α* (or equivalently S) is shown in Corollary 1. In addition, given α* and S, the optimal total number of prefetched bits α*_Σ is given as

  α*_Σ = Σ_{ℓ∈S} γ(ℓ) / ( 1 + ((N − N_P)/N_P) Σ_{ℓ∈S} p(ℓ)^{−1/(m−1)} ).   (7)

Definition 1. (Prefetching-Task Set) The prefetching-task set, denoted as S, is defined as the set of prefetched CTs: S = {ℓ | α*(ℓ) > 0, 1 ≤ ℓ ≤ L}.

Definition 2. (Prefetching-Priority Function) The prefetching-priority function of CT ℓ is defined as δ(ℓ) = γ(ℓ) p(ℓ)^{1/(m−1)}, which determines the prefetching order (see Algorithm 1).

Corollary 1. (Existence of the Optimal Prefetching Vector) A unique optimal prefetching vector α* satisfying (6) always exists in the range 0 ⪯ α* ⪯ γ,¹ where the first and second equalities hold only when N → ∞ or N = N_P, respectively.

Sketch of Proof: First, it is obvious that α* = γ when N = N_P. Consider N > N_P. It is easily proved that α* in (6) is a continuous and monotone decreasing function of α*_Σ, vanishing at α*_Σ = max_ℓ{γ(ℓ) p(ℓ)^{1/(m−1)}} · N_P/(N − N_P), whereas α*_Σ is a continuous and monotone increasing function of α* from 0 to Γ. Thus, a unique optimal prefetching vector α* must exist.

Remark 1. (Prefetched or Not?) Corollary 1 shows that the optimal prefetching vector α* is strictly positive if N is finite. In other words, prefetching in a slow fading channel is always beneficial for a computation task with a finite latency constraint. For a latency-tolerant task (N → ∞), the mobile need not perform prefetching (α* = 0).

Remark 2. (Partial or Full Prefetching?) Corollary 1 shows that if N > N_P, the optimal prefetching vector α* is strictly less than γ, because the remaining β(ℓ) bits can be delivered during the demand-fetching duration of (N − N_P) slots. If N = N_P, on the other hand, full prefetching (α* = γ) is obviously optimal because only prefetching is possible.

B. Prefetching Gain for Slow Fading

In order to quantify how much prefetching increases the energy efficiency, we define the prefetching gain as follows.

¹The symbols ⪯ in Corollary 1 and ≻ in Algorithm 1 denote elementwise inequalities.

Algorithm 1 Finding the optimal prefetching vector and prefetching-task set for slow fading
1: Arrange the CTs in descending order of the prefetching-priority function δ in Definition 2, such that δ(ℓ_1) ≥ δ(ℓ_2) ≥ ··· ≥ δ(ℓ_L).
2: Set ℓ = 0 and S = ∅.
3: while |S| < L do
4:   ℓ = ℓ + 1 and S = S ∪ {ℓ}.
5:   Compute α*_Σ and α* using (7) and (6), respectively.
6:   Count the number of positive elements in α*, namely |α* ≻ 0|.
7:   if |α* ≻ 0| = |S| then
8:     break
9:   end if
10: end while
11: return α* and S.
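Algorithm 1 together with (6) and (7) admits a compact implementation. The sketch below is illustrative rather than definitive: the function and variable names are our own, m = 2 is an assumed default, and strictly positive probabilities p(ℓ) are assumed.

```python
import numpy as np

def optimal_prefetching_slow(gamma, p, n, n_p, m=2):
    """Sketch of Algorithm 1 using Eqs. (6) and (7); assumes p(l) > 0 for all l."""
    gamma, p = np.asarray(gamma, float), np.asarray(p, float)
    ratio = (n - n_p) / n_p                    # (N - N_P)/N_P
    weight = p ** (-1.0 / (m - 1))             # p^{-1/(m-1)}, elementwise
    # Step 1: sort CTs in descending order of the priority delta(l) of Definition 2.
    order = np.argsort(-(gamma * p ** (1.0 / (m - 1))))
    alpha = np.zeros_like(gamma)
    for size in range(1, len(gamma) + 1):
        s = order[:size]                       # candidate prefetching-task set S
        # Eq. (7): optimal total number of prefetched bits for this S.
        a_sigma = gamma[s].sum() / (1.0 + ratio * weight[s].sum())
        # Eq. (6): per-CT prefetched bits, clipped elementwise at zero.
        alpha = np.maximum(gamma - weight * ratio * a_sigma, 0.0)
        # Stopping rule of Algorithm 1: |alpha > 0| must match |S|.
        if np.count_nonzero(alpha > 0) == size:
            break
    return alpha, set(int(i) for i in order[:np.count_nonzero(alpha > 0)])
```

For the symmetric case γ(ℓ) = Γ/L and p(ℓ) = 1/L with m = 2, N = 5, N_P = 4, and Γ = 20, this sketch prefetches every CT equally, consistent with (6) and (7).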

Definition 3. (Prefetching Gain) The prefetching gain G_P is defined as the energy-consumption ratio between MCO without and with prefetching:

  G_P = Σ_{ℓ=1}^L E*_D(γ(ℓ)) p(ℓ) / ( E_P(α*) + Σ_{ℓ∈S} p(ℓ) E*_D(β(ℓ)) ).   (8)

The prefetching gain G_P depends on several factors including the latency requirement N, the prefetching duration N_P, and the number of CTs L. The following result specifies the relationship mathematically.

Proposition 2. (Prefetching Gain for Slow Fading) Given slow fading, the prefetching gain satisfies

  G_P ≥ [ (N − N_P(1 − L^{−m/(m−1)})) / (N − N_P) ]^{m−1},   (9)

where the equality holds if γ(ℓ) = Γ/L and p(ℓ) = 1/L for all ℓ.

Sketch of Proof: The numerator of G_P in (8), the energy consumption of MCO without prefetching, is minimized when γ(ℓ) = Γ/L for all ℓ. On the other hand, the denominator, the energy consumption of MCO with prefetching, is maximized at the same setting. The lower bound of the prefetching gain G_P thus follows from this setting.

Remark 3. (Effects of Prefetching Parameters) The prefetching gain G_P in (9) shows the effects of the different parameters. On one hand, prefetching lengthens the fetching duration from (N − N_P) to N slots, reducing the transmission rate and thus the resultant energy consumption. On the other hand, the mobile pays the prefetching cost N_P(1 − L^{−m/(m−1)}) due to transmitting redundant bits. The gain outweighs the cost, and hence prefetching is beneficial.

V. ONLINE PREFETCHING OVER FAST FADING CHANNELS

In this section, we derive the optimal prefetching policy that solves Problem P2 for fast fading, where channel gains are i.i.d. over different slots. To this end, we first derive the optimal demand-fetching policy by solving Problem P1, and then solve Problem P2 using the techniques of convex and stochastic optimization.

A. Optimal Demand-Fetching Policy for Fast Fading

The optimal demand-fetching policy is derived by solving Problem P1 as follows. The demand-fetcher allows the mobile to transmit β(ℓ) = γ(ℓ) − α(ℓ) bits from slot (N_P + 1) to slot N. Designing the optimal demand-fetcher in Problem P1 for fast fading is thus equivalent to finding the optimal non-causal scheduler to deliver β(ℓ) bits within (N − N_P) slots.

Proposition 3. (Optimal Demand-Fetching for Fast Fading) Given fast fading and the selected task, Task K, being CT ℓ, the optimal bit allocation over the demand-fetching slots is

  b*_n = β(ℓ) (g_n)^{1/(m−1)} / Σ_{k=N_P+1}^{N} (g_k)^{1/(m−1)},   n = N_P + 1, ..., N.   (10)

The corresponding expected energy consumption is

  E*_D(β(ℓ)) = λ β(ℓ)^m / ( Σ_{k=N_P+1}^{N} (g_k)^{1/(m−1)} )^{m−1}.   (11)
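The allocation in (10) and the closed form in (11) can be sketched and cross-checked numerically. The function names below are our own, and m = 2, λ = 1 are assumed defaults.

```python
import numpy as np

# Sketch of the optimal non-causal demand-fetcher of Proposition 3.
def demand_fetch_bits(beta, gains, m=2):
    # Eq. (10): allocate bits proportionally to g_n^{1/(m-1)} over the
    # demand-fetching slots (better slots carry more bits).
    g = np.asarray(gains, float) ** (1.0 / (m - 1))
    return beta * g / g.sum()

def demand_fetch_energy(beta, gains, m=2, lam=1.0):
    # Eq. (11): the resulting energy for delivering beta bits.
    g = np.asarray(gains, float) ** (1.0 / (m - 1))
    return lam * beta**m / g.sum() ** (m - 1)
```

Summing the per-slot energies λ(b*_n)^m / g_n of the allocation in (10) recovers the closed form in (11).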

Sketch of Proof: Consider Problem P1 and define a Lagrangian function with a Lagrange multiplier τ associated with the constraint Σ_{n=N_P+1}^{N} b_n = β(ℓ). Using the slackness condition, it is proved that the optimal number of demand-fetched bits in slot n, b*_n, is always strictly positive. Given τ, b*_n(τ) is derived by setting the partial derivative of the Lagrangian function to zero. The optimal Lagrange multiplier τ* is then obtained by substituting b*_n(τ) into the above constraint.

B. Optimal Online Prefetching for Fast Fading

To facilitate the policy derivation for fast fading, we substitute (11) of Proposition 3 into the objective function of P2, yielding the following lemma.

Lemma 3. Given fast fading, Problem P2 is equivalent to:

(P4)  min_{α, {b_n}}  λ Σ_{n=1}^{N_P} (b_n)^m / g_n + λ Σ_{ℓ=1}^L p(ℓ)(γ(ℓ) − α(ℓ))^m / ( Σ_{k=N_P+1}^{N} (g_k)^{1/(m−1)} )^{m−1}
      s.t.  Σ_{n=1}^{N_P} b_n = Σ_{ℓ=1}^L α(ℓ).

Lemma 3 shows that the prefetching vector α should be jointly optimized with the numbers of bits {b_n} allocated over the prefetching slots, which vary with the channel gains {g_n}. The main result is given in the following proposition.

Proposition 4. (Optimal Prefetching for Fast Fading) Given fast fading, the optimal prefetching vector α* and the optimal bit allocation over the prefetching slots {b*_n} are respectively

  α* = [ γ − p^{−1/(m−1)} ( Σ_{k=N_P+1}^{N} (g_k)^{1/(m−1)} / Σ_{k=1}^{N_P} (g_k)^{1/(m−1)} ) α*_Σ ]^+,   (12)

  b*_n = α*_Σ (g_n)^{1/(m−1)} / Σ_{k=1}^{N_P} (g_k)^{1/(m−1)},   n = 1, ..., N_P,   (13)

where the optimal total number of prefetched bits α*_Σ = Σ_{ℓ=1}^L α*(ℓ) is given as

  α*_Σ = Σ_{ℓ∈S} γ(ℓ) / ( 1 + ( Σ_{k=N_P+1}^{N} (g_k)^{1/(m−1)} / Σ_{k=1}^{N_P} (g_k)^{1/(m−1)} ) Σ_{ℓ∈S} p(ℓ)^{−1/(m−1)} ).   (14)
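A minimal sketch of (12)–(14) follows; it is illustrative only, assumes strictly positive p(ℓ), and, for simplicity, assumes parameters for which no clipping occurs so that the prefetching-task set S contains all CTs (the iterative set search of Algorithm 1 is not repeated here).

```python
import numpy as np

def optimal_prefetching_fast(gamma, p, g_pre, g_demand, m=2):
    """Sketch of Proposition 4 (Eqs. (12)-(14)) with non-causal CSI."""
    gamma, p = np.asarray(gamma, float), np.asarray(p, float)
    gp = np.asarray(g_pre, float) ** (1.0 / (m - 1))      # prefetching slots 1..N_P
    gd = np.asarray(g_demand, float) ** (1.0 / (m - 1))   # demand slots N_P+1..N
    ratio = gd.sum() / gp.sum()   # replaces (N - N_P)/N_P of the slow-fading case
    weight = p ** (-1.0 / (m - 1))
    # Eq. (14): optimal total number of prefetched bits (S = all CTs assumed).
    a_sigma = gamma.sum() / (1.0 + ratio * weight.sum())
    # Eq. (12): per-CT prefetched bits, clipped elementwise at zero.
    alpha = np.maximum(gamma - weight * ratio * a_sigma, 0.0)
    # Eq. (13): bit allocation over the prefetching slots.
    b = a_sigma * gp / gp.sum()
    return alpha, b
```

With all gains equal to one, the ratio of gain sums reduces to (N − N_P)/N_P and the result coincides with the slow-fading solution of Proposition 1, as Remark 4 suggests.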

Sketch of Proof: Consider Problem P4 and define a Lagrangian function with a Lagrange multiplier ρ associated with the constraint Σ_{n=1}^{N_P} b_n = Σ_{ℓ=1}^L α(ℓ). Given ρ, the optimal prefetching vector α*(ρ) and the optimal bit allocation in slot n, b*_n(ρ), can be derived by following the same steps as in Propositions 1 and 3, respectively. Lastly, the optimal multiplier ρ* can be obtained using the above constraint.

Remark 4. (Threshold-Based Structure) Comparing Propositions 1 and 4, the optimal prefetching policies for slow and fast fading have an identical threshold-based structure where CT ℓ is prefetched only when its TD size γ(ℓ) exceeds a specified threshold, which is proportional to the optimal total number of prefetched bits α*_Σ. The prefetching properties derived for slow fading thus also hold for fast fading, including Algorithm 1 for finding the optimal prefetching-task set S after replacing (6) and (7) with (12) and (14), respectively.

C. Prefetching Gain for Fast Fading

The prefetching gain defined in Definition 3 can be quantified for fast fading as follows.

Proposition 5. (Prefetching Gain for Fast Fading) Given fast fading, the prefetching gain satisfies

  G_P ≥ [ ( Σ_{k=1}^{N} (g_k)^{1/(m−1)} − (1 − L^{−m/(m−1)}) Σ_{k=1}^{N_P} (g_k)^{1/(m−1)} ) / Σ_{k=N_P+1}^{N} (g_k)^{1/(m−1)} ]^{m−1},   (15)

where the equality holds if γ(ℓ) = Γ/L and p(ℓ) = 1/L for all ℓ.

Sketch of Proof: The result follows by the same steps as in Proposition 2.

Corollary 2. (Comparison between Slow and Fast Fading) The expected lower bound of the prefetching gain G_P for fast fading in (15) is strictly larger than the lower bound of the prefetching gain for slow fading in (9).

Sketch of Proof: The lower bound of E[G_P] is derived as

  E[G_P] ≥ E[ ( 1 + L^{−m/(m−1)} Σ_{k=1}^{N_P} (g_k)^{1/(m−1)} / Σ_{k=N_P+1}^{N} (g_k)^{1/(m−1)} )^{m−1} ]
         > ( 1 + L^{−m/(m−1)} E[ Σ_{k=1}^{N_P} (g_k)^{1/(m−1)} ] / E[ Σ_{k=N_P+1}^{N} (g_k)^{1/(m−1)} ] )^{m−1},   (a)

where (a) follows from Jensen's inequality. Noting that the last expression equals the bound in (9), we have the result.

Remark 5. (Effects of Fast Fading on the Prefetching Gain) The prefetching gain G_P for fast fading is always larger than its slow-fading counterpart. This suggests that fast fading enables opportunistic prefetching that exploits temporal channel diversity to enhance the gain.

VI. SIMULATION RESULTS

Simulation results are presented to evaluate the performance of MCO with online prefetching. In the simulation, the channel gain g_n follows the Gamma distribution with shape parameter k > 1 and probability density function f_g(x) = x^{k−1} e^{−kx} / ((1/k)^k Γ(k)), where Γ(k) = ∫_0^∞ x^{k−1} e^{−x} dx is the Gamma function, so that the mean E[g_n] = 1. The parameter is set as k = 2.
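Before turning to the curves, the slow-fading gain can be verified end-to-end in the symmetric case γ(ℓ) = Γ/L and p(ℓ) = 1/L, where the bound of Eq. (9) holds with equality. The parameter values below (m = 2, λ = 1, N = 5, N_P = 4, L = 4, Γ = 20) mirror the simulation setup but are otherwise illustrative.

```python
import numpy as np

# Sketch reproducing the slow-fading prefetching gain in the symmetric case,
# where Eq. (9) holds with equality. Parameter values are assumed for illustration.
m, lam, g = 2, 1.0, 1.0
n, n_p, L, total = 5, 4, 4, 20.0
gamma = np.full(L, total / L)
p = np.full(L, 1.0 / L)

ratio = (n - n_p) / n_p
a_sigma = gamma.sum() / (1 + ratio * (p ** (-1.0 / (m - 1))).sum())   # Eq. (7)
alpha = gamma - p ** (-1.0 / (m - 1)) * ratio * a_sigma               # Eq. (6); no clipping here
beta = gamma - alpha

# Without prefetching: expected demand-fetching energy over (N - N_P) slots, Eq. (5).
e_without = lam * (p * gamma**m).sum() / (g * (n - n_p) ** (m - 1))
# With prefetching: equal-bit prefetching over N_P slots plus residual demand-fetching.
e_with = lam * alpha.sum() ** m / (g * n_p ** (m - 1)) \
    + lam * (p * beta**m).sum() / (g * (n - n_p) ** (m - 1))
gain = e_without / e_with
bound = ((n - n_p * (1 - L ** (-m / (m - 1)))) / (n - n_p)) ** (m - 1)  # Eq. (9)
```

For these values the computed gain and the bound coincide, and both exceed one, consistent with Remark 1.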


Fig. 4. The effects of MCO parameters on the (mobile) expected energy consumption for the cases of slow and fast fading: (a) TD size Γ with NP = 4, N = 5 and L = 4; (b) the number of CTs L with NP = 4, N = 5 and Γ = 20; (c) the latency requirement N (in slot) with NP = 4, L = 4 and Γ = 20; (d) the prefetching duration NP (in slot) with N = 10, L = 4 and Γ = 20.

The monomial order and the energy coefficient of the energy-consumption model in (1) are set as m = 2 and λ = 1, respectively. The transition probabilities {p(ℓ)} and the data sizes {γ(ℓ)} are generated based on the uniform distribution. The curves of the expected energy consumption (in dB) are plotted against different MCO parameters in the separate sub-figures of Fig. 4. Since the vertical axes of the figures are on a logarithmic scale, the vertical gaps between the curves measure the expected prefetching gain E[G_P] over the random task sets. First, one can observe that the prefetching gain is always larger than one in all sub-figures. Second, the prefetching gain for fast fading is always larger than that for slow fading. Third, with both axes on logarithmic scales, the straight lines in Fig. 4(a) imply that the expected energy consumption increases as a monomial of the TD size, which agrees with the energy-consumption model in (1). Fourth, the expected prefetching gain is observed from Fig. 4(b) to be more than 5 dB when the number of CTs L is small, but the gain diminishes as L increases. Fifth, Fig. 4(c) shows that the prefetching gain diminishes as the latency requirement is relaxed. In other words, prefetching is beneficial in scenarios with relatively stringent latency requirements. Last, the gain grows with the prefetching duration in Fig. 4(d).

VII. CONCLUSION

A novel architecture of online prefetching for mobile computation offloading has been proposed to enable prefetching based on task-level computation prediction and its simultaneous operation with cloud computing. Given stochastic sequential tasks, the optimal prefetching policies have been designed to minimize mobile energy consumption over fading channels under the assumption of non-causal CSI. The simple threshold-based structure of the policies has been derived, which enables low-complexity online operation.
Comprehensive simulations show that online prefetching achieves significantly greater mobile energy reduction than the conventional scheme without prefetching.

ACKNOWLEDGMENT

This work was supported by LG Electronics, the National Research Foundation of Korea (NRF-2014R1A2A1A11053234), and the Institute for Information & communications Technology Promotion (2015-0-00294).

REFERENCES

[1] K. Kumar and Y.-H. Lu, “Cloud computing for mobile users: Can offloading computation save energy?” IEEE Computer, vol. 43, no. 4, pp. 51–56, Apr. 2010.
[2] W. Zhang, Y. Wen, K. Guan, D. Kilper, H. Luo, and D. Wu, “Energy-optimal mobile cloud computing under stochastic wireless channel,” IEEE Trans. Veh. Technol., vol. 12, no. 9, pp. 4569–4581, Sep. 2013.
[3] C. You, K. Huang, and H. Chae, “Energy efficient mobile cloud computing powered by wireless energy transfer,” IEEE J. Sel. Areas Commun., vol. 34, no. 5, pp. 1757–1771, May 2016.
[4] Y. Mao, J. Zhang, and K. B. Letaief, “Dynamic computation offloading for mobile-edge computing with energy harvesting devices,” IEEE J. Sel. Areas Commun., vol. 34, no. 12, pp. 3590–3605, Dec. 2016.
[5] S. Sardellitti, G. Scutari, and S. Barbarossa, “Joint optimization of radio and computational resources for multicell mobile-edge computing,” IEEE Trans. Signal Inf. Process. over Netw., vol. 1, no. 2, pp. 89–103, Jun. 2015.
[6] C. You, K. Huang, H. Chae, and B.-H. Kim, “Energy-efficient resource allocation for mobile-edge computation offloading,” to appear in IEEE Trans. Wireless Commun. [Online]. Available: http://ieeexplore.ieee.org/document/7762913/
[7] X. Chen, L. Jiao, W. Li, and X. Fu, “Efficient multi-user computation offloading for mobile-edge cloud computing,” IEEE Trans. Netw., vol. 24, no. 5, pp. 2795–2808, Oct. 2016.
[8] R. Kaewpuang, D. Niyato, P. Wang, and E. Hossain, “A framework for cooperative resource management in mobile cloud computing,” IEEE J. Sel. Areas Commun., vol. 31, no. 12, pp. 2685–2700, Dec. 2013.
[9] P. Shu, F. Liu, H. Jin, M. Chen, F. Wen, and Y. Qu, “eTime: Energy-efficient transmission between cloud and mobile devices,” in Proc. IEEE Int. Conf. Comput. Commun., Turin, Italy, Apr. 2013, pp. 195–199.
[10] N. Master, A. Dua, D. Tsamis, J. P. Singh, and N. Bambos, “Adaptive prefetching in wireless computing,” IEEE Trans. Wireless Commun., vol. 15, no. 5, pp. 3296–3310, May 2016.
[11] S.-W. Ko, K. Huang, S.-L. Kim, and H. Chae, “Live prefetching for mobile computation offloading,” to appear in IEEE Trans. Wireless Commun. [Online]. Available: https://arxiv.org/abs/1608.04878

approach integrating techniques from mobile computing and wireless communications. Decision policies for optimal MCO control have been derived building on underpinning algorithms for energy-efficient transmission and CPU-frequency control [2]. The design has been extended to cope with energy randomness for mobiles powered by either wireless power transfer [3] or energy harvesting [4]. Optimal MCO has also been studied for different implementations and architectures [5]–[7]. For cellular MCO systems with multiple users or multiple cells, the allocation of computation and radio resources needs to be jointly designed, and the optimal policies for centralized allocation have been studied in [5], [6]. Alternatively, MCO can be based on distributed mobile control, designed in [7] using game theory. The architecture of multiple heterogeneous clouds is considered in [8], where the MCO algorithm is designed to enable cooperative and cost-efficient resource sharing between the clouds to ensure quality-of-service for mobiles. The prior work all assumes offline prefetching, which is based on a simplified model of wirelessly transmitting a controllable amount of data. Explicitly considering offline prefetching for specific applications has led to the development of more sophisticated techniques, such as prefetching for low-latency MCO downlink [9] and opportunistic fetching of multimedia data [10]. However, as mentioned, offline prefetching is impractical for next-generation highly adaptive mobile applications with real-time inputs, due to the difficulty of long-range computation prediction. This motivates the design of the online prefetching technique in this paper, which leverages short-range (task-by-task) prediction to reduce mobile transmission-energy consumption by avoiding the fetching of redundant data, lengthening prefetching durations, and exploiting opportunistic transmissions.
For exposition, we consider a single-user MCO system where an access point connected to the cloud fetches user-specific data from a mobile to run an offloaded program comprising a large set of potential tasks, called the task space. The tasks are assumed to be unknown to the cloud until their execution and are modeled as a sample path over a Markov chain in the task space. During the execution of a particular task, an online prefetcher in the cloud dynamically identifies a set of candidates for the subsequent task, with given likelihoods and sizes of required user-specific data, and fetches parts of their input data. The remaining input data of the subsequent

Fig. 1. A single-user MCO system (online-prefetching architecture).
task is fetched after it is determined, called demand-fetching. Based on this model, we aim at optimizing the prefetching policies that select tasks for prefetching and control the corresponding prefetched data sizes under the criterion of minimum mobile energy consumption over wireless fading channels. First, the optimal prefetching policy is derived over slow fading channels, where the channel gain stays constant throughout the program runtime. The policy has a threshold-based structure: the user-specific data for a candidate for the next task is prefetched only if its likelihood exceeds a given threshold, and the prefetched data size is a monotone-increasing function of the likelihood. Next, we consider fast fading, where the channel gain is independent and identically distributed (i.i.d.) over the slots dividing each fetching duration. Given non-causal channel state information (CSI), where the channels in the runtime are predictable, the optimal policy is derived in closed form and has a threshold-based structure similar to its slow-fading counterpart. We also consider the case of causal CSI in the full paper [11]. Last, the prefetching gain, defined as the energy ratio between the cases without and with prefetching, is shown to be larger than one.

II. SYSTEM MODEL

Consider the MCO system shown in Fig. 1, comprising a mobile and an access point connected to a cloud. The mobile attempts to execute a program comprising multiple tasks. These tasks have stringent deadlines such that they cannot be executed at the mobile due to its limited computation capabilities. The mobile needs to offload the tasks to the cloud, and thereby the program is remotely executed in the cloud. The mobile thus transmits data to, and receives computation output from, the cloud via the access point. The channel is modeled as follows. The channel gain in slot n is denoted as $g_n$, with $g_n > 0$. We consider both slow and fast fading for the wireless channel.
For slow fading, the channel gain is constant and denoted as g: $g_n = g$ for all n. Fast fading is modeled as block fading, where channel gains are constant over one time slot and i.i.d. over different slots. Perfect CSI is assumed at both the mobile and the access point. For fast fading, perfect non-causal CSI over the prefetching duration is assumed to be acquired by channel prediction. Following the models in [2], the energy consumption for transmitting $b_n$ bits to the cloud in slot n is modeled using a convex monomial function, denoted as E, as follows:

$$E(b_n, g_n) = \lambda \frac{(b_n)^m}{g_n}, \qquad (1)$$

where the constants m and λ represent the monomial order and the energy coefficient, respectively. The monomial order

Fig. 2. MCO operation with prefetching: (a) MCO with online prefetching; (b) conventional MCO with only demand-fetching.

m is a positive integer depending on the specific modulation-and-coding scheme.

III. ONLINE PREFETCHING

A. Architecture

Computation at the cloud requires fetching user-specific data (e.g., images, videos, and measured sensor values) from the mobile, collectively called task data (TD). Depending on the instant when TD is delivered to the cloud, two fetching schemes are considered. One is prefetching, where the mobile offloads part of its TD during the computation period of the previous task. The other is demand-fetching, where the mobile offloads the un-prefetched TD of a task after the completion of the preceding task. The proposed online-prefetching architecture and the conventional fetching architecture are compared in Fig. 2. Let (K − 1) and K denote the indices of the current and next tasks, respectively. Time is divided into slots, and the considered duration comprises three sequential phases: $N_P$ slots for prefetching of Task K while computing Task (K − 1), $N_I$ slots for downloading computation results to the mobile, and $N_D$ slots for demand-fetching of Task K. In contrast, the conventional architecture without prefetching relies only on demand-fetching, as shown in Fig. 2(b). Comparing Figs. 2(a) and 2(b), prefetching lengthens the fetching duration from $N_D$ slots to $(N_P + N_D)$ slots, thereby reducing the transmission-energy consumption. Last, a latency constraint is applied such that the considered duration is no longer than N slots, i.e., $N_P + N_D + N_I \le N$. We assume that the access point can deliver the execution result within an extremely short interval ($N_I \approx 0$), so that the demand-fetching period of $N_D$ slots is approximately equal to $(N - N_P)$ slots. The optimal prefetching policy depends on the sequence of executed tasks, which is unknown in advance and modeled as a stochastic sequence using the task-transition graph shown in Fig. 3.
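Since (1) is convex in $b_n$, spreading a fixed number of bits over more slots strictly reduces the transmission energy; this is the mechanism behind the prefetching gain. A minimal sketch (hypothetical Python; function names and parameter values are illustrative, with λ = 1 and m = 2) makes this concrete:

```python
def transmit_energy(bits, gain, m=2, lam=1.0):
    """Monomial transmission-energy model (1): E = lam * bits^m / gain."""
    return lam * bits ** m / gain

def total_energy(allocation, gain):
    """Total energy of a per-slot bit allocation over a slow-fading channel."""
    return sum(transmit_energy(b, gain) for b in allocation)

B, g = 100.0, 1.0   # total bits to fetch and the (constant) channel gain
N, N_P = 10, 4      # latency bound and number of prefetching slots

# Demand-fetching only: all B bits squeezed into the last (N - N_P) slots.
e_demand_only = total_energy([B / (N - N_P)] * (N - N_P), g)
# With prefetching: the same B bits spread evenly over all N slots.
e_with_prefetch = total_energy([B / N] * N, g)

assert e_with_prefetch < e_demand_only   # a longer fetching window saves energy
```

For the convex monomial model, equal-bit splitting over a longer window always reduces the sum energy, which is why parallelizing fetching with the previous task's computation pays off.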
The states of the graph correspond to potential tasks, and the arrows represent feasible sequential transitions between tasks, whose weights give the transition probabilities. Consider the current state being the execution of Task (K − 1). There exist L candidate tasks (CTs) for Task K, with corresponding transition probabilities denoted as p(1), p(2), · · · , p(L), where $\sum_{\ell=1}^{L} p(\ell) = 1$. Among the CTs, one is selected as Task K upon the completion of Task (K − 1). At the current state, however, the realization of Task K is not yet known. The online prefetching policy is designed in the sequel to maximize the utility of prefetched data.

Fig. 3. Task execution graph.

Given the current state being Task (K − 1), let γ(ℓ) denote the size of TD (in bits) if CT ℓ is selected as Task K. We assume that TD is for exclusive use, namely that the TD of CT ℓ is useless for other CTs. The total TD satisfies the constraint $\sum_{\ell=1}^{L} \gamma(\ell) = \Gamma$, where the constant Γ represents the sum of TDs for all CTs. The TD for CT ℓ is divided into prefetching and demand-fetching parts, denoted as α(ℓ) and β(ℓ), respectively:

$$\gamma(\ell) = \alpha(\ell) + \beta(\ell). \qquad (2)$$

During the first $N_P$ slots, the mobile performs prefetching by transmitting $\sum_{\ell=1}^{L} \alpha(\ell)$ bits. During the remaining $N_D$ slots, the mobile performs demand-fetching by transmitting β(ℓ) bits, given that CT ℓ is chosen as Task K.

B. Problem Formulation

Let α = [α(1), · · · , α(L)] and β = [β(1), · · · , β(L)] respectively denote the prefetching and demand-fetching vectors, where α + β = γ = [γ(1), · · · , γ(L)]. Recall that $b_n$ indicates the number of bits the mobile transmits to the cloud in slot n. Given α and $\sum_{n=1}^{N_P} b_n = \sum_{\ell=1}^{L} \alpha(\ell)$, the expected energy consumption of prefetching, denoted by $E_P$, is given as

$$E_P(\alpha) = \mathbb{E}_g\!\left[\sum_{n=1}^{N_P} E(b_n, g_n)\right] \qquad (3)$$

with the energy function E given in (1). For demand-fetching, one of the CTs, say CT ℓ, is chosen as Task K. This results in the energy consumption of demand-fetching, denoted as $E_D$, given as

$$E_D(\beta(\ell)) = \mathbb{E}_g\!\left[\sum_{n=N_P+1}^{N} E(b_n, g_n)\right] \qquad (4)$$

with $\sum_{n=N_P+1}^{N} b_n = \beta(\ell)$. The online-prefetching design is formulated as a two-stage stochastic optimization problem under the criterion of maximum energy efficiency. In the first stage, consider the design of the optimal demand-fetcher conditioned on the prefetching decisions. Given the prefetching vector α and the realization of Task K as CT ℓ, the total number of bits for demand-fetching is obtained as β(ℓ) = γ(ℓ) − α(ℓ). The design is then translated into the problem of optimal transmission over $(N - N_P)$ slots for maximum energy efficiency:

$$\min_{\{b_{N_P+1}, \cdots, b_N\}} \; E_D(\beta(\ell)) \quad \text{s.t.} \quad \sum_{n=N_P+1}^{N} b_n = \beta(\ell). \qquad \text{(P1)}$$

Let $E_D^*(\beta(\ell))$ represent the solution of Problem P1. In the second stage, the prefetching policy is optimized to minimize the overall energy consumption for fetching. The corresponding optimization problem is formulated as follows:

$$\min_{\alpha, \beta} \; E_P(\alpha) + \sum_{\ell=1}^{L} E_D^*(\beta(\ell))\, p(\ell) \quad \text{s.t.} \quad \alpha + \beta = \gamma, \;\; \alpha, \beta \ge 0. \qquad \text{(P2)}$$

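To see the two-stage formulation concretely, the following toy instance (illustrative Python; names and values are not from the paper) evaluates the P2 objective for slow fading, using equal-bit allocation within each phase (optimal for the convex monomial model), and locates the minimizer by grid search:

```python
import itertools

def p2_objective(alpha, gamma, p, N, N_P, m=2, lam=1.0, g=1.0):
    """P2 objective for slow fading with equal-bit allocation per phase:
    prefetching energy + expected demand-fetching energy."""
    pre = lam * sum(alpha) ** m / (g * N_P ** (m - 1))
    demand = sum(pl * lam * (gl - al) ** m / (g * (N - N_P) ** (m - 1))
                 for al, gl, pl in zip(alpha, gamma, p))
    return pre + demand

gamma, p, N, N_P = [10.0, 10.0], [0.5, 0.5], 5, 4

# Grid search over feasible prefetching vectors 0 <= alpha <= gamma.
grid = [i * 0.5 for i in range(21)]
best = min(itertools.product(grid, repeat=2),
           key=lambda a: p2_objective(a, gamma, p, N, N_P))
print(best)  # (5.0, 5.0): prefetch half of each candidate's TD
```

With two equally likely 10-bit candidates, the search returns α = (5.0, 5.0): partial prefetching of every candidate balances the certain prefetching cost against the expected demand-fetching cost.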
The optimal online prefetching policy is designed in the sequel by solving Problems P1 and P2.

IV. ONLINE PREFETCHING OVER SLOW FADING CHANNELS

This section designs the optimal online prefetching policy over slow fading channels, where channel gains remain constant over the fetching period of N slots. We derive the mathematical expression of the optimal prefetching vector α* and quantify the prefetching gain.

A. Optimal Prefetching Policy for Slow Fading

To facilitate the policy derivation for slow fading, two lemmas are provided as follows.

Lemma 1. Given slow fading, equal-bit allocation over the slots for demand-fetching, namely $b_{N_P+1} = \cdots = b_N = \frac{\beta(\ell)}{N - N_P}$, is optimal and solves Problem P1. The resultant energy consumption for demand-fetching is

$$E_D^*(\beta(\ell)) = \frac{\lambda}{g} \cdot \frac{\beta(\ell)^m}{(N - N_P)^{m-1}}. \qquad (5)$$

Sketch of Proof: A lower bound on $E_D$ in (4) is derived by the inequality of arithmetic and geometric means, which is shown to be achievable when $\{b_n\}$ are equal.

Lemma 2. Given slow fading, Problem P2 is equivalent to:

$$\min_{\alpha} \; \frac{\left(\sum_{\ell=1}^{L} \alpha(\ell)\right)^m}{(N_P)^{m-1}} + \frac{\sum_{\ell=1}^{L} p(\ell)\left(\gamma(\ell) - \alpha(\ell)\right)^m}{(N - N_P)^{m-1}} \quad \text{s.t.} \quad 0 \le \alpha \le \gamma. \qquad \text{(P3)}$$

Sketch of Proof: Substituting (5) of Lemma 1 into the objective function of P2 and using the inequality of arithmetic and geometric means yield the desired result.

Lemma 2 shows that the offloaded bits $b_n$ are evenly distributed over multiple slots, removing the variable β in P2 and thereby allowing tractable policy analysis. The main result is shown in the following proposition.

Proposition 1. (Optimal Prefetching for Slow Fading) Given slow fading, the optimal prefetching vector α* = [α*(1), · · · , α*(L)] is

$$\alpha^* = \left[\gamma - p^{-\frac{1}{m-1}}\, \frac{N - N_P}{N_P}\, \alpha_\Sigma^* \right]^+, \qquad (6)$$

where p = [p(1), p(2), · · · , p(L)] is the task-transition probability vector and $\alpha_\Sigma^*$ is the optimal total number of prefetched bits, $\alpha_\Sigma^* = \sum_{\ell=1}^{L} \alpha^*(\ell)$.

Sketch of Proof: Consider Problem P3 and define a Lagrangian function with Lagrange multipliers {μ(ℓ)} associated with constraint (2). Using the slackness condition, the optimal multiplier μ(ℓ) is proved to be zero for all ℓ, yielding the desired result.

The optimal prefetching vector α* in (6) determines the prefetching-task set S defined in Definition 1, and vice versa. Exploiting this relation, α* and S can be computed using the simple iterative algorithm presented in Algorithm 1, where the needed prefetching-priority function is defined in Definition 2. The existence of α* (or equivalently S) is shown in Corollary 1. In addition, given α* and S, the optimal total number of prefetched bits $\alpha_\Sigma^*$ is given as

$$\alpha_\Sigma^* = \frac{\sum_{\ell \in S} \gamma(\ell)}{1 + \frac{N - N_P}{N_P} \sum_{\ell \in S} p(\ell)^{-\frac{1}{m-1}}}. \qquad (7)$$

Definition 1. (Prefetching-Task Set) The prefetching-task set, denoted as S, is defined as the set of prefetched CTs: S = {ℓ ∈ ℕ | α*(ℓ) > 0, 0 ≤ ℓ ≤ L}.

Definition 2. (Prefetching-Priority Function) The prefetching-priority function of CT ℓ is defined as $\delta(\ell) = \gamma(\ell)\, p(\ell)^{\frac{1}{m-1}}$, which determines the prefetching order (see Algorithm 1).

Corollary 1. (Existence of the Optimal Prefetching Vector) A unique optimal prefetching vector α* satisfying (6) always exists in the range 0 ⪯ α* ⪯ γ,¹ where the first and second equalities hold only when N → ∞ or N = $N_P$, respectively.

Sketch of Proof: First, it is obvious that α* = γ when N = $N_P$. Consider N > $N_P$. It is easily proved that α* is a continuous and monotone-decreasing function of $\alpha_\Sigma^*$, vanishing as $\alpha_\Sigma^*$ ranges from 0 to $\max_\ell \left\{\gamma(\ell)\, p(\ell)^{\frac{1}{m-1}}\right\} \frac{N_P}{N - N_P}$, whereas $\alpha_\Sigma^*$ is a continuous and monotone-increasing function of α* from 0 to Γ. Thus, a unique optimal prefetching vector α* exists.

Remark 1. (Prefetched or Not?) Corollary 1 shows that the optimal prefetching vector α* is strictly positive if N is finite. In other words, prefetching over a slow fading channel is always beneficial for a computation task with a finite latency constraint. For a latency-tolerant task (N → ∞), the mobile need not perform prefetching (α* = 0).

Remark 2. (Partial or Full Prefetching?) Corollary 1 shows that if N > $N_P$, the optimal prefetching vector α* is strictly less than γ, because the remaining β(ℓ) bits can be delivered during the demand-fetching duration of $(N - N_P)$ slots. If N = $N_P$, on the other hand, full prefetching (α* = γ) is obviously optimal because only prefetching is possible.

B. Prefetching Gain for Slow Fading

To quantify how much prefetching increases the energy efficiency, we define the prefetching gain as follows.

¹The symbols ⪯ in Corollary 1 and ≻ in Algorithm 1 denote elementwise inequalities.

Algorithm 1 Finding the optimal prefetching vector and prefetching-task set for slow fading
1: Arrange the CTs in descending order of the prefetching-priority function δ in Definition 2, i.e., δ(ℓ₁) ≥ δ(ℓ₂) whenever ℓ₁ < ℓ₂.
2: Set ℓ = 0 and S = ∅.
3: while |S| < L do
4:   ℓ = ℓ + 1 and S = S ∪ {ℓ}.
5:   Compute α*_Σ and α* using (7) and (6), respectively.
6:   Count the number of positive elements in α*, namely |α* ≻ 0|.
7:   if |α* ≻ 0| = |S| then
8:     break
9:   end if
10: end while
11: return α* and S.
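Algorithm 1 can be sketched directly from (6) and (7). The following hypothetical Python implementation (function and variable names are assumptions, not from the paper; it assumes m > 1) iterates over CTs in priority order until the positivity count matches the candidate set:

```python
def prefetch_slow_fading(gamma, p, N, N_P, m=2):
    """Algorithm 1: optimal prefetching vector alpha* and task set S.

    gamma: TD sizes gamma(l); p: transition probabilities p(l);
    N: latency bound in slots; N_P: prefetching slots; m: monomial order.
    """
    L = len(gamma)
    ratio = (N - N_P) / N_P
    # Sort CT indices by descending prefetching priority delta(l) (Definition 2).
    order = sorted(range(L), key=lambda l: -gamma[l] * p[l] ** (1 / (m - 1)))
    S = []
    for l in order:
        S.append(l)
        # Optimal total prefetched bits over the current candidate set, eq. (7).
        alpha_sum = sum(gamma[k] for k in S) / (
            1 + ratio * sum(p[k] ** (-1 / (m - 1)) for k in S))
        # Optimal prefetching vector, eq. (6): threshold-based structure.
        alpha = [max(gamma[k] - p[k] ** (-1 / (m - 1)) * ratio * alpha_sum, 0.0)
                 for k in range(L)]
        if sum(a > 0 for a in alpha) == len(S):
            break
    return alpha, S

# Symmetric example: L = 2 equally likely CTs of 10 bits each, N = 5, N_P = 4.
alpha, S = prefetch_slow_fading([10.0, 10.0], [0.5, 0.5], N=5, N_P=4)
print(alpha, S)  # both CTs are prefetched equally
```

Unlikely candidates fall below the threshold in (6) and receive zero prefetched bits, while N = N_P yields full prefetching (α* = γ), matching Remarks 1 and 2.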

Definition 3. (Prefetching Gain) The prefetching gain $G_P$ is defined as the energy-consumption ratio between MCO without and with prefetching,

$$G_P = \frac{\sum_{\ell=1}^{L} E_D^*(\gamma(\ell))\, p(\ell)}{E_P(\alpha^*) + \sum_{\ell \in S} p(\ell)\, E_D^*(\beta(\ell))}. \qquad (8)$$

The prefetching gain $G_P$ depends on several factors, including the latency requirement N, the prefetching duration $N_P$, and the number of CTs L. The following result specifies the relationship mathematically.

Proposition 2. (Prefetching Gain for Slow Fading) Given slow fading, the prefetching gain is

$$G_P \ge \left( \frac{N - N_P \left(1 - L^{-\frac{m}{m-1}}\right)}{N - N_P} \right)^{m-1}, \qquad (9)$$

where the equality holds if γ(ℓ) = Γ/L and p(ℓ) = 1/L for all ℓ.

Sketch of Proof: The numerator of $G_P$ in (8), the energy consumption of MCO without prefetching, is minimized when γ(ℓ) = Γ/L for all ℓ. On the other hand, the denominator, the energy consumption of MCO with prefetching, is maximized at the same setting. The lower bound on the prefetching gain $G_P$ is thus derived from this setting.

Remark 3. (Effects of Prefetching Parameters) The prefetching gain $G_P$ in (9) shows the effects of different parameters. On one hand, prefetching lengthens the fetching duration from $(N - N_P)$ to N slots, reducing the transmission rate and thus the resultant energy consumption. On the other hand, the mobile pays a prefetching cost of $N_P (1 - L^{-\frac{m}{m-1}})$ due to transmitting redundant bits. The gain outweighs the cost, and hence prefetching is beneficial.

V. ONLINE PREFETCHING OVER FAST FADING CHANNELS

In this section, we derive the optimal prefetching policy that solves Problem P2 for fast fading, where channel gains are i.i.d. over different slots. To this end, we first derive the optimal demand-fetching policy by solving Problem P1, and then solve Problem P2 using techniques of convex and stochastic optimization.

A. Optimal Demand-Fetching Policy for Fast Fading

The optimal demand-fetching policy is derived by solving Problem P1 as follows. The demand-fetcher allows the mobile to transmit β(ℓ) = γ(ℓ) − α(ℓ) bits from slot $(N_P + 1)$ to slot N. Designing the optimal demand-fetcher in Problem P1 for fast fading is thus equivalent to finding the optimal non-causal scheduler that delivers β(ℓ) bits within $(N - N_P)$ slots.

Proposition 3. (Optimal Demand-Fetching for Fast Fading) Given fast fading and the selected task, Task K, being CT ℓ, the optimal bit allocation over the demand-fetching slots is

$$b_n^* = \frac{\beta(\ell)\, (g_n)^{\frac{1}{m-1}}}{\sum_{k=N_P+1}^{N} (g_k)^{\frac{1}{m-1}}}, \qquad n = N_P + 1, \cdots, N. \qquad (10)$$

The corresponding expected energy consumption is

$$E_D^*(\beta(\ell)) = \lambda\, \frac{\beta(\ell)^m}{\left( \sum_{k=N_P+1}^{N} (g_k)^{\frac{1}{m-1}} \right)^{m-1}}. \qquad (11)$$

Sketch of Proof: Consider Problem P1 and define a Lagrangian function with a Lagrange multiplier τ associated with the constraint $\sum_{n=N_P+1}^{N} b_n = \beta(\ell)$. Using the slackness condition, it is proved that the optimal demand-fetched bits in slot n, $b_n^*$, are always strictly positive. Given τ, we can derive $b_n^*(\tau)$ by setting the partial derivative of the Lagrangian function to zero. The optimal Lagrange multiplier τ* is then derived by plugging $b_n^*(\tau)$ into the above constraint.

B. Optimal Online Prefetching for Fast Fading

To facilitate the policy derivation for fast fading, we substitute (11) of Proposition 3 into the objective function of P2, yielding the following lemma.

Lemma 3. Given fast fading, Problem P2 is equivalent to:

$$\min_{\alpha, \{b_n\}} \; \lambda \sum_{n=1}^{N_P} \frac{(b_n)^m}{g_n} + \lambda \sum_{\ell=1}^{L} \frac{p(\ell)\left(\gamma(\ell) - \alpha(\ell)\right)^m}{\left( \sum_{k=N_P+1}^{N} (g_k)^{\frac{1}{m-1}} \right)^{m-1}} \quad \text{s.t.} \quad \sum_{n=1}^{N_P} b_n = \sum_{\ell=1}^{L} \alpha(\ell). \qquad \text{(P4)}$$

Lemma 3 shows that the prefetching vector α should be jointly optimized with the numbers of bits $\{b_n\}$ allocated over the prefetching slots, which vary with the channel gains $\{g_n\}$. The main result is shown in the following proposition.

Proposition 4. (Optimal Prefetching for Fast Fading) Given fast fading, the optimal prefetching vector α* and the optimal bit allocation over the prefetching slots $b_n^*$ are respectively

$$\alpha^* = \left[ \gamma - p^{-\frac{1}{m-1}}\, \frac{\sum_{k=N_P+1}^{N} (g_k)^{\frac{1}{m-1}}}{\sum_{k=1}^{N_P} (g_k)^{\frac{1}{m-1}}}\, \alpha_\Sigma^* \right]^+, \qquad (12)$$

$$b_n^* = \frac{\alpha_\Sigma^*\, (g_n)^{\frac{1}{m-1}}}{\sum_{k=1}^{N_P} (g_k)^{\frac{1}{m-1}}}, \qquad n = 1, \cdots, N_P, \qquad (13)$$

where the optimal total number of prefetched bits $\alpha_\Sigma^* = \sum_{\ell=1}^{L} \alpha^*(\ell)$ is given as

$$\alpha_\Sigma^* = \frac{\sum_{\ell \in S} \gamma(\ell)}{1 + \frac{\sum_{k=N_P+1}^{N} (g_k)^{\frac{1}{m-1}}}{\sum_{k=1}^{N_P} (g_k)^{\frac{1}{m-1}}} \sum_{\ell \in S} p(\ell)^{-\frac{1}{m-1}}}. \qquad (14)$$
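The channel-aware allocation in (10) can be sanity-checked numerically. The sketch below (illustrative Python; names and gain values are arbitrary) allocates demand-fetched bits in proportion to $g_n^{1/(m-1)}$ and verifies agreement with the closed form (11) as well as dominance over a channel-oblivious equal split:

```python
def demand_fetch_energy(beta, gains, m=2, lam=1.0):
    """Energy of the optimal allocation (10) in closed form (11)."""
    s = sum(g ** (1 / (m - 1)) for g in gains)
    return lam * beta ** m / s ** (m - 1)

def energy_of_allocation(bits, gains, m=2, lam=1.0):
    """Energy lam * b_n^m / g_n summed over the demand-fetching slots."""
    return sum(lam * b ** m / g for b, g in zip(bits, gains))

beta, gains = 60.0, [0.5, 1.0, 2.0, 0.25]   # bits to fetch, per-slot gains
m = 2
s = sum(g ** (1 / (m - 1)) for g in gains)
opt_bits = [beta * g ** (1 / (m - 1)) / s for g in gains]   # allocation (10)
eq_bits = [beta / len(gains)] * len(gains)                  # equal split

assert abs(sum(opt_bits) - beta) < 1e-9   # all bits are delivered
assert abs(energy_of_allocation(opt_bits, gains, m)
           - demand_fetch_energy(beta, gains, m)) < 1e-9
assert energy_of_allocation(opt_bits, gains, m) <= energy_of_allocation(eq_bits, gains, m)
```

Loading more bits onto strong slots (here the gain-2.0 slot carries the most) is what makes non-causal scheduling strictly cheaper than equal splitting under fast fading.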

Sketch of Proof: Consider Problem P4 and define a Lagrangian function with a Lagrange multiplier ρ associated with the constraint $\sum_{n=1}^{N_P} b_n = \sum_{\ell=1}^{L} \alpha(\ell)$. Given ρ, the optimal prefetching vector α*(ρ) and the optimal bit allocation in slot n, $b_n^*(\rho)$, can be derived by following the same steps as in Propositions 1 and 3, respectively. Lastly, the optimal multiplier ρ* can be obtained using the above constraint.

Remark 4. (Threshold-Based Structure) Comparing Propositions 1 and 4, both optimal prefetching policies for slow and fast fading have an identical threshold-based structure, where CT ℓ is prefetched only when its TD size γ(ℓ) exceeds a specified threshold proportional to the optimal total prefetched bits $\alpha_\Sigma^*$. The prefetching properties for slow fading thus also hold for fast fading, including Algorithm 1 for finding the optimal prefetching-task set S, after replacing (6) and (7) with (12) and (14), respectively.

C. Prefetching Gain for Fast Fading

The prefetching gain defined in Definition 3 can be quantified for fast fading as follows.

Proposition 5. (Prefetching Gain for Fast Fading) Given fast fading, the prefetching gain $G_P$ is

$$G_P \ge \left( \frac{\sum_{k=1}^{N} (g_k)^{\frac{1}{m-1}} - \left(1 - L^{-\frac{m}{m-1}}\right) \sum_{k=1}^{N_P} (g_k)^{\frac{1}{m-1}}}{\sum_{k=N_P+1}^{N} (g_k)^{\frac{1}{m-1}}} \right)^{m-1}, \qquad (15)$$

where the equality holds if γ(ℓ) = Γ/L and p(ℓ) = 1/L for all ℓ.

Sketch of Proof: It can be proved by the same steps as in Proposition 2.

Corollary 2. (Comparison between Slow and Fast Fading) The expected lower bound on the prefetching gain $G_P$ for fast fading in (15) is strictly larger than the lower bound on the prefetching gain for slow fading in (9).

Sketch of Proof: The lower bound on $\mathbb{E}[G_P]$ is derived as

$$\mathbb{E}[G_P] \ge \mathbb{E}\!\left[ \left( 1 + \frac{1}{L^{\frac{m}{m-1}}} \cdot \frac{\sum_{k=1}^{N_P} (g_k)^{\frac{1}{m-1}}}{\sum_{k=N_P+1}^{N} (g_k)^{\frac{1}{m-1}}} \right)^{m-1} \right] \overset{(a)}{>} \left( 1 + \frac{1}{L^{\frac{m}{m-1}}} \cdot \frac{\mathbb{E}\!\left[\sum_{k=1}^{N_P} (g_k)^{\frac{1}{m-1}}\right]}{\mathbb{E}\!\left[\sum_{k=N_P+1}^{N} (g_k)^{\frac{1}{m-1}}\right]} \right)^{m-1}$$

where (a) follows from Jensen's inequality. Noting that the last expression is equivalent to (9), we have the result.

Remark 5. (Effects of Fast Fading on the Prefetching Gain) The prefetching gain $G_P$ for fast fading is always larger than its slow-fading counterpart. This suggests that fast fading enables opportunistic prefetching that exploits temporal channel diversity to enhance the gain.

VI. SIMULATION RESULTS

Simulation results are presented to evaluate the performance of MCO with online prefetching. In the simulation, the channel gain $g_n$ follows the Gamma distribution with shape parameter k > 1 and probability density function $f_g(x) = \frac{x^{k-1} e^{-kx}}{(1/k)^k\, \Gamma(k)}$, where the Gamma function $\Gamma(k) = \int_0^\infty x^{k-1} e^{-x}\, dx$ and the mean $\mathbb{E}[g_n] = 1$. The parameter is set as k = 2.
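Corollary 2 can likewise be checked by Monte Carlo under this simulation setup. The sketch below (illustrative Python; parameter values are arbitrary) samples unit-mean Gamma-distributed gains and compares the average fast-fading gain bound (15) with the slow-fading bound (9):

```python
import random

random.seed(1)
k, m, N, N_P, L = 2.0, 2, 8, 4, 4

def fast_bound(gains):
    """Fast-fading prefetching-gain lower bound (15)."""
    e = 1 / (m - 1)
    pre = sum(g ** e for g in gains[:N_P])     # prefetching slots
    post = sum(g ** e for g in gains[N_P:])    # demand-fetching slots
    return ((pre + post - (1 - L ** (-m / (m - 1))) * pre) / post) ** (m - 1)

# Slow-fading bound (9) under the same parameters.
slow = ((N - N_P * (1 - L ** (-m / (m - 1)))) / (N - N_P)) ** (m - 1)

# i.i.d. Gamma(shape k, scale 1/k) gains, so that E[g_n] = 1 as in the setup.
trials = [fast_bound([random.gammavariate(k, 1 / k) for _ in range(N)])
          for _ in range(100_000)]
avg_fast = sum(trials) / len(trials)

assert slow > 1 and avg_fast > slow   # consistent with Corollary 2
```

The average fast-fading bound exceeds the slow-fading one because the ratio of sums of gains is, in expectation, larger than the ratio of their means, which is the Jensen step in the proof of Corollary 2.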

Fig. 4. The effects of MCO parameters on the (mobile) expected energy consumption for the cases of slow and fast fading: (a) TD size Γ with N_P = 4, N = 5, and L = 4; (b) the number of CTs L with N_P = 4, N = 5, and Γ = 20; (c) the latency requirement N (in slots) with N_P = 4, L = 4, and Γ = 20; (d) the prefetching duration N_P (in slots) with N = 10, L = 4, and Γ = 20.

The monomial order and the energy coefficient of the energy-consumption model in (1) are set as m = 2 and λ = 1, respectively. The transition probabilities {p(ℓ)} and the data sizes {γ(ℓ)} are generated based on the uniform distribution. The curves of the expected energy consumption (in dB) are plotted against different MCO parameters in separate sub-figures of Fig. 4. Since the vertical axes of the figures are on the logarithmic scale, the vertical gaps between the curves measure the expected prefetching gain E[G_P] over the random task sets. First, one can observe that the prefetching gain is always larger than one in all sub-figures. Second, the prefetching gain for fast fading is always larger than that for slow fading. Third, with both axes on the logarithmic scale, the straight lines in Fig. 4(a) imply that the expected energy consumption increases as a monomial of the TD size, which agrees with the energy-consumption model in (1). Fourth, the expected prefetching gain is observed in Fig. 4(b) to exceed 5 dB when the number of CTs L is small, but the gain diminishes as L increases. Fifth, Fig. 4(c) shows that the prefetching gain diminishes as the latency requirement is relaxed; in other words, prefetching is most beneficial in scenarios with relatively stringent latency requirements. Last, the gain grows with the prefetching duration in Fig. 4(d).

VII. CONCLUSION

A novel online-prefetching architecture for mobile computation offloading has been proposed, which enables prefetching based on task-level computation prediction and its simultaneous operation with cloud computing. Given stochastic sequential tasks, the optimal prefetching policies have been designed to minimize mobile energy consumption over fading channels under the assumption of non-causal CSI. A simple threshold-based structure of the policies has been derived, which enables low-complexity online operation.
Comprehensive simulations show that online prefetching achieves a significant mobile-energy reduction compared with the conventional scheme without prefetching.

ACKNOWLEDGMENT

This work was supported by LG Electronics, the National Research Foundation of Korea (NRF-2014R1A2A1A11053234), and the Institute for Information & communications Technology Promotion (2015-0-00294).

REFERENCES

[1] K. Kumar and Y.-H. Lu, "Cloud computing for mobile users: Can offloading computation save energy?" IEEE Computer, vol. 43, no. 4, pp. 51–56, Apr. 2010.
[2] W. Zhang, Y. Wen, K. Guan, D. Kilper, H. Luo, and D. Wu, "Energy-optimal mobile cloud computing under stochastic wireless channel," IEEE Trans. Veh. Technol., vol. 12, no. 9, pp. 4569–4581, Sep. 2013.
[3] C. You, K. Huang, and H. Chae, "Energy efficient mobile cloud computing powered by wireless energy transfer," IEEE J. Sel. Areas Commun., vol. 34, no. 5, pp. 1757–1771, May 2016.
[4] Y. Mao, J. Zhang, and K. B. Letaief, "Dynamic computation offloading for mobile-edge computing with energy harvesting devices," IEEE J. Sel. Areas Commun., vol. 34, no. 12, pp. 3590–3605, Dec. 2016.
[5] S. Sardellitti, G. Scutari, and S. Barbarossa, "Joint optimization of radio and computational resources for multicell mobile-edge computing," IEEE Trans. Signal Inf. Process. over Netw., vol. 1, no. 2, pp. 89–103, Jun. 2015.
[6] C. You, K. Huang, H. Chae, and B.-H. Kim, "Energy-efficient resource allocation for mobile-edge computation offloading," to appear in IEEE Trans. Wireless Commun. [Online]. Available: http://ieeexplore.ieee.org/document/7762913/
[7] X. Chen, L. Jiao, W. Li, and X. Fu, "Efficient multi-user computation offloading for mobile-edge cloud computing," IEEE Trans. Netw., vol. 24, no. 5, pp. 2795–2808, Oct. 2016.
[8] R. Kaewpuang, D. Niyato, P. Wang, and E. Hossain, "A framework for cooperative resource management in mobile cloud computing," IEEE J. Sel. Areas Commun., vol. 31, no. 12, pp. 2685–2700, Dec. 2013.
[9] P. Shu, F. Liu, H. Jin, M. Chen, F. Wen, and Y. Qu, "eTime: Energy-efficient transmission between cloud and mobile devices," in Proc. IEEE Int. Conf. Comput. Commun., Turin, Italy, Apr. 2013, pp. 195–199.
[10] N. Master, A. Dua, D. Tsamis, J. P. Singh, and N. Bambos, "Adaptive prefetching in wireless computing," IEEE Trans. Wireless Commun., vol. 15, no. 5, pp. 3296–3310, May 2016.
[11] S.-W. Ko, K. Huang, S.-L. Kim, and H. Chae, "Live prefetching for mobile computation offloading," to appear in IEEE Trans. Wireless Commun. [Online]. Available: https://arxiv.org/abs/1608.04878