Analysis of the Energy-Performance Tradeoff for Delayed Mobile Offloading

Mobile cloud offloading, which migrates heavy computation from mobile devices to powerful cloud servers through communication networks, can alleviate the hardware limitations of mobile devices, yielding higher performance and energy savings. Different applications usually give different relative importance to response time and energy consumption. If a delay-tolerant job is deferred up to a given deadline, or until a fast and energy-efficient network becomes available, the transmission time will be reduced, which could lead to energy savings. However, if the reduced service time fails to cover the extra waiting time, this policy may not be suitable for delay-sensitive applications. In this paper, we investigate two types of delayed offloading policies: the partial offloading model, where jobs can leave the slow phase of the offloading process and are then executed locally on the mobile device, and the full offloading model, where jobs can abandon the WiFi Queue and are then offloaded via the Cellular Queue. In both models we minimise the Energy-Response time Weighted Product (ERWP) metric. We find that jobs abandon the queue very often, especially when the availability ratio (AR) of the WiFi network is relatively small. We can optimally choose the reneging deadline to achieve different energy-performance tradeoffs by optimizing the ERWP metric. The amount of delay a job can tolerate closely depends on the application type and the potential energy saving for the mobile device. In general, for delay-sensitive applications the partial offloading model is preferred when a suitable reneging rate is chosen, while for delay-tolerant applications the full offloading model shows very good results and outperforms the other offloading models when the deadline is set to a large value.


INTRODUCTION
Besides light-weight Internet applications, there is an increasing demand from mobile users for computation-heavy and energy-hungry applications that are being deployed to mobile devices. Running complex applications on such devices is, however, challenging due to the strict constraints on their resources, e.g., the limited computational capacity, battery lifetime and network connectivity. Mobile cloud computing aims at combining the strength of cloud computing and the convenience of mobile terminals. Offloading computation-intensive tasks from mobile devices to a capable cloud server via wireless networks is an effective way to alleviate the tussle between resource-constrained mobile devices and resource-hungry mobile applications, and thus boosts the device's performance.
Potential benefits obtained from offloading include shorter response times and energy savings. However, different applications usually give different relative importance to the two factors. For delay-tolerant applications (e.g., iCloud, Dropbox, RSS feeds and participatory sensing), response time is less critical and optimising energy usage is more relevant. Some information is not time-critical and its submission to the server may be delayed until the device enters an energy-efficient network [1]. For delay-sensitive applications (e.g., speed chess games, face recognition, video conferencing and vehicular communications), fast response time is of primary concern while energy consumption is less important. An offloading scheme in which cloud services are reachable with short network latencies (e.g., over WiFi networks) serves such applications better by providing high responsiveness. Thus, there exists a fundamental tradeoff between mean energy consumption and mean response time in executing applications [2]. Since performance can be defined as the inverse of the mean response time [3], the energy-performance tradeoff has been studied in [4,5] by deciding whether or not, and by means of which communication interface, to offload a whole application. Alternatively, an application can consist of several components (or jobs), and offloading decisions should be made for each. Seamless offloading by switching between several transmission technologies has been proposed in [6], which also examined the tradeoff between the energy consumed by WiFi search and the transmission rate when the WiFi network was intermittently available. Energy-efficient delayed network selection has been suggested in [2] to optimise the tradeoff between energy usage and delay in data transmission by intentionally deferring data transmission until the device meets an energy-efficient network.
Researchers have further suggested the use of "delayed offloading": if no WiFi connection is available, (some) traffic can be delayed up to a given deadline, or until WiFi becomes available [7].
arXiv:1510.09185v1 [cs.PF] 30 Oct 2015
Mobile devices usually have multiple radio interfaces for data transfer, such as 3G/4G and WiFi, with different availabilities, delays and energy costs. Thus, there are several ways to offload tasks to the cloud, e.g., via a costly cellular connection or via intermittently available WiFi [8,9]. Delaying offloading until WiFi becomes available offers an opportunity to reduce the transmission time, at the price of extra waiting time. The reduced transmission time translates directly into battery power saving for the mobile device [7]. However, delayed offloading is still a matter of debate, since it is not known to what extent users would be willing to delay a transmission [10]. In this paper, we try to give overall guidance on how to balance time and energy savings for different types of scenarios, such as delay-tolerant and delay-sensitive applications. We develop a theoretical framework to capture the energy-performance tradeoff by using queueing models with impatient jobs and service interruptions. The models can be used to predict the average performance and energy consumption of mobile offloading under given network deployment conditions. The main contributions are as follows:
• Proposing two types of queueing models for delayed mobile cloud offloading systems: the partial offloading model and the full offloading model. A non-delayed offloading model [10] is also introduced here as a comparison.
• Developing an analytical framework for analyzing queueing models with reneging and service interruptions. We obtain closed-form formulas for key performance metrics in the delayed offloading system such as Energy-Response time Weighted Product (ERWP), which combines the advantages of other previously studied metrics.
• Trying to answer the following questions: (i) Given a deadline, what are the expected response time and expected energy consumption as a function of network parameters and the job arrival rate? (ii) How should the deadlines be optimally chosen in order to achieve different energy-delay tradeoffs for specific applications? (iii) Among the different offloading models, how should one choose the model that achieves the largest performance gain based on the ERWP metric?
The remainder of this paper is organised as follows. In Section 2, we introduce the delayed offloading system and the queueing model as well as the considered metric. In Section 3, we analyse the partial offloading model based on the ERWP metric. The full offloading model is proposed and analysed in Section 4. Section 5 evaluates metrics and models using numerical examples. The paper is concluded in Section 6.

SYSTEM OVERVIEW
In delayed offloading, each data transfer is associated with a deadline, and the data transfer is resumed whenever the device enters WiFi coverage, until the transfer is completed [7]. If the transfer does not finish within its deadline, the task will either be executed locally or the transfer will finally be completed over the cellular network.
We consider a queueing system for delayed offloading. The mobile device, the cloud and the wireless networks are represented as queueing nodes to capture the resource contention and delay in the system. The mobile device executes an application with offloadable jobs that can be processed either locally on the processor of the mobile device, or remotely in a cloud infrastructure through offloading. The mobile device, the cellular connection and the WiFi connection are modelled as M/M/1-FCFS queues, and the remote cloud is modelled as an M/M/∞ queue, i.e., as a delay center [11]. We denote by 1/µm and 1/µr the expected execution times of jobs on the mobile device and in the cloud, respectively. The expected rates at which data is transferred to the cloud over the cellular network and WiFi are µc and µw, respectively. The total cost, in terms of energy or response time, for processing all the offloadable jobs is composed of the remote cost (sending some jobs to the cloud and waiting for the cloud to complete them) and the local cost (processing the remaining jobs locally on the mobile device). Our objective is to minimise the mean energy consumption and the mean response time.
The delayed offloading systems involve queueing with reneging and service interruptions. In queueing, reneging means that a job leaves the queue once its deadline expires and joins another queue. Service interruption means an involuntary discontinuity of service in the queue; here it models the connection and disconnection periods of a mobile device to WiFi networks in the system [12].

The WiFi Model
To facilitate the analysis of the mobile offloading systems, we assume that a cellular network is available to mobile users all the time, while the availability of a WiFi network depends on the location. Mobile users move in and out of a WiFi coverage area. We model this time variation of the WiFi connection state by the ON-OFF alternating renewal process $(T_{ON}^{(i)}, T_{OFF}^{(i)})$, $i \geq 1$, as shown in Fig. 1. The ON periods represent the presence of WiFi connectivity, while the OFF periods denote the interruption of WiFi connectivity. During the latter periods data is either not transmitted (the interface is idle) or it is transmitted only through the cellular network. The duration of each ON period, $T_{ON}^{(i)}$, is assumed to be an exponentially distributed random variable, independent of the duration of other ON or OFF periods [10]. Further, the WiFi availability ratio (AR) can be defined as
$$AR = \frac{E[T_{ON}]}{E[T_{ON}] + E[T_{OFF}]}.$$
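As a minimal sketch of the ON-OFF availability model, the snippet below compares the ratio of mean ON time to mean cycle time with a Monte Carlo estimate. It assumes exponentially distributed ON and OFF periods with rates ξ and η (the parameters introduced later for the failure and recovery rates); the function names and the numerical rates are illustrative and not taken from the paper.

```python
import random


def availability_ratio(xi, eta):
    """AR = E[T_ON] / (E[T_ON] + E[T_OFF]) with E[T_ON] = 1/xi and E[T_OFF] = 1/eta."""
    return eta / (xi + eta)


def simulate_availability(xi, eta, cycles=20000, seed=1):
    """Estimate AR by sampling alternating exponential ON and OFF periods."""
    rng = random.Random(seed)
    on_time = sum(rng.expovariate(xi) for _ in range(cycles))
    off_time = sum(rng.expovariate(eta) for _ in range(cycles))
    return on_time / (on_time + off_time)


# Hypothetical rates: mean ON period 2 time units, mean OFF period 2/3 time units.
xi, eta = 0.5, 1.5
print("analytic AR :", availability_ratio(xi, eta))
print("simulated AR:", round(simulate_availability(xi, eta), 3))
```

The simulated value only approximates the analytic one; the gap shrinks as the number of sampled cycles grows.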

Delayed Offloading Models
Accordingly, we build two types of delayed offloading models based on the WiFi network availability model as follows:
• Partial Offloading Model: we employ a single queue with two phases (the fast phase with the WiFi network and the slow phase with the cellular network) to offload jobs to the cloud server. When a WiFi connection is available, all offloadable jobs are sent over the WiFi network; otherwise, they are sent over the cellular interface, since the cellular network is always available. We set a reneging deadline in the cellular phase: if the deadline expires before the job has switched over to a WiFi AP, the job is executed locally on the mobile device rather than remotely in the cloud [7]. In this way, part of the jobs are offloaded to the cloud and the remaining ones are processed locally.
• Full Offloading Model: when there is a WiFi connection available, all the offloadable jobs are sent over the WiFi network; otherwise, they can be delayed up to a given deadline, or until WiFi becomes available [13]. If the deadline expires before the job can be transmitted over some WiFi AP, then it is offloaded through the cellular network. In this way, we have all the offloadable jobs offloaded to the cloud via the cellular or WiFi network.

The ERWP Metric
The general cost metric includes energy consumption related costs in addition to the usual performance metrics such as the response time [14]. The response time is the time from the arrival of a job until it completes service and departs. The energy consumption is the energy spent on the mobile device in that period.
We use queueing theory to model the offloading systems according to a new metric called Energy-Response time Weighted Product (ERWP), which is defined as:
$$ERWP = E[E]^{\omega} \cdot E[T]^{1-\omega}, \quad (1)$$
where E[T] and E[E] are the mean response time and mean energy consumption, respectively, and ω (ranging between 0 and 1) is a weighting parameter that represents the relative significance of energy consumption and response time for the mobile device. Large ω favors energy consumption while small ω favors response time. Specifically, to focus on performance, ω should be less than 0.5; to focus on power consumption, ω should be greater than 0.5. In some special cases performance can be traded for power consumption and vice versa; the ω parameter can therefore be used to express such preferences for different applications.
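The role of the weight ω can be illustrated with a short sketch. It assumes the weighted-product form E[E]^ω · E[T]^(1−ω), which is inferred here from the description of ω; the function name and the two hypothetical policies (their energy and time values) are invented for illustration only.

```python
def erwp(mean_energy, mean_response_time, omega):
    """Energy-Response time Weighted Product, assuming E[E]**omega * E[T]**(1 - omega)."""
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie in [0, 1]")
    return mean_energy ** omega * mean_response_time ** (1.0 - omega)


# Two hypothetical policies: one is frugal but slow, the other fast but energy-hungry.
frugal = {"energy": 4.0, "time": 9.0}
fast = {"energy": 9.0, "time": 4.0}

for omega in (0.2, 0.5, 0.8):
    scores = {name: erwp(p["energy"], p["time"], omega)
              for name, p in (("frugal", frugal), ("fast", fast))}
    print(omega, min(scores, key=scores.get), scores)
```

With a small ω the fast policy attains the smaller product, and with a large ω the frugal one does, which matches the interpretation of ω given above.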

PARTIAL OFFLOADING MODEL
We obtain tight optimality results by deriving explicit expressions in mobile cloud offloading systems to capture energy-performance tradeoffs. Figure 2 depicts a delayed offloading model based on the WiFi network availability model [1]. We consider an M/M/1 modulated queue in a two-phase (fast and slow) Markovian random environment, with impatient jobs. The jobs are offloaded to the cloud either via a cellular connection or via a WiFi network. The single-server queueing system oscillates between two feasible phases, denoted by fON and fOFF. The persistence of the system in any phase is governed by a random mechanism: if the system functions in phase fON it tends 'to jump' to the other phase with Poisson intensity ξ, and if the system functions in phase fOFF it tends 'to jump' to the other phase with Poisson intensity η [15].
Figure 2: Partial offloading model with cellular and WiFi networks
We assume that offloading jobs arrive at the system according to a Poisson process with rate λ, and the modulating process f ∈ {fON, fOFF} determines the service rate:
$$\mu(f) = \begin{cases} \mu_w, & f = f_{ON}, \\ \mu_c, & f = f_{OFF}. \end{cases}$$
The average job size is E[X], the transmission speed of the fast phase (WiFi network) is sw with service rate µw = sw/E[X], and its operating power is pw when serving jobs and zero whenever idle. Similarly, the corresponding speed for the slow phase (cellular network) is sc with service rate µc = sc/E[X] (µc ≤ µw), and its operating power is pc.
When in the slow phase, jobs become impatient. A reneging deadline T_d is associated with each job in this phase. That is, each job, upon arrival, activates an individual 'impatience timer', exponentially distributed with a reneging rate R. If the system does not change its environment from the slow phase to the fast phase before the deadline expires, the job is removed from the Offload Queue and is assumed to be executed locally on the mobile device rather than offloaded to the cloud [16]. As Fig. 2 shows, the delayed offloading model therefore consists of an Offload Queue (with two alternating phases of cellular and WiFi), a Local Queue denoting the local processing on the mobile device and a Remote Queue representing the remote processing on the cloud server.
The Offload Queue alternates its service by means of mutual resets according to the availability of WiFi, which is governed by an interrupted Poisson process (IPP) with exponentially distributed ON-OFF periods. We model the intermittent availability of WiFi hotspots as a FCFS queue with occasional server break-down [8]: the queue is either in the ON-state, where the WiFi network is processing the existing jobs, or in the OFF-state, during which the job is served by the cellular network (the cellular connectivity is assumed to be always available). However, when a job stays in the cellular network for too long, it abandons the Offload Queue and is then processed locally on the mobile device. We assume that the sojourn time in a hotspot and the time to move from one hotspot to another are exponentially distributed with parameters ξ (failure rate) and η (recovery rate), respectively. If the job in the Offload Queue is completely transmitted before the assigned deadline has expired, we say that the job is successfully offloaded. If offloading fails, the job leaves the Offload Queue and joins the Local Queue on the mobile device for immediate local processing. We call such an event a reneging event [12].
Since there is no waiting time before entering service, the M/M/∞ queue of the Cloud is occasionally referred to as a delay (sometimes pure delay) station, the probability distribution of the delay being that of the service time.

Queueing Analysis
We use queueing analysis to derive formulas for the average number of jobs for an M/M/1 queue operating in a 2-phase network environment. Given the previously stated assumptions, the partial offloading model can be modeled with a 2D Markov chain, as shown in Fig. 3. The states with cellular network are denoted by {c, i}, and the states with WiFi connectivity are denoted by {w, i}, where i corresponds to the number of jobs in the system (queueing and in service). During the WiFi phase, the system empties at rate µw and during the cellular phase, the system empties at rate µc + i·R since any of the i queued jobs can abandon the Offload Queue [13]. Writing the balance equations for the cellular and WiFi phases gives:
$$
\begin{aligned}
(\lambda + \eta)\pi_{c,0} &= (\mu_c + R)\pi_{c,1} + \xi\pi_{w,0}, \\
(\lambda + \eta + \mu_c + iR)\pi_{c,i} &= \lambda\pi_{c,i-1} + (\mu_c + (i+1)R)\pi_{c,i+1} + \xi\pi_{w,i}, \quad i \geq 1, \\
(\lambda + \xi)\pi_{w,0} &= \mu_w\pi_{w,1} + \eta\pi_{c,0}, \\
(\lambda + \xi + \mu_w)\pi_{w,i} &= \lambda\pi_{w,i-1} + \mu_w\pi_{w,i+1} + \eta\pi_{c,i}, \quad i \geq 1.
\end{aligned}
$$
The steady-state probability of finding the offloading system in some region with WiFi unavailability (with only cellular access) is $\pi_c = \sum_i \pi_{c,i} = \xi/(\eta+\xi)$. Similarly, the steady-state probability for the periods with WiFi availability is $\pi_w = \sum_i \pi_{w,i} = \eta/(\eta+\xi)$, which equals the availability ratio AR. The probability generating functions for the cellular and WiFi states are defined as:
$$G_c(z) = \sum_{i=0}^{\infty} \pi_{c,i} z^i, \qquad G_w(z) = \sum_{i=0}^{\infty} \pi_{w,i} z^i.$$
After some calculation and algebraic manipulations, we obtain:

General Case
Assuming a reneging rate R > 0, we have the partial offloading model as depicted in Fig. 2. According to [16], we obtain: where we define S = z 1 0 x dx. Accordingly, κ1(z) and κ2(z) are represented as follows: By the definitions of κ1(z), κ2(z) and β(z), it follows that T, U, V > 0 and S < 0. Therefore, πc,0 and πw,0 are positive. One can show formally that the system is ergodic. Intuitively, the system is always stable since, for any set of parameters λ ≥ 0, µc ≥ 0, µw > 0, ξ > 0, η > 0 and R > 0, the abandonment process, whose overall rate increases with the number of jobs, prevents explosion [16]. Alternatively, the system is stable if and only if πc,0 and πw,0 are positive, which always holds for the above set of parameters.
Let µ be defined as: µ = πc · µc + πw · µw. According to [16], we obtain: From Fig. 3, the expected numbers of jobs served per unit of time in the slow phase and in the fast phase are µc(πc − πc,0) and µw(πw − πw,0), respectively [17]. Therefore, the rate of abandonment due to impatience in the slow phase, λ_aband, is given by:
$$\lambda_{aband} = \lambda - \mu_c(\pi_c - \pi_{c,0}) - \mu_w(\pi_w - \pi_{w,0}) = R \cdot E[N_c],$$
where E[N_c] is the mean number of jobs in the cellular phase, i.e., the abandonment rate is proportional to the reneging rate and the mean number of jobs in the cellular phase.
The rate λm at which jobs are executed locally on the mobile device must be equal to the abandonment rate, i.e., λm = λ_aband. The probability that an arbitrary job arriving to the Offload Queue will leave and join the Local Queue, i.e., it will be executed locally and will never be offloaded again, is defined as:
$$\Pr\{\text{job joins the Local Queue}\} = \frac{\lambda_{aband}}{\lambda},$$
where Pr denotes the probability operation.
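The quantities above can also be obtained numerically, which offers a sanity check on the flow argument. The sketch below is not the closed-form solution of the paper: it truncates the two-phase chain at a hypothetical level n_max, solves the balance equations by Gaussian elimination, and compares R·E[N_c] with the arrival rate minus the two service throughputs. All function names and parameter values are illustrative assumptions (rates per minute).

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            if f != 0.0:
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def partial_model_steady_state(lam, mu_c, mu_w, xi, eta, R, n_max=80):
    """Return (pi_c, pi_w): steady-state probabilities indexed by the job count i."""
    size = 2 * (n_max + 1)
    q = [[0.0] * size for _ in range(size)]

    def idx(phase, i):            # phase 0 = cellular, phase 1 = WiFi
        return 2 * i + phase

    def add(src, dst, rate):      # add one transition to the generator matrix
        q[src][dst] += rate
        q[src][src] -= rate

    for i in range(n_max + 1):
        add(idx(0, i), idx(1, i), eta)                    # WiFi becomes available
        add(idx(1, i), idx(0, i), xi)                     # WiFi is lost
        if i < n_max:                                     # arrivals, blocked at n_max
            add(idx(0, i), idx(0, i + 1), lam)
            add(idx(1, i), idx(1, i + 1), lam)
        if i >= 1:
            add(idx(0, i), idx(0, i - 1), mu_c + i * R)   # cellular service or reneging
            add(idx(1, i), idx(1, i - 1), mu_w)           # WiFi service
    # pi Q = 0 is Q^T pi^T = 0; the last equation is swapped for sum(pi) = 1.
    a = [[q[r][c] for r in range(size)] for c in range(size)]
    a[-1] = [1.0] * size
    pi = solve_linear(a, [0.0] * (size - 1) + [1.0])
    return pi[0::2], pi[1::2]


lam, mu_c, mu_w = 0.5, 0.15, 1.5                 # illustrative values
xi, eta, R = 1 / 52.0, 1 / 25.4, 1 / 60.0
pi_c, pi_w = partial_model_steady_state(lam, mu_c, mu_w, xi, eta, R)
mean_jobs_cellular = sum(i * p for i, p in enumerate(pi_c))
served = mu_c * (sum(pi_c) - pi_c[0]) + mu_w * (sum(pi_w) - pi_w[0])
lam_aband = R * mean_jobs_cellular
print("abandonment rate:", lam_aband, "vs", lam - served)
print("reneging probability:", lam_aband / lam)
```

The two printed abandonment rates agree up to the truncation error, which is negligible as long as n_max lies well above the level at which the reneging rate i·R exceeds the arrival rate.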

Extreme Case
When the reneging rate R → 0, the partial offloading model shown in Fig. 2 reduces to a non-delayed offloading model (or on-the-spot offloading [7]), which is depicted in Fig. 4. Since the reneging rate is zero, there is no Local Queue in this model. We use this model as a reference case for comparison with the delayed offloading models.
After some algebraic manipulations, we obtain: Once the values of πc,0 and πw,0 have been established, according to Eq. (4), the probability generating functions can be calculated as: . (14) By using , we get the average number of jobs in the system [18]:

Mean Response Time
The total cost for offloading a job is composed of the cost for sending the job to the cloud and idly waiting for the cloud to complete the job.

Mean Energy Consumption
A key assumption in our work is that each service operates at a constant power pi, i ∈ {c, w, m}, whenever it is busy, i.e., the mobile device consumes energy only when there are jobs in the system. Since E[P] = λE[E] is the mean power consumption, we can calculate the mean energy consumption for the partial offloading model as:
$$E[E] = \frac{E[P]}{\lambda} = \frac{E[P_c] + E[P_w] + E[P_m]}{\lambda}. \quad (17)$$
Since the application jobs are remotely executed in the cloud server rather than on the mobile device, we do not need to calculate the energy consumed by remote execution. For i ∈ {c, w, m}, the corresponding average power consumption can be calculated as:
$$E[P_i] = p_i \cdot \Pr\{N_i > 0\} = p_i \rho_i.$$
Since the utilization of the queue is the probability that the server is busy, we have Pr{Ni > 0} = ρi [19], i.e., the energy cost is only incurred during the fraction of the time the server is busy.
The energy consumed due to local execution depends on the processing speed of the mobile device. Since the service on the mobile device is always available, we have:
$$E[P_m] = p_m \rho_m = p_m \frac{\lambda_m}{\mu_m}.$$
The mean energy consumed due to offloading via the cellular or WiFi network depends on the transmission power and speed.
We have:
$$E[P_c] = p_c \rho_c, \qquad E[P_w] = p_w \rho_w,$$
where ρc and ρw are the utilizations of the cellular and WiFi networks, which are equal to the probability that the corresponding network is busy. According to Fig. 3, they can be separately calculated as: ρc = πc − πc,0 and ρw = πw − πw,0.
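The energy bookkeeping can be sketched in a few lines. The snippet assumes, following the constant-power assumption and E[P] = λE[E], that the mean power is the sum of p_i·ρ_i over the cellular, WiFi and mobile services; the function name and the utilisation values are hypothetical, while the power coefficients are the ones used later in the evaluation.

```python
def mean_energy_per_job(lam, powers, utilisations):
    """E[E] = E[P] / lam, assuming E[P] = sum over services of p_i * rho_i."""
    mean_power = sum(powers[k] * utilisations[k] for k in powers)
    return mean_power / lam


powers = {"c": 2.5, "w": 0.7, "m": 2.0}          # watts
utilisations = {"c": 0.2, "w": 0.3, "m": 0.1}    # hypothetical busy fractions
lam = 0.5                                        # jobs per minute

energy_w_min = mean_energy_per_job(lam, powers, utilisations)
print("mean energy per job:", energy_w_min, "W*min =", energy_w_min * 60.0, "J")
```

Because λ is expressed per minute, the result is in watt-minutes per job; multiplying by 60 converts it to joules.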

ERWP Metric
Further, by substituting Eqs. (16) and (17) into Eq. (1), we can formulate the explicit expression and the optimization problem of the ERWP metric for the offloading assignment as:
$$\min_{R} \; ERWP = E[E]^{\omega} \cdot E[T]^{1-\omega},$$
i.e., we seek the reneging rate $R^*$ such that ERWP is minimised.

FULL OFFLOADING MODEL
Figure 5 depicts another delayed offloading model based on the WiFi network availability model. All jobs arriving to the system are by default sent to the WiFi interface for offloading. When a job is offloaded to the cloud via a WiFi network, there is queueing due to the transmission speed of the WiFi link. We model the intermittent availability of hotspots as a FCFS queue with occasional server break-down. The server availability is governed by an IPP with exponentially distributed ON-OFF periods. Specifically, the server is either in the ON-state, processing the existing jobs, or in the OFF-state, during which no job receives service. We assume that jobs abandon the queue during periods without WiFi connectivity. We assign a reneging deadline to each job (drawn from an exponential distribution). Jobs are served in FCFS order, subject to their remaining deadlines (either while queued or while at the head of the queue, but waiting for WiFi). A job can be served only via WiFi before its deadline. As the queueing system is continuous, it handles transmission at the bit level, so that assigning a deadline to a job is equivalent to assigning the same deadline to each bit of the job [7]. When in the OFF-state, jobs become impatient. That is, each job, upon arrival, activates an individual timer, exponentially distributed with a reneging rate R. If the network does not change its environment from the OFF-state to the ON-state before the deadline expires, the job abandons the WiFi Queue and is instead offloaded via a cellular network [13]. If the job in the WiFi Queue is completely transmitted through WiFi networks before the assigned deadline has expired, we say that the job is successfully offloaded.
If offloading fails, the job leaves the WiFi Queue and joins the Cellular Queue in the mobile device for immediate transmission through cellular networks. We call such an event a reneging event.
When the job is offloaded to the cloud via a cellular network, there is queueing due to the transmission speed of the cellular link. Costs arise in terms of transmission delays (queueing and actual transmission time) and transmission energy consumption. Service is always available since the cellular connection is always on. Similarly, the Remote Queue is a pure delay station at which jobs spend an exponentially distributed amount of time with mean equal to 1/µr time units.

Queueing Analysis
The WiFi Queue refers to offloading jobs from the mobile device to the cloud via a WLAN network, which is modeled as an M/M/1-FCFS queue with intermittently available service. When a server recovers, it continues to serve the job whose service has been interrupted, i.e., the work already completed is not lost (cf. data transfers with resume) [8].
We assume that the service fails from time to time and resumes its operation after a random time. The Markov chain for the WiFi Queue is depicted in Fig. 6, which is equivalent to assuming that µc = 0, πON = πw and πOFF = πc in Fig. 3. Here i corresponds to the number of jobs in the system (queueing and in service). During the ON-state, the system empties at rate µw and during the OFF-state, the system empties at rate i · R since any of the i queued jobs can abandon the WiFi Queue [13]. Writing the balance equations for this chain gives:
$$
\begin{aligned}
(\lambda + \eta)\pi_{OFF,0} &= \xi\pi_{ON,0} + R\pi_{OFF,1}, \\
(\lambda + \eta + iR)\pi_{OFF,i} &= \lambda\pi_{OFF,i-1} + (i+1)R\pi_{OFF,i+1} + \xi\pi_{ON,i}, \quad i \geq 1, \\
(\lambda + \xi)\pi_{ON,0} &= \eta\pi_{OFF,0} + \mu_w\pi_{ON,1}, \\
(\lambda + \xi + \mu_w)\pi_{ON,i} &= \lambda\pi_{ON,i-1} + \mu_w\pi_{ON,i+1} + \eta\pi_{OFF,i}, \quad i \geq 1.
\end{aligned}
$$
After substituting µc = 0 into κ1(z) and κ2(z), yields: According to [13], we obtain: We further have µ = πc · µc + πw · µw = πONµw. After substituting the above values in Eqs. (7) and (8), we derive the mean number of jobs in WiFi Queue as: Therefore, the average number of jobs in the WiFi Queue can be calculated as: As shown in Fig. 6, the expected number of jobs served per unit of time in the WiFi Queue is µw(πON − πON,0). Therefore, the rate of abandonment due to impatience in the OFF periods, λ_aband, is given by:
$$\lambda_{aband} = \lambda - \mu_w(\pi_{ON} - \pi_{ON,0}) = R \cdot E[N_{OFF}],$$
where E[N_OFF] is the mean number of jobs in the OFF-state, i.e., the abandonment rate is proportional to the reneging rate and the mean number of jobs in the OFF-state.
The rate λc at which jobs are sent to the cellular network must be equal to the abandonment rate, i.e., λc = λ_aband. The probability that an arbitrary job arriving to the WiFi Queue will abandon, i.e., it will be offloaded over the Cellular Queue, is defined as:
$$\Pr\{\text{job abandons the WiFi Queue}\} = \frac{\lambda_{aband}}{\lambda}.$$

Metric-Based Analysis
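By Little's Law, E[N] = λE[T], the mean response time can be calculated as:
$$E[T] = \frac{E[N_w] + E[N_c] + E[N_r]}{\lambda}, \quad (31)$$
where E[Nw] is the average number of jobs in the WiFi Queue as obtained in Eq. (28), and E[Nc] and E[Nr] are the average numbers of jobs in the Cellular Queue and in the cloud server, respectively.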
The Cellular Queue refers to offloading jobs from the mobile device to the cloud via a cellular network, which is modeled as an M/M/1-FCFS queue. The arrival rate to the Cellular Queue equals the abandonment rate of the WiFi Queue, i.e., λc = R · E[NOFF]. The average number of jobs in this queue is given by:
$$E[N_c] = \frac{\rho_c}{1 - \rho_c},$$
where ρc = λc/µc is the probability that the Cellular Queue is busy.
Since all the jobs are offloaded to the remote server in the cloud, for an M/M/∞ queue, the average number of jobs in the cloud server can be calculated as: E[Nr] = λ/µr.
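As a hedged illustration of how Little's Law combines the three stations, the sketch below adds the mean job counts of the WiFi Queue, the Cellular Queue (M/M/1) and the cloud (M/M/∞) and divides by the arrival rate. That the response time is the sum of these three contributions is an assumption made here; the function names and the numerical inputs, including the value used for E[Nw], are hypothetical.

```python
def mm1_mean_jobs(arrival_rate, service_rate):
    """Mean number of jobs in a stable M/M/1 queue: rho / (1 - rho)."""
    rho = arrival_rate / service_rate
    if rho >= 1.0:
        raise ValueError("M/M/1 queue is unstable (rho >= 1)")
    return rho / (1.0 - rho)


def mminf_mean_jobs(arrival_rate, service_rate):
    """Mean number of jobs in an M/M/inf delay station: lambda / mu."""
    return arrival_rate / service_rate


def full_model_response_time(lam, mean_jobs_wifi, lam_c, mu_c, mu_r):
    """E[T] = (E[N_w] + E[N_c] + E[N_r]) / lam by Little's Law."""
    total_jobs = (mean_jobs_wifi
                  + mm1_mean_jobs(lam_c, mu_c)
                  + mminf_mean_jobs(lam, mu_r))
    return total_jobs / lam


# Hypothetical inputs: 2 jobs on average in the WiFi Queue, 10% of jobs renege.
print(full_model_response_time(lam=0.5, mean_jobs_wifi=2.0,
                               lam_c=0.05, mu_c=0.15, mu_r=1.0))
```

The M/M/1 helper raises an error when the abandonment stream exceeds the cellular service rate, which is the regime in which the Cellular Queue has no steady state.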
The mean energy consumption can be calculated as:
$$E[E] = \frac{p_c \rho_c + p_w \rho_w}{\lambda}, \quad (33)$$
where ρw is the fraction of time during which WiFi is available and processing jobs, which can be calculated as ρw = πON − πON,0. As the recovery rate η → ∞, the availability of WiFi, πON = AR = η/(ξ + η), tends to 1.
Further, by substituting Eqs. (31) and (33) into Eq. (1), we can formulate the optimization of the ERWP metric for the offloading assignment as:
$$\min_{R} \; ERWP = E[E]^{\omega} \cdot E[T]^{1-\omega},$$
i.e., we seek the reneging rate $R^*$ such that ERWP is minimised.
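The search for a good reneging rate can be sketched numerically. The code below is an illustration under stated assumptions rather than the closed-form analysis of the paper: it truncates the WiFi Queue chain at a hypothetical level n_max, solves it with the standard level-reduction recursion for level-dependent quasi-birth-death chains, assumes E[T] = (E[Nw] + E[Nc] + E[Nr])/λ, E[E] = (pc·ρc + pw·ρw)/λ and the product form E[E]^ω · E[T]^(1−ω), and evaluates a small grid of candidate rates. All names, default values (rates per minute) and the candidate grid are illustrative.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def wifi_queue_levels(lam, mu_w, xi, eta, R, n_max=400):
    """Stationary (pi_OFF_i, pi_ON_i), i = 0..n_max, of the truncated WiFi Queue chain."""
    def neg_local(i, extra):
        # Negated within-level block plus the censored return flow from level i + 1.
        up = lam if i < n_max else 0.0
        d_on = mu_w if i > 0 else 0.0
        return [[up + eta + i * R - extra[0][0], -eta - extra[0][1]],
                [-xi - extra[1][0], up + xi + d_on - extra[1][1]]]

    extra = [[0.0, 0.0], [0.0, 0.0]]
    rate = [None] * (n_max + 1)
    for i in range(n_max, 0, -1):                 # backward pass: pi_i = pi_{i-1} * rate[i]
        inv = inv2(neg_local(i, extra))
        rate[i] = [[lam * inv[r][c] for c in range(2)] for r in range(2)]
        down = (i * R, mu_w)                      # reneging in OFF, service in ON
        extra = [[rate[i][r][c] * down[c] for c in range(2)] for r in range(2)]
    m0 = neg_local(0, extra)
    levels = [(-m0[1][0], m0[0][0])]              # solves pi_0 * m0 = 0 up to scale
    for i in range(1, n_max + 1):                 # forward pass
        off, on = levels[-1]
        levels.append((off * rate[i][0][0] + on * rate[i][1][0],
                       off * rate[i][0][1] + on * rate[i][1][1]))
    total = sum(off + on for off, on in levels)
    return [(off / total, on / total) for off, on in levels]


def full_model_erwp(R, omega, lam=0.5, mu_c=0.15, mu_w=1.5, mu_r=1.0,
                    xi=1 / 52.0, eta=1 / 25.4, p_c=2.5, p_w=0.7):
    """ERWP of the full offloading model, or None if the Cellular Queue is unstable."""
    levels = wifi_queue_levels(lam, mu_w, xi, eta, R)
    n_off = sum(i * off for i, (off, _) in enumerate(levels))
    n_wifi = sum(i * (off + on) for i, (off, on) in enumerate(levels))
    rho_w = sum(on for _, on in levels[1:])
    rho_c = R * n_off / mu_c                      # lambda_c = R * E[N_OFF]
    if rho_c >= 1.0:
        return None
    mean_t = (n_wifi + rho_c / (1.0 - rho_c) + lam / mu_r) / lam
    mean_e = (p_c * rho_c + p_w * rho_w) / lam
    return mean_e ** omega * mean_t ** (1.0 - omega)


def best_reneging_rate(omega, candidates):
    """Grid search: the candidate R with the smallest ERWP among the stable ones."""
    scored = [(full_model_erwp(R, omega), R) for R in candidates]
    feasible = [(value, R) for value, R in scored if value is not None]
    return min(feasible)[1] if feasible else None


candidates = [1 / 480.0, 1 / 240.0, 1 / 120.0, 1 / 60.0, 1 / 30.0]
for omega in (0.2, 0.8):
    print("omega =", omega, "best R on grid =", best_reneging_rate(omega, candidates))
```

A grid search is used here only for simplicity; any one-dimensional minimiser could replace it, and candidates for which the Cellular Queue is unstable are skipped.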

PERFORMANCE EVALUATION
We consider here a simple scenario where the transmission rate of the cellular network is smaller than that of WiFi, i.e., sc < sw and the power consumption when transmitting jobs via the cellular link is larger than the WiFi link, i.e., pc > pw.
Using measurements from real traces collected by [7], the average data rates of the cellular and WiFi networks are set as sc = 200 Kbps and sw = 2 Mbps, respectively. The average duration of a WiFi availability period is 52 min (ξ = 1/52 min⁻¹), while the average duration with only cellular network coverage is 25.4 min (η = 1/25.4 min⁻¹). The availability ratio is thus 67%. The mean job size is assumed to be 10 MB.
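The service rates implied by these settings follow from µ = s/E[X]. The short calculation below assumes decimal units (1 MB = 8 Mbit, 1 Kbps = 1000 bit/s), which the text does not state explicitly; the helper name is hypothetical.

```python
MBIT_PER_MBYTE = 8.0  # assumption: decimal megabytes


def service_rate_per_min(speed_mbps, job_size_mbyte):
    """mu = s / E[X], converted from jobs per second to jobs per minute."""
    return speed_mbps / (job_size_mbyte * MBIT_PER_MBYTE) * 60.0


mu_c = service_rate_per_min(0.2, 10.0)   # cellular: 200 Kbps
mu_w = service_rate_per_min(2.0, 10.0)   # WiFi: 2 Mbps
xi, eta = 1 / 52.0, 1 / 25.4             # per minute
ar = eta / (xi + eta)

print("mu_c [jobs/min]:", round(mu_c, 4), " mean transfer time [min]:", round(1 / mu_c, 2))
print("mu_w [jobs/min]:", round(mu_w, 4), " mean transfer time [min]:", round(1 / mu_w, 2))
print("availability ratio:", round(ar, 3))
```

Under these unit assumptions a 10 MB job takes several minutes over the cellular link and well under a minute over WiFi, and the availability ratio agrees with the 67% quoted above.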
According to the power models developed by [20], we set the power coefficients pc = 2.5 W, pw = 0.7 W and pm = 2 W, respectively. Besides, suppose that the total job arrival rate is λ = 0.5 packet/min, the mobile service rate µm = 0.2 and the cloud service rate µr = 1.
An availability ratio of 11% has been reported in [21]. In Fig. 7, as the availability ratio (AR) of the WiFi network increases, the percentage of jobs that abandon the Offload Queue (for the partial offloading model, see Fig. 7(a)) or the WiFi Queue (for the full offloading model, see Fig. 7(b)) declines rapidly. However, the full offloading model has a much higher reneging probability than the partial one under the same deadline T_d. This is because the partial offloading model can use the cellular network to transmit data, and thus the number of jobs waiting in the Offload Queue is reduced. On the other hand, as the reneging deadline increases from 60 min to 120 min, jobs have more chance to be offloaded via the WiFi network, and therefore the reneging probability decreases at lower arrival rates. However, at high arrival rates, the reneging probability stays the same under different deadlines.
The partial offloading model in Fig. 8(a) has the lowest average response time, since it makes full use of the slow cellular phase while WiFi is unavailable. For lower deadlines (T_d < 40 min), the mean response time decreases as the deadline rises, since jobs with higher deadlines have more chance to be transmitted over the fast WiFi network, leading to a smaller response time. However, the mean response time increases for higher deadlines, since jobs with lower deadlines leave the queue earlier, leading to smaller queueing delays. From Fig. 8(b), when the reneging deadline is small, the non-delayed offloading model achieves the lowest mean energy consumption among the three models, but as the deadline increases, the full offloading model is preferred. This is because the WiFi network is much faster and more energy-efficient than the cellular network, and the reduced service time leads to less energy consumption on the mobile device.
We then fix the reneging deadline at 120 min. In Fig. 9(a), the mean response time rises with the job arrival rate λ due to queueing effects. The partial offloading model performs much better than the other two models since it fully uses the periods without WiFi by offloading jobs over the cellular network, which in turn incurs high energy consumption, as shown in Fig. 9(b). The full offloading model is much more energy-efficient than the non-delayed offloading model at low λ, while at high λ the non-delayed offloading model saves much more energy. This can be inferred from Fig. 7(b): as λ increases, more jobs abandon the WiFi Queue and are then offloaded via the costly cellular network, which results in more energy consumption.
Since different applications usually assign different relative importance to energy and performance, we use the ERWP metric to compare the three offloading models. It can be observed from Fig. 10(a) that when ω is small, the partial offloading model can achieve the smallest ERWP value by optimally choosing the reneging rate R, which indicates that when response time is considered more important (for delay-sensitive applications), it is better to use the partial offloading model. Otherwise, when energy consumption is considered more important than response time (for delay-tolerant applications), the full offloading model is preferred, since it translates the reduced transmission time of the fast WiFi network into battery power saving for the mobile device. As shown in Fig. 10(b), when the weighting parameter ω is small, all three offloading models perform worse as the arrival rate λ of offloadable jobs increases. However, the non-delayed offloading model is more sensitive to the job arrival rate. The partial offloading model always achieves the smallest ERWP value, which indicates that when response time is considered more important, it is better to use the partial offloading model. Otherwise, when energy consumption is considered more important than response time, the full offloading model is preferred at lower λ, while at higher λ the non-delayed offloading model is preferred.

CONCLUSIONS
In this paper, we have developed queueing analytic models for delayed mobile cloud offloading to leverage the complementary strength of WiFi and cellular networks by choosing heterogeneous wireless interfaces for offloading. We have carried out optimality analysis of the energy-performance tradeoff for mobile cloud offloading systems based on the ERWP metric, which captures both energy and performance metrics and also intermittently available access links.
When the availability ratio (AR) of the WiFi network is relatively small, the percentage of jobs that abandon the queue is very high. We can optimally choose the reneging deadline to achieve different energy-performance tradeoffs by optimizing the ERWP metric. We find that for delay-sensitive applications, the partial offloading model is preferred when a medium deadline is set, while for delay-tolerant applications, the full model shows very good results and outperforms the other offloading models when the deadline is set to a large value. In general one can say that the partial offloading policy is faster, while the full policy uses less energy.