Task scheduling in cloud computing based on meta-heuristic techniques: A review paper

Cloud computing delivers computing resources such as software and hardware as a service to users over a network. Because of the scale of modern datacentres and the dynamic nature of their resource provisioning, efficient scheduling techniques are needed to manage these resources. The main objective of scheduling is to assign tasks to adequate resources in order to satisfy one or more optimization criteria. Scheduling is a challenging issue in the cloud environment, and many researchers have therefore sought an optimal solution for task scheduling in the cloud. They have shown that traditional scheduling techniques are not efficient for this problem and cannot produce an optimal solution in polynomial time; instead, they yield sub-optimal solutions within a short period of time. Meta-heuristic techniques provide near-optimal or optimal solutions within an acceptable time for such problems. In this work, we introduce the major concepts of resource scheduling and provide a comparative analysis of many task scheduling techniques based on different optimization criteria.


Introduction
Cloud Computing (CC) is a recent and fast-growing technology in the field of distributed computing. It offers users high reliability, security, scalability, cost-effective mechanisms, group collaboration and ease of access to various applications and resources [1]. It is a model for enabling convenient, on-demand provisioning of computing resources such as software, hardware, applications, and services that can be rapidly provisioned and released with minimal management overhead or interaction from service providers [2]. Cloud computing offers three primary types of service models, namely Software-as-a-Service (SaaS), Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) [3]. Cloud computing can be implemented as a layered architecture and comes in four main deployment models, namely public, private, community, and hybrid clouds [4].
The major concept used in cloud computing is virtualization. Virtualization is a technique by which the user can easily access the computing resources without considering the complexity and internal details of the system [5]. It enables the user to create Virtual Machines (VMs) on physical servers [6], which reduces the required hardware equipment and improves the utilization of physical resources in cloud computing. Clouds provide several advantages to cloud users and service providers. The major advantages of cloud computing are described in [5][7][8][9][10] and listed below:
• Reducing the cost by providing computing resources on-demand based on a pay-as-you-go system.
• Allowing easy accessibility from anywhere in the world and at any time.
In spite of the significant advantages of cloud computing environments, some significant issues have influenced the efficiency and reliability of this environment [11]. Cloud computing faces many issues that have attracted researchers' attention and concern. In general, the major issues in cloud environments have been grouped into seven main categories: resource management, load balancing, privacy and security, migration to clouds, availability and scalability, energy-efficiency, and interoperability and compatibility [12][13][14][15]. Scheduling in the cloud computing environment means that many tasks can be executed on the available pool of computing resources in an optimal way. This operation depends on many optimization criteria such as reliability, makespan, load balancing, execution cost, budget, and utilization [16]. In the task scheduling procedure, tasks are delivered from users to the cloud scheduler, and the cloud scheduler then obtains the status of the resources from the cloud information service. The scheduler then maps the tasks onto various resources based on their requirements [17]. An efficient scheduler assigns the appropriate resources (e.g. VMs) to the tasks in an optimal way. Generally, allocating tasks to the apparently unbounded computing resources of the cloud computing environment is a nondeterministic polynomial time (NP)-hard problem. Many researchers have attempted to find an optimal solution in polynomial time for task scheduling in the cloud environment, but no known technique yields an optimal solution in polynomial time for this problem. Thus, techniques based on meta-heuristics have been used to deal with these complex problems and obtain near-optimal or optimal solutions.
In the past years, many meta-heuristic techniques have been introduced and have gained considerable popularity, such as genetic algorithms (GA), particle swarm optimization (PSO), ant colony optimization (ACO), tabu search (TS), simulated annealing (SA), bat algorithm (BA), and memetic algorithm (MA) [18].
In order to develop an effective scheduling algorithm, we need to clearly understand resource management and various problems associated with different scheduling techniques. Thus, the objective of this paper is to present the major concepts of resource scheduling and provide a comparative analysis of various task scheduling techniques. Systematic analysis of task scheduling in cloud computing is presented based on optimization criteria suitable for cloud computing environments. This paper will help researchers to identify the suitable approach for suggesting adequate technique for scheduling user's applications in cloud environment.
In this work, we consider only the scheduling problems with regards to cloud computing and not the whole distributed systems. The remainder of the paper is organized in five sections. We present resource management in cloud computing in section 2. Scheduling is discussed in section 3. The discussion is presented in section 4. Finally, a conclusion and future work remarks are summarized in section 5.

Resource Management in Cloud Computing
Resource management is an important challenge in distributed computing such as cloud computing [19]. In cloud computing, various cloud users require different services depending on their changing needs. So, the task of cloud computing is to introduce all the required services. However, due to the limitation of available resources, it is difficult for cloud service providers to provide all the required services in a timely manner. Because cloud computing relies on virtualization technology with a distributed model, it becomes easy to introduce dynamically new resources which was difficult in the traditional resource management techniques [20]. In the next section, we present the type of resources and address the challenges of resource management in the cloud computing environment.

Type of Resources
In the following section we briefly introduce the classifications of the main types of resources based on their services such as storage, computation, network, security, and energy.
power of processing, the capacity of memory, efficient algorithms, operating system [22].
3. Network services: Network as a Service (NaaS) consists of physical resources such as physical network links, sensors, workstations and intermediate devices, and logical resources such as protocols, throughput, bandwidth, delay, loads and virtual network links [23]. Storage services and computation services cannot be thought of without network services such as bandwidth and delay. These services are the most significant services from the network point of view, because every service in cloud computing is provided through high speed Internet [24].
4. Security services: Security as a Service (SECaaS) is one of the important challenges in the cloud computing environment [25]. SECaaS introduces high protection for users from attacks and threats over the internet [26]. It introduces services such as authentication, trust, intrusion detection, penetration testing, anti-malware, anti-virus, and security event management [27].
5. Energy services: Energy consumption in cloud data centers is very high. Energy service consists of physical resources such as cooling devices and uninterruptible power supplies (UPS). Many energy saving techniques have been introduced to manage unused resources to reduce the cost. Significant energy can be saved in data centers by applying energy saving techniques on servers and networks [28].

Challenges of resource management in cloud
The important challenges that are commonly associated with resource management in cloud systems are resource allocation, resource provisioning, resource mapping, resource discovery and selection, resource adaptation, resource brokering, and resource scheduling. We will briefly introduce the basic concept of these challenges:
Resource allocation: is the economical distribution of cloud resources among different applications through the internet [29].
Resource provisioning: is the process of allocating the service provider's resources to the cloud users with the service quality assurance which is determined in the service level agreement (SLA). It can be classified into two types: dynamic and static resource provisioning [30].
Resource mapping: is a consistency between resources available with a service provider and resources required by cloud users [31].
Resource discovery and selection: is the process of discovering all resources presented in the system and collecting the current state of resources then making decisions which target resource should be selected based on the information obtained from the discovery [32].
Resource adaptation: is the capability of this system to dynamically adjust resources to meet user requirements [26].
Resource brokering: is the process of negotiation for the required resources through an agent to guarantee that necessary resources are available in time to complete the objectives [20].
Resource scheduling: defined by [26], as a timetable of events and resources that records when an activity should start or end, depending on its (1) duration, (2) predecessor activities, (3) predecessor relationships, and (4) resources allocated.

Scheduling
The main objective of scheduling is the optimal allocation of resources to specific tasks in a limited time to achieve a high-performance computing and desirable quality of service.
The scheduler must map the given tasks to available resources subject to certain constraints so as to improve one or more optimization criteria [33]. In distributed computing systems, scheduling is responsible for selecting the appropriate resources for task execution, taking into consideration dynamic and static task parameters [34]. Scheduling algorithms differ according to the nature of the tasks in the application. When tasks have precedence constraints, a task can be scheduled only after all of its predecessor tasks have been completed; this is called workflow scheduling. When tasks are independent of each other, they can be scheduled in any order; this is known as independent task scheduling [35].
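As a minimal sketch of the distinction between workflow and independent task scheduling, the snippet below orders a small workflow so that every task appears after its predecessors. It relies on `graphlib.TopologicalSorter` from the Python standard library (3.9+); the task names and the dependency dictionary are hypothetical and serve only as illustration.

```python
from graphlib import TopologicalSorter

def workflow_order(predecessors):
    """Return one valid execution order for a workflow.

    predecessors maps each task to the set of tasks that must finish first.
    """
    return list(TopologicalSorter(predecessors).static_order())

# Hypothetical workflow: B and C depend on A, D depends on B and C.
deps = {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
order = workflow_order(deps)
print(order)  # A comes first and D comes last; B and C may appear in either order
```

For independent tasks the dependency sets are empty, so any permutation is a valid order and the scheduler is free to choose among them according to its optimization criteria.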

Scheduling Procedure
Scheduling procedure in cloud computing can be classified into three phases: resource discovering and monitoring, resource selection, and task submission [36]. Figure 1 clearly depicts the three stages:
• Resource discovering and monitoring: In the first phase, Datacenter Broker (DB) detects all resources presented in the cloud system and gathers the current state of all available resources and all remaining resources that may be available in the cloud system. Indeed, these resources are generally the virtual resources.
• Resource selection: In the second phase, the cloud scheduler makes decision which target resource should be selected based on the information obtained from the discovery phase.
• Task submission: In the last phase, the task is assigned to the best available selected resource.

Figure 1. Procedure of scheduling in cloud
The Datacenter Broker resides between the cloud service provider and the cloud user. It collects information about all resources present in the cloud system and their current status [37]. A datacenter contains a large number of hosts and related equipment. Each host can run multiple VMs depending on its hardware specifications (CPU, RAM, bandwidth) [38]. This information needs to be stored in a repository for future use by the DB. Thus, the Cloud Information Service (CIS) is used as a repository to store the cloud datacenter entities. Once a datacenter is created, it has to be registered with the CIS. When a cloud user requests a service, the user sends tasks to the datacenter broker, which gathers information about the available resources from the CIS and then allocates the tasks to virtual machines based on the scheduling policy defined in the datacenter broker.

Cloud Resource Scheduling layers
Cloud computing contains three primary types of service models namely IaaS, PaaS, and SaaS, which can be implemented as a layered cloud computing architecture. Based on this architecture, the cloud scheduling problems can be classified into three scheduling layers: Infrastructure layer, platform layer and software layer [39].

Scheduling in Software Layer
Scheduling at the software layer answers the question of how to allocate resources to tasks so as to satisfy user objectives such as task completion time (makespan), reliability, cost, and application performance. It also serves the cloud service provider's aim of scheduling available cloud resources effectively in order to save the energy consumed by the data center [40].

Scheduling in Platform Layer
Physical resources such as network resources, storage resources, and computational resources [41] are better virtualized as uniform resources. Therefore, scheduling these resources in the platform layer concentrates on how to migrate and map virtual resources onto physical resources with effective load balance and cost.

Scheduling in Infrastructure layer
Cloud service providers need to build a number of cloud data centers around the world to introduce services to the cloud users. So, service resources must be deployed efficiently at different locations of cloud data centers around the world. In addition, other cloud providers can be connected with each other to compose a cloud federation that provides more efficient cloud services. Thus, when scheduling in the infrastructure layer, issues such as scheduling data routing and cloud federation should be addressed and resolved [42]. Cloud federations interconnect the cloud computing environments of two or more service providers to participate together and deliver their services to the cloud users as a single service. In cloud computing environment, data routing is to find cloud resources fast in a multi-cloud environment [43].

Task-Resource Scheduling Problem Formulation
In cloud computing, task scheduling optimization should define the optimal number of required systems so that the total cost is minimized. Assume that there are n tasks whose execution time on each processing machine is known, and that they should be processed on m available computational resources. The goal is to maximize the utilization of the available resources and minimize the total execution time. Assume that the number of tasks is greater than the number of available resources (n > m), and that tasks are not allowed to migrate between resources [44]. To formulate the problem, consider the set of tasks Ti = {1, 2, …, n}, where n is the number of independent tasks, and the set of resources Rj = {1, 2, …, m}, where m is the number of computational resources. The cloud resource scheduling problem is then to find an optimal mapping (OM) of tasks (Ti) to resources (Rj), OM: Ti → Rj. The definition of this problem is depicted in Figure 2, where two or more tasks may share one resource [45].
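The formulation above can be illustrated with a small sketch that evaluates one candidate mapping of tasks to resources. The function name `evaluate_mapping`, the example execution-time matrix, and the definition of utilization as the average busy fraction of the resources over the makespan are assumptions made for illustration; they are not taken from the reviewed works.

```python
from typing import List, Tuple

def evaluate_mapping(exec_time: List[List[float]], mapping: List[int]) -> Tuple[float, float]:
    """Evaluate a task-to-resource mapping.

    exec_time[i][j] is the known execution time of task i on resource j.
    mapping[i] is the resource assigned to task i (no migration).
    Returns (makespan, utilization).
    """
    m = len(exec_time[0])
    busy = [0.0] * m                      # total work placed on each resource
    for task, resource in enumerate(mapping):
        busy[resource] += exec_time[task][resource]
    makespan = max(busy)                  # finish time of the most loaded resource
    # Assumed definition: mean fraction of the makespan each resource is busy.
    utilization = sum(busy) / (m * makespan) if makespan > 0 else 0.0
    return makespan, utilization

# Hypothetical instance with n = 3 tasks and m = 2 resources (n > m).
exec_time = [[4, 6], [3, 5], [2, 2]]
mapping = [0, 1, 0]                       # tasks 0 and 2 share resource 0
makespan, utilization = evaluate_mapping(exec_time, mapping)
print(makespan, round(utilization, 3))    # 6.0 0.917
```

A scheduling technique can then be viewed as a search over such mappings for one that lowers the makespan while keeping utilization high.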

Optimization Criteria
This section explains the parameters used to measure the effectiveness of scheduling. The existing works have addressed different kinds of optimization criteria such as makespan, cost, budget, deadline, resource utilization, throughput, load balancing, and energy efficiency. Generally, these optimization criteria are categorized into two groups according to whose interest they serve: cloud user desires and cloud service provider desires (Figure 3) [47]. These criteria are addressed in most of the reviewed works, so this work compares how they have been studied.

User Desire Criteria
• Makespan (completion time): Makespan is defined as the completion time of the last task that is required to complete and leave the cloud system [48].
• Cost: cost is the total amount the user pays to a service provider on the basis of their resource usage [49].
• Budget: it indicates the constraint of completing the tasks within the budget [50].
• Deadline: it represents the termination of running tasks at a certain time [51].

Provider Desire Criteria
• Resource utilization: making the most of the available resources and keeping resources as busy as possible. It is useful for service providers, who gain by leasing their finite resources to cloud users on demand [52].
• Throughput: it measures the number of completed tasks per unit time [53].
• Load balancing: load balancing in cloud computing is the even distribution of load between the VMs over physical resources. Many techniques have been introduced by the authors in [54][55][56].
• Energy efficiency: energy efficiency can be defined as a reduction of the energy consumed by a task [57].

Figure 3. Optimization Criteria
Optimization problems can be divided into discrete (combinatorial) and continuous problems. The decision variables of a combinatorial problem take discrete values, while the decision variables of a continuous optimization problem can take values within the domain of real values (Ri) [58,59]. According to the number of criteria involved, optimization problems can also be divided into single-criterion and multi-criteria problems. The task of single-criterion optimization is to find the optimal solution according to only one criterion function. When the optimization problem involves more than one criterion function, the task is to find one or more optimal solutions regarding each criterion. Here, a solution which is good with respect to one criterion can be worse for another, and vice versa [60]. Therefore, the goal of multi-criteria optimization is to find a set of solutions that are optimal with respect to all criteria. Notably, most real-world problems are multi-criteria. Nowadays, there exist optimization techniques that search for solutions using meta-heuristic and heuristic search, applying stochastic and deterministic search principles. If an algorithm successfully solves all instances of a problem P, then we can say that it is capable of solving that problem. Usually, we are interested in which technique solves the problem more efficiently. Normally, the term efficiency is connected with the computer resources (space and time) that are occupied by running a technique [61,62]. Generally, the most efficient technique is the one that finds the solution to the problem in the fastest way. In practice, the time complexity of an algorithm is not measured by the actual time needed to solve the problem on a concrete computer, because such a measurement lacks a common basis for comparison.
The same algorithm could be run on different operating systems or even on different hardware configurations. Therefore, the algorithm's complexity is measured in a way that relates it to the amount of input data necessary for the problem description. The time complexity of an algorithm determines the way in which an increase in the instance size influences the running time. This relation can be expressed with the so-called asymptotic time complexity function O(f(n)), which determines the upper bound of time complexity for problem P. For example, O(n^2) denotes that as the instance size n increases, the running time grows at most proportionally to n^2. The algorithmic theory divides problems, with regard to the asymptotic time complexity function, into two classes: NP-hard and P. For problems in the first class, the known exact algorithms exhibit exponential time complexity, such as O(2^n), and these problems are therefore treated as "complicated." That is, even a modest increase in the input data can increase the solution time exponentially. In the worst case, we could be waiting for the solution for an impractically long time. On the other hand, problems of class P have polynomial time complexity O(n^k) and are treated as "simple" [62].
EAI Endorsed Transactions on Cloud Systems 11 2019 - 05 2020 | Volume 6 | Issue 17 | e4
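To make the contrast between polynomial and exponential growth concrete for the scheduling problem, the following sketch counts the candidate mappings of n independent tasks onto m resources (m^n of them) and finds the best makespan by exhaustive enumeration. The function names and the tiny instance are hypothetical; exhaustive search is shown only to illustrate why it does not scale, not as a method used by the reviewed works.

```python
from itertools import product

def count_mappings(n_tasks, m_resources):
    """Number of ways to assign n independent tasks to m resources."""
    return m_resources ** n_tasks

def brute_force(exec_time):
    """Try every mapping and return (best_mapping, best_makespan)."""
    n, m = len(exec_time), len(exec_time[0])
    best_mapping, best_makespan = None, float("inf")
    for mapping in product(range(m), repeat=n):
        busy = [0.0] * m
        for task, resource in enumerate(mapping):
            busy[resource] += exec_time[task][resource]
        if max(busy) < best_makespan:
            best_mapping, best_makespan = mapping, max(busy)
    return best_mapping, best_makespan

# Polynomial (n^2) versus exponential (4^n) growth for m = 4 resources.
for n in (5, 10, 20, 40):
    print(n, n ** 2, count_mappings(n, 4))

print(brute_force([[4, 6], [3, 5], [2, 2]]))
```

With only 40 tasks and 4 resources the search space already exceeds 10^24 mappings, which is why heuristic and meta-heuristic searches are preferred for realistic instance sizes.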

Task Scheduling Techniques
There are several types of scheduling techniques for distributed computing systems. Figure 4 depicts the three main types of scheduling techniques, namely meta-heuristic techniques, traditional techniques, and heuristic techniques. The meta-heuristic techniques are classified into two categories: Swarm Intelligence (SI) and Bio-Inspired (BI) [62].

Traditional Techniques:
Traditional techniques [63] are simple, fast, deterministic and obtain exact solutions [64]. However, they are not efficient at solving the optimality problem in many situations [65], so traditional techniques are not feasible for scheduling in the cloud environment [66]. Many works have been carried out to improve the implementation of the traditional techniques [63][67][68][69][70]. Round robin (RR) is one of these techniques and works using a time slice, or quantum. The RR algorithm has the drawback that it uses a static time quantum [67]. The CPU scheduling proposed in [68] relies on round-robin scheduling but changes the way the scheduling calculations are made. Rather than using a static time quantum, it substantially decreases the waiting time and turnaround time compared with simple RR scheduling. The first come first served (FCFS) algorithm means that the task that arrives first is executed first. Researchers in [69] proposed an algorithm for task scheduling based on fuzzy clustering algorithms to increase resource utilization and minimize task execution time. Shortest job first (SJF) is a scheduling technique that depends on the execution time of the task. The tasks are queued by priority: the task with the longest time is placed last with the lowest priority, and the task with the shortest time is placed first with the highest priority [71]. In this algorithm, the CPU is assigned to the task with the least burst time. Elmougy et al. [70] proposed a hybrid algorithm of RR and SJF called SRDQ. This algorithm uses a dynamic, variable task quantum.
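The three traditional policies discussed above can be compared with a minimal sketch that computes per-task waiting times on a single CPU. It assumes that all tasks arrive at time 0 and that context switches are free; the burst times and the quantum are illustrative values, not data from the cited works.

```python
from collections import deque

def fcfs_waiting(bursts):
    """First come first served: tasks run in arrival (list) order."""
    waits, clock = [], 0
    for burst in bursts:
        waits.append(clock)
        clock += burst
    return waits

def sjf_waiting(bursts):
    """Shortest job first: the task with the least burst time runs first."""
    order = sorted(range(len(bursts)), key=lambda i: bursts[i])
    waits, clock = [0] * len(bursts), 0
    for i in order:
        waits[i] = clock
        clock += bursts[i]
    return waits

def rr_waiting(bursts, quantum):
    """Round robin with a static time quantum."""
    remaining = list(bursts)
    finish = [0] * len(bursts)
    queue = deque(range(len(bursts)))
    clock = 0
    while queue:
        i = queue.popleft()
        run = min(quantum, remaining[i])
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            queue.append(i)       # not finished: go to the back of the queue
        else:
            finish[i] = clock
    # Waiting time = completion time minus the time actually spent running.
    return [finish[i] - bursts[i] for i in range(len(bursts))]

bursts = [24, 3, 3]
print(fcfs_waiting(bursts))       # [0, 24, 27]
print(sjf_waiting(bursts))        # [6, 0, 3]
print(rr_waiting(bursts, 4))      # [6, 4, 7]
```

On this instance the average waiting time drops from 17 under FCFS to 3 under SJF, while RR lies in between and depends on the chosen quantum, which is the parameter the dynamic-quantum variants try to tune.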

Heuristic Techniques:
These techniques use a sample space of random solutions to find an optimal or near-optimal solution [66]. Many heuristic techniques exist, such as min-min, priority-based min-min, max-min, and enhanced max-min [72]. These techniques give better results than the traditional techniques, but they are not guaranteed to rank highly in cloud scheduling [73].
The solutions resulting from heuristic techniques often get stuck in local minima [66]. An improved Max-Min technique that uses the expected execution time, instead of the completion time, as the selection basis is proposed in [74]. It allocates a task with average execution time. The algorithm increases the chance of synchronous assignment of tasks on resources. The basic Min-Min algorithm is a straightforward and effective algorithm that generates good schedules in terms of reducing task completion time. However, its biggest drawback is load balancing, which is considered to be one of the major challenges for cloud service providers. The authors in [75] improved load balancing by proposing the Load Balance Improved Min-Min (LBIMM) algorithm. LBIMM is designed on the basis of the Min-Min algorithm in order to increase resource utilization and decrease completion time.
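A minimal sketch of the textbook Min-Min and Max-Min heuristics may help to show where the load-balancing drawback comes from. It follows the commonly described form of both heuristics rather than the improved variants of [74] or [75]; the expected-time-to-compute matrix `etc` and the function names are assumptions for illustration.

```python
def greedy_schedule(etc, pick_largest=False):
    """Min-Min (pick_largest=False) or Max-Min (pick_largest=True).

    etc[i][j] is the expected execution time of task i on resource j.
    Returns (mapping, makespan).
    """
    n, m = len(etc), len(etc[0])
    ready = [0.0] * m                 # time at which each resource becomes free
    mapping = [-1] * n
    unscheduled = set(range(n))
    while unscheduled:
        # For every unscheduled task, find its earliest completion time.
        best = {}
        for i in unscheduled:
            j = min(range(m), key=lambda r: ready[r] + etc[i][r])
            best[i] = (ready[j] + etc[i][j], j)
        # Min-Min takes the task that can finish soonest, Max-Min the latest.
        chooser = max if pick_largest else min
        task = chooser(sorted(best), key=lambda i: best[i][0])
        completion, resource = best[task]
        mapping[task] = resource
        ready[resource] = completion
        unscheduled.remove(task)
    return mapping, max(ready)

def min_min(etc):
    return greedy_schedule(etc, pick_largest=False)

def max_min(etc):
    return greedy_schedule(etc, pick_largest=True)

# Hypothetical instance: three short tasks and one long task on two resources.
etc = [[2, 3], [3, 4], [4, 5], [10, 12]]
print(min_min(etc))   # ([0, 1, 0, 0], 16.0)
print(max_min(etc))   # ([0, 1, 1, 0], 12.0)
```

On this instance Min-Min schedules the short tasks first and leaves the long task to be appended to an already loaded resource, whereas Max-Min places the long task early and fills the other resource with short tasks, giving a more balanced load.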

Meta-heuristic Techniques:
The problem of allocating tasks to resources in a cloud computing environment is an NP-hard problem. Therefore, task scheduling is addressed using meta-heuristic and heuristic techniques to obtain near-optimal or optimal solutions. Heuristic techniques are subsets of meta-heuristic techniques. Meta-heuristic techniques are often nature-inspired, for example by the social behavior of insects [76]. The term metaheuristic was introduced by Fred Glover in 1986; the prefix "meta" means higher level and "heuristic" means to discover by trial and error. We adopt a clear definition of the word metaheuristic from [77]: "a metaheuristic is a high-level problem independent algorithmic framework that provides a set of guidelines or strategies to develop heuristic optimization algorithms. The term is also used to refer to a problem-specific implementation of a heuristic optimization algorithm according to the guidelines expressed in such framework". All meta-heuristic techniques have two main components: intensification and diversification. Diversification generates assorted solutions to explore the search space more thoroughly on the global scale, while intensification concentrates the search on the local scale by using local information in the search process to generate better solutions. Thus, the current local information can be derivative of the target. Because heuristic techniques often get stuck in local optima, meta-heuristic techniques have proved most effective in averting this situation, as mentioned in [66,78,79]. Meta-heuristic techniques are classified into two categories: swarm intelligence (SI) and bio-inspired. Bio-inspired computing has penetrated almost all areas of science, including data mining, biomedical engineering, control systems, and parallel processing. There are many bio-inspired algorithms such as MA, GA, and the imperialist competitive algorithm (ICA). Swarm intelligence is a comparatively new approach to solving unconstrained optimization problems and is inspired by the social behavior of insect colonies and other animals; examples include PSO, ACO, artificial bee colony (ABC), glowworm swarm algorithm (GSA), BA, firefly algorithm (FA), cuckoo search (CS), and cat swarm optimization (CSO).

Figure 4. Scheduling techniques: heuristic, meta-heuristic, and traditional
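The interplay of intensification and diversification can be sketched with a small, generic genetic algorithm for independent task scheduling. This is an illustrative sketch under simple assumptions (makespan as the only criterion, one-point crossover, uniform random mutation, a fixed random seed, at least two tasks and a population of at least four); it is not the method of any of the reviewed works, and all names and parameter values are hypothetical.

```python
import random

def makespan(etc, mapping):
    """Finish time of the most loaded resource for a given mapping."""
    busy = [0.0] * len(etc[0])
    for task, resource in enumerate(mapping):
        busy[resource] += etc[task][resource]
    return max(busy)

def ga_schedule(etc, pop_size=20, generations=50, mutation_rate=0.1, seed=0):
    """Generic GA sketch; requires len(etc) >= 2 and pop_size >= 4."""
    rng = random.Random(seed)
    n, m = len(etc), len(etc[0])
    # Each chromosome is a mapping: gene i holds the resource of task i.
    pop = [[rng.randrange(m) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda c: makespan(etc, c))
        history.append(makespan(etc, pop[0]))
        elite = pop[: pop_size // 2]          # intensification: keep the best half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)         # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(n):                # diversification: random reassignment
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(m)
            children.append(child)
        pop = elite + children
    best = min(pop, key=lambda c: makespan(etc, c))
    return best, makespan(etc, best), history

# Hypothetical instance: 8 tasks, 3 resources of decreasing speed.
etc = [[(i + 1) * (j + 1) for j in range(3)] for i in range(8)]
best, best_makespan, history = ga_schedule(etc)
print(best, best_makespan)
```

Because the best half of each generation is carried over unchanged, the best makespan never worsens from one generation to the next, while mutation keeps introducing mappings that the current population does not contain.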
Researchers always try to find better algorithms, especially for scheduling tasks in cloud computing. We present here a comparative analysis of these algorithms based on diverse optimization criteria that reinforce the intensification of the search space. Researchers in [80] presented an algorithm for independent task scheduling in grid computing that combines PSO with gravitational emulation local search (GELS) to avoid the local minima problem. The combined PSO-GELS algorithm shows a significant reduction in makespan. A novel PSO algorithm based on a hyper-heuristic algorithm was presented in [81] for secure task scheduling in the grid environment. The hyper-heuristic algorithm reduces both makespan and cost.
A task scheduling algorithm based on a double-fitness adaptive algorithm, the job spanning time and load balancing genetic algorithm (JLGA), is introduced in [82] for distributing load between VMs, reducing energy, and minimizing makespan. This algorithm uses a greedy algorithm to initialize the population, and it uses adaptive probabilities for crossover and mutation instead of fixed values. In [83], the authors propose a hybrid PSO (HPSO), which combines the PSO and TS algorithms, with tabu search providing the local search. HPSO enhances a randomly generated population by separating it into two equal parts: one part is improved using PSO and the other using TS. The two parts are then merged again to exchange the global and local best positions of the particles. HPSO minimizes the makespan and optimizes resource utilization. Raghavan et al. [84] solved the workflow scheduling problem in the cloud using the bat algorithm, which gives better processing cost results than the Best Resource Selection (BRS) algorithm. To enhance the intensification of the search space, a combination of the PSO and CS algorithms is presented in [78]. The hybrid PSOCS algorithm achieves good resource utilization and makespan reduction for independent task scheduling in cloud computing. The authors in [85] reduce the premature convergence of PSO and improve its local search ability by applying a hill climbing algorithm after each iteration. The resulting hybrid GHPSO algorithm handles discrete problems by using the mutation and crossover strategies of a genetic algorithm. GHPSO is used to minimize cost.
Researchers in [86] applied PSO to minimize the execution cost of running workflow applications on the cloud. Standard PSO generates the initial population randomly, while the algorithm proposed in [87] generates the initial population of particles based on the shortest job to fastest processor (SJFP) algorithm. Researchers in [88] combined PSO with a tabu search mechanism (PSOTBM) for independent task scheduling in cloud computing; PSOTBM shows a reduction in energy consumption of up to 67.5%. A novel approach presented in [89] uses a family genetic algorithm (FGA) to increase resource utilization by effectively assigning VMs to suitable physical machines. CSO-GA [90] is a combination of the CSO and GA algorithms; this hybrid algorithm improves the makespan in comparison with other scheduling techniques. The researchers in [54] proposed a novel ant colony based algorithm that reduces response time by balancing load through a search for under-loaded nodes. This algorithm uses FCFS to allocate the tasks to VMs. To produce optimal solutions for the grid scheduling problem, the author in [91] used a tree representation of GA solutions for mapping VMs to physical machines. The authors of [92] address energy saving and revenue maximization for the service provider; they presented a multi-metric genetic algorithm for independent task scheduling that considers makespan, cost, and energy efficiency. Researchers in [93] reduce the computational time of PSO and enhance its convergence rate by introducing an approach called MHPSO, a combination of the mutation-based PSO algorithm (MPSO) and the standard hierarchical PSO algorithm (HPSO).
To balance the load and maximize resource utilization over hosts in data centers, the authors in [94] presented a novel power-aware load balancing method called imperialist competitive algorithm-minimum migration time (ICA-MMT). This method reduces energy consumption in cloud computing data centers. In [95], the authors combined the bee colony and PSO algorithms in an approach called parallel bee colony optimization particle swarm optimization (PBCOPSO). PBCOPSO shows a significant improvement in minimizing the makespan and maximizing resource utilization. A novel load balancing algorithm based on a genetic algorithm is introduced in [55] for scheduling independent tasks. This approach provides load balancing and efficient utilization of resources. In [96], the population of particles is initialized so as to provide efficient utilization of resources and to complete the tasks within a minimum period of time. A hybrid approach called FUGE is introduced in [97]; it uses GA and fuzzy theory to perform optimal load balancing considering execution time and cost. The contribution of the authors in [98] is to merge the imperialist competitive and local search optimization algorithms. This algorithm addresses the reliability issue as well as makespan, and it showed better performance than ant colony optimization and genetic algorithms. A load balancing approach based on a genetic algorithm is introduced in [99] for the cloud computing environment to balance load between VMs and reduce dynamic VM migration. Many other works have applied particle swarm optimization to the problem of task scheduling, such as [56,85,86,100,101]. The authors in [53] described and evaluated a cloud scheduler based on ACO; they handled the problem of response time and balancing throughput on a private cloud when various cloud users are executing their experiments.
A hybrid GA-PSO algorithm is presented in [102]; it selects VMs based on speed and the workflow of the task. This algorithm improves load balancing and reduces makespan and cost. Kamaljit et al. [103] proposed a novel context- and load-aware methodology for efficient task scheduling using a modified genetic algorithm known as the family genetic algorithm.
Researchers in [104] proposed an algorithm based on the bat algorithm (BA) for solving the workflow scheduling problem in cloud computing with the objective of reducing the makespan. They implemented BA in MATLAB and compared the results with two popular existing algorithms, namely CSO and PSO. S. A. Hamad and F. A. Omara [105] proposed a task scheduling algorithm based on a modified GA. They overcome the limitation of the population size by using the tournament selection method to select the best chromosomes. Researchers in [106] presented a static task scheduling technique based on the PSO algorithm. They improved PSO using a honeybee load balancing technique to increase resource utilization and decrease makespan. The hybrid task scheduling method in [107] uses the PSO and hill climbing algorithms. In this algorithm, the initial population is randomly distributed using PSO; some particles are then selected and hill climbing is applied to them. This technique optimizes the makespan. The priority-based task scheduling algorithm in [108], called HGPSO, combines the PSO and GA algorithms. In HGPSO, the tasks are first arranged in a priority queue, and then the HGPSO algorithm is applied. HGPSO performs well in terms of completion time, scalability, and availability compared to the genetic and particle swarm optimization algorithms. Researchers in [109] presented the HTSCC algorithm, which combines the strengths of the GA and PSO algorithms to increase resource utilization and decrease makespan. The HTSCC algorithm is implemented and simulated using the CloudSim simulator. The simulation results show that HTSCC outperforms the GA and PSO algorithms by decreasing the makespan and increasing resource utilization. In [110], the researchers presented the MSDE algorithm, which improves the performance of the Moth Search Algorithm (MSA) using differential evolution (DE), for task scheduling and global optimization problems.
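The idea of refining a randomly initialized schedule with hill climbing, as described for [107], can be sketched as follows. This is not the authors' implementation: the PSO phase is replaced here by plain random initialization, and the neighbourhood move (reassigning a single task to another VM) as well as the names `makespan` and `hill_climb` are assumptions made for illustration.

```python
import random


def makespan(assignment, task_lengths, vm_speeds):
    """Finish time of the busiest VM for a task -> VM assignment."""
    load = [0.0] * len(vm_speeds)
    for task, vm in enumerate(assignment):
        load[vm] += task_lengths[task] / vm_speeds[vm]
    return max(load)


def hill_climb(assignment, task_lengths, vm_speeds):
    """Move one task at a time to another VM while the makespan improves."""
    current = list(assignment)
    best = makespan(current, task_lengths, vm_speeds)
    improved = True
    while improved:
        improved = False
        for task in range(len(current)):
            best_vm = current[task]
            for vm in range(len(vm_speeds)):
                if vm == best_vm:
                    continue
                current[task] = vm
                candidate = makespan(current, task_lengths, vm_speeds)
                if candidate < best:
                    best, best_vm, improved = candidate, vm, True
            current[task] = best_vm  # keep the best VM found for this task
    return current, best


rng = random.Random(0)  # fixed seed so the sketch is reproducible
task_lengths = [400, 100, 300, 200, 250]  # hypothetical, million instructions
vm_speeds = [500, 1000]                   # hypothetical, MIPS
start = [rng.randrange(len(vm_speeds)) for _ in task_lengths]
refined, refined_makespan = hill_climb(start, task_lengths, vm_speeds)
```

Because a move is accepted only when it strictly lowers the makespan, the refined schedule is never worse than the starting one, although it may stop at a local optimum.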
To handle the starvation problem, the researchers in [111] proposed a hybrid shortest-longest scheduling algorithm. They considered the capabilities of each VM and the length of each task in order to allocate tasks to the most suitable VMs, thereby overcoming the starvation problem while satisfying both provider and user requirements. A new hybrid QoS-based task scheduling algorithm is introduced in [112] to schedule dependent and independent tasks in a cloud environment. This work can be extended by taking communication cost and energy efficiency into account in the hybrid task scheduling algorithm.
The comparison of these algorithms is summarized in Table 2. The comparison considers the optimization metrics, nature of task, experimental scale and simulation environment.

Discussion
In this section, we provide an analysis and discussion of meta-heuristic techniques for scheduling tasks in cloud computing. The discussion is based on optimization criteria and classification of meta-heuristic techniques.
These techniques are analysed with regard to the different optimization criteria mentioned in Section 3.4. Figure 5 depicts the criteria considered by the different scheduling techniques. The most commonly used criteria are observed to be makespan (33%), cost (18%), load balancing (16%), deadline (9%), and energy efficiency (9%). Less attention is given to budget (4%), throughput (4%), and resource utilization (7%). Some of these criteria are preferred by users and others by providers. Makespan, cost, budget, and deadline are preferred by users, while resource utilization, throughput, load balancing, and energy efficiency are preferred by providers. Minimizing the makespan is important for cloud users who wish to speed up the execution of their applications.
Decisions in scheduling algorithms are mostly based on the makespan metric of the applications. Thus, minimizing the makespan has become a major concern of researchers seeking to enhance performance. Cost comes next in the researchers' interest. Various types of costs have been considered in the literature, such as data storage, data transfer, data renting, communication and computation costs (Table 3). Because scheduling spreads tasks over resources in different data centers belonging to many cloud service providers, data transfer cost increases and data management becomes more complicated. The cost also increases if data are stored on permanent storage. On the other hand, the cost is negligible if the data are stored in a single data center of the same cloud provider or shared between multiple applications.
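To make the cost components concrete, the sketch below adds a computation term and a data transfer term for a given task-to-VM assignment. The pricing model (a per-second VM price, and a per-gigabyte transfer price charged only when a task runs outside the data center that holds its data) is an assumption for illustration, as are all names and numbers; none are taken from the surveyed papers.

```python
def total_cost(assignment, task_lengths, task_data_gb, vms, data_home,
               transfer_price_per_gb):
    """Computation cost plus transfer cost for a task -> VM assignment.

    Assumed model: transfer is charged only when a task runs in a data
    center other than the one holding its data.
    """
    cost = 0.0
    for task, vm_index in enumerate(assignment):
        vm = vms[vm_index]
        runtime = task_lengths[task] / vm["mips"]     # seconds
        cost += runtime * vm["price_per_second"]      # computation cost
        if vm["datacenter"] != data_home:
            cost += task_data_gb[task] * transfer_price_per_gb  # transfer cost
    return cost


# Hypothetical VMs in two data centers; the input data lives in "dc-a".
vms = [
    {"mips": 500, "price_per_second": 0.5, "datacenter": "dc-a"},
    {"mips": 1000, "price_per_second": 1.0, "datacenter": "dc-b"},
]
task_lengths = [1000, 2000]  # million instructions
task_data_gb = [2.0, 4.0]

local_cost = total_cost([0, 0], task_lengths, task_data_gb, vms, "dc-a", 0.25)
remote_cost = total_cost([1, 1], task_lengths, task_data_gb, vms, "dc-a", 0.25)
```

In this toy setting both assignments have the same computation cost, so the difference between `local_cost` and `remote_cost` comes entirely from data transfer.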
Resource utilization and load balancing of virtual machines are among the most significant factors considered by service providers. Switching off VMs reduces energy consumption in data centers, which is an important factor for service providers. Only a few researchers have considered budget and throughput in task scheduling. Each of them accounts for only 4% of the total number of approaches, as depicted in Figure 5.
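The provider-side criteria can be illustrated with a small sketch that computes per-VM load and an average utilization figure. The definition used here (mean VM busy time divided by the makespan) is one common choice and is an assumption; the surveyed papers may define utilization differently. The names `vm_loads` and `average_utilization` are hypothetical.

```python
def vm_loads(assignment, task_lengths, vm_speeds):
    """Total busy time of each VM for a task -> VM assignment."""
    load = [0.0] * len(vm_speeds)
    for task, vm in enumerate(assignment):
        load[vm] += task_lengths[task] / vm_speeds[vm]
    return load


def average_utilization(loads):
    """Mean busy time divided by the makespan (assumed definition).

    Assumes at least one task has been assigned, so max(loads) > 0.
    """
    return sum(loads) / (len(loads) * max(loads))


task_lengths = [200, 400, 400, 200]  # hypothetical, million instructions
vm_speeds = [100, 200]               # hypothetical, MIPS

balanced = average_utilization(vm_loads([0, 1, 1, 0], task_lengths, vm_speeds))
skewed = average_utilization(vm_loads([0, 0, 0, 0], task_lengths, vm_speeds))
```

In the skewed assignment the second VM receives no work at all, which is the kind of idle VM a provider could switch off to save energy.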
Meta-heuristic techniques such as PSO, GA, ICA and ACO are the most frequently addressed techniques. Meta-heuristic techniques are classified into swarm intelligence and bio-inspired approaches (see Figure 6). Among swarm intelligence approaches, PSO is the most widely used by researchers in the cloud scheduling domain. Correspondingly, among bio-inspired approaches, GA is the most widely used.

Table 3
Approach            Types of cost
[113]               Data renting cost
[114]               Data storage, data transfer cost
[86,115,116]        Computation, transfer cost
[117]               Data storage, computation, data transfer cost
[118]               Communication, computation, transfer cost
[51,119,120,121]    Computation cost
[122]               Communication, computation cost

The frequency of optimization criteria based on the nature of the task is depicted in Figure 7. The figure shows deadline, throughput, and budget for dependent tasks, and makespan, load balancing, energy efficiency, and resource utilization for independent tasks. Meta-heuristic techniques are intended for global optimization, though they are not always successful or efficient, so researchers combine them with other techniques to improve efficiency. Intensification and diversification are the main components of meta-heuristic techniques: diversification explores the search space globally, while intensification searches promising local regions. A well-balanced combination of these components helps to obtain global optimality, but finding this balance requires many techniques to be explored. Most researchers implement hybrid meta-heuristic techniques in the CloudSim simulator [37] to evaluate performance against the optimization criteria.
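As one illustration of balancing diversification and intensification, the sketch below runs a basic continuous PSO whose inertia weight decreases linearly, so that early iterations explore widely and later ones refine around the best known positions. It minimizes a toy sphere function rather than a scheduling objective, and the parameter values and names are assumptions chosen for illustration, not taken from any surveyed paper.

```python
import random


def sphere(position):
    """Toy objective: sum of squares, minimized at the origin."""
    return sum(x * x for x in position)


def pso(objective, dim=2, particles=10, iterations=50, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [list(p) for p in pos]
    gbest = list(min(pbest, key=objective))
    initial_best = objective(gbest)
    for step in range(iterations):
        # High inertia early (diversification), low inertia late (intensification).
        w = 0.9 - 0.5 * step / (iterations - 1)
        for i in range(particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + 2.0 * rng.random() * (pbest[i][d] - pos[i][d])
                             + 2.0 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            # Update the personal best, then the global best if it improved.
            if objective(pos[i]) < objective(pbest[i]):
                pbest[i] = list(pos[i])
                if objective(pbest[i]) < objective(gbest):
                    gbest = list(pbest[i])
    return gbest, initial_best, objective(gbest)


best_position, initial_best, final_best = pso(sphere)
```

The global best is only replaced by strictly better positions, so `final_best` can never exceed `initial_best`; how quickly it improves depends on the inertia schedule and the acceleration coefficients.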

Conclusion and future work
In this paper, we introduced the main concepts of scheduling and resource management in cloud computing. Moreover, we presented a comparative analysis of meta-heuristic scheduling techniques in cloud computing by considering the optimization criteria, the nature of the tasks, user and provider preferences, and the simulation environment. According to the reviewed literature, we concluded that most works are based on popular meta-heuristic techniques such as the PSO, GA, and ACO algorithms. The most used techniques for scheduling are observed to be PSO among swarm intelligence approaches and GA among bio-inspired approaches. Moreover, other algorithms such as ICA, CSO, BA, and ABC have been used in task scheduling, but less often.
Finally, we conclude that the makespan is the most studied criterion in the literature. In future work, we plan to develop a new model for failure handling using hybrid meta-heuristic techniques.