Ant Colony-based Tabu List Optimization for Minimizing the Number of Vehicles in Vehicle Routing Problem with Time Window Constraints

The Vehicle Routing Problem consists in finding a routing plan for vehicles of identical capacity that satisfies the demands of a set of customers. Time window constraints mean that each customer can only be served within a pre-defined time window. Researchers have studied this problem intensively because of its wide range of applications in logistics. In this paper, we tackle the problem from an economic point of view, with a focus on capital expenditure (CAPEX), where minimizing the number of vehicles is more important than minimizing the total traveling distance. This setting applies to scenarios with limited CAPEX or seasonal/temporary operations. In these cases, CAPEX should be minimized as much as possible to reduce the overall cost of the operation while satisfying the time window constraints. We propose an Ant Colony Optimization-based Tabu List (ACOTL) approach. We test the proposed approach on the well-known Solomon benchmarks. We compare the experimental results with Dynamic Programming on small instances and then with the best-known results in the literature on large instances. Compared with the best-known results, ACOTL reduces the number of vehicles used, sometimes by up to three units, especially for instances where customers are grouped in randomly distributed geographical clusters and vehicles have low or medium loads.


Introduction
Every year, many logistics companies earn large revenues from distribution processes. Distribution processes play an important role in supply chains, since almost half of the total supply chain cost comes from transportation [1]. For this reason, the management of distribution processes is critical in minimizing the total supply chain cost. According to Toth and Vigo, transportation cost represents between 10 and 20 percent of the price of goods on the market, and computerized procedures based on optimization techniques make it possible to save around 5 to 20 percent of this transportation cost [2]. Transportation cost includes capital expenditures (CAPEX), which depend mainly on the cost and the number of vehicles, and operational expenditures (OPEX), which depend mainly on the total distance traveled by the set of vehicles. The determination of the number of vehicles and of the route for each vehicle is known as the Vehicle Routing Problem.
The Vehicle Routing Problem (VRP) is a common name given to a class of combinatorial problems involving sets of customers that should be served by several vehicles [3]. The VRP can model various real-life supply chain management problems involving the physical delivery of goods and services, such as postal deliveries, school bus routing, recycling routing and so on [4]. There are several variants of this problem. They are formulated based on the nature of the transported goods, the quality of service required, and the characteristics of the vehicles and the customers.
In real-life scenarios, an important characteristic of customers is the time window during which a customer can be served. This characteristic extends the classic VRP to the well-known Vehicle Routing Problem with Time Window constraints (VRPTW). In VRPTW, routes must contain all the points (customer locations). Each point is visited within its time window by a single vehicle. Each route is associated with a vehicle and starts and ends at the depot. In addition, the total demand of all points on a route must be less than or equal to the capacity of the vehicle. Figure 1 presents an example of a VRPTW solution involving a depot and eight customers (nodes 1 to 8).
Approaches proposed to solve VRPTW usually try to optimize both CAPEX (number of vehicles) and OPEX (total traveled distance). In some scenarios, however, the CAPEX is limited, and reducing the number of vehicles by just one unit can make the set of vehicles affordable for the company. Moreover, some distribution processes can only be performed during a given period or season because of environmental conditions or the availability of products. In such cases, reducing the CAPEX can drastically reduce the overall cost of the operation. This paper focuses on such scenarios, which are usually observed during agricultural campaigns, mainly in sub-Saharan Africa. To tackle this issue, an Ant Colony Optimization-based Tabu List approach is proposed, which combines two well-known optimization approaches.
The rest of the paper is organized as follows. Section 2 briefly presents related works on VRPTW. The problem formulation is defined in Section 3, followed by the presentation of the proposed Ant Colony Optimization-based Tabu List approach in Section 4. Section 5 presents simulation results and the comparison with the best-known solutions in the literature, before ending with conclusions and future directions.

Related Works
The Vehicle Routing Problem with Time Window constraints is classified as an NP-hard combinatorial optimization problem [5]. Consequently, approaches based on meta-heuristics are commonly used for larger instances of the VRPTW.
Most researchers model VRPTW as a multi-objective optimization problem with the aim of minimizing both the number of vehicles and the total traveled distance [6], while others consider minimizing the number of vehicles as the primary objective, as in [7]. In general, a two-phase approach is proposed, starting with the minimization of the number of vehicles and ending with the minimization of the total traveled distance with a fixed number of routes.
Other works proposed new objective functions in VRPTW, including the minimization of the total waiting time [8].
Meta-heuristics are usually developed for solving the multi-objective VRPTW since the problem is NP-hard. They work on a set of candidate solutions, which entails a high computation cost that depends on the input size, to achieve high performance in VRPTW. More details are available in [9].
Several population-based approaches have been developed for VRPTW, such as Genetic Algorithms [2] and Artificial Bee Colony [10]. Ant Colony Optimization has been applied to the Long-Distance VRP [11], and an Improved Ant Colony Optimization for the Multi-Depot Vehicle Routing Problem is presented in [12]. The authors of [13] proposed an incremental route building and an enhanced algorithm to tackle the VRP with soft time windows.
Some researchers have proposed exact approaches, such as restricted dynamic programming, to solve VRPTW [14]. However, this approach solves only small VRP instances because the problem is NP-hard.

Notations
We use the following notations to formulate the problem. Let $G = (V, E)$ be an undirected graph, where $V = \{v_i ;\ i = 0, \dots, N\}$ denotes a depot ($v_0$) and $N$ customers ($v_i ;\ i = 1, \dots, N$). A non-negative demand $q_i$ and a service time $s_i$ are associated with each $v_i$, with $q_0 = 0$ and $s_0 = 0$. $E$ is a set of arcs with non-negative weights $d_{ij}$ (which often represent distances) between $v_i$ and $v_j$, with $v_i, v_j \in V$ and $i < j$. We assume $a_0 = 0$ and $b_0 = 0$ for every vehicle $k$, where $a_i$ and $b_i$ denote the arrival time and the start of service at $v_i$.
The distance matrix is often assumed to be symmetric, i.e., $d_{ij} = d_{ji}$, and to satisfy the triangle inequality. All customer demands are served by a set of $K$ vehicles. At each customer $v_i$, the start of service $b_i$ must lie in the time window $[e_i ; l_i]$, where $e_i$ and $l_i$ are the earliest and latest times to serve $v_i$. If a vehicle arrives at $v_i$ at time $a_i < e_i$, a waiting time $w_i = \max\{0 ;\ e_i - a_i\}$ is observed. Consequently, the start of service is $b_i = \max\{a_i ;\ e_i\}$. Each vehicle of capacity $Q$ travels on a route connecting a subset of customers, starting from $v_0$ and ending within a schedule horizon $[e_0 ; l_0]$, corresponding to the earliest time of exit from the depot and the latest time of return to the depot.
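The waiting-time and start-of-service rules above can be illustrated with a minimal Python sketch. The function name `schedule_route` and the list-based data layout (index 0 for the depot) are assumptions made for this illustration and are not part of the original implementation.

```python
def schedule_route(route, travel_time, earliest, latest, service):
    """Compute arrival, waiting and start-of-service times along one route.

    `route` lists customer indices only; the depot is index 0 (assumed layout).
    Returns (arrival, waiting, start) lists, or None if a time window or the
    schedule horizon of the depot is violated.
    """
    arrival, waiting, start = [], [], []
    time, prev = earliest[0], 0            # leave the depot at its earliest time
    for node in route:
        a = time + travel_time[prev][node]
        if a > latest[node]:               # service cannot start inside the window
            return None
        w = max(0, earliest[node] - a)     # waiting time when arriving early
        b = max(a, earliest[node])         # start of service
        arrival.append(a)
        waiting.append(w)
        start.append(b)
        time, prev = b + service[node], node
    if time + travel_time[prev][0] > latest[0]:   # must be back before the horizon ends
        return None
    return arrival, waiting, start


# Small illustrative instance (values are made up).
travel_time = [[0, 5, 8], [5, 0, 4], [8, 4, 0]]
earliest = [0, 10, 12]
latest = [100, 20, 30]
service = [0, 3, 2]
result = schedule_route([1, 2], travel_time, earliest, latest, service)
```

In this toy instance the vehicle reaches customer 1 at time 5, waits 5 time units, and starts service at time 10.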

Model
The objective of the model is to minimize the number of vehicles $K$, subject to the following constraints. As defined in [20], constraint (1) ensures that, for each vehicle starting its tour from the depot, there is exactly one outgoing arc from this node. Similarly, constraint (2) guarantees that, for each vehicle $k$ ending its tour at the depot ($i = 0$), there is exactly one arc entering the node. Constraints (1) and (2) together guarantee a complete tour for each vehicle. Constraint (3) ensures that, from each node, only one arc is outgoing for each vehicle. Constraint (4) makes sure that, for each node $j$, only one arc is incoming for each vehicle. Constraints (3) and (4) [...]. A capacity constraint ensures that, for each vehicle, the total demand of the customers assigned to it does not exceed its capacity. Constraint (7) sets the arrival, waiting and service times at the depot to zero for each vehicle. Constraint (8) ensures that the sum of the arrival and waiting times at each node is within the time window (between the earliest and the latest arrival times at that node) for each vehicle $k = 1, 2, 3, \dots, K$. Constraint (9) ensures that the arrival time of each vehicle at each node $j$ is not greater than the specified arrival time at that node. Constraint (10) ensures that the total traveling time of each vehicle is not greater than the maximum route time allocated to that vehicle. This avoids any uncompleted tour.
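As a rough illustration of how the objective and the visiting and capacity constraints interact, the following Python sketch checks a candidate solution and returns its number of vehicles. The helper name `count_vehicles_if_feasible` and the representation of a solution as a list of routes are assumptions for this example; time window checks are deliberately left out.

```python
def count_vehicles_if_feasible(routes, demand, capacity):
    """Return the number of vehicles used, or None if the solution is invalid.

    `routes` holds one list of customer indices per vehicle (depot 0 excluded).
    Checks only that every customer is visited exactly once and that no
    vehicle exceeds its capacity.
    """
    n_customers = len(demand) - 1
    visited = sorted(node for route in routes for node in route)
    if visited != list(range(1, n_customers + 1)):
        return None                        # a customer is missing or duplicated
    for route in routes:
        if sum(demand[node] for node in route) > capacity:
            return None                    # capacity exceeded on this route
    return len(routes)                     # objective value: number of vehicles


demand = [0, 4, 3, 5, 2]                   # made-up demands, depot first
nv = count_vehicles_if_feasible([[1, 2], [3, 4]], demand, capacity=8)
```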

Basic Ant Colony Optimization algorithm
Ants can solve complex problems collectively, such as finding the shortest path between two points in a rugged environment. To do so, they communicate with each other locally and indirectly, through a volatile hormone called pheromone. During its progression, an ant leaves behind a trace of pheromone, which other ants passing nearby detect with the receptors in their antennas and which increases the probability that they choose the same path [19,21]. This collective problem-solving mechanism is at the origin of algorithms based on artificial ants.
The first ant-based algorithm, called Ant System, was proposed by Marco Dorigo in 1992 [22], and its performance was initially illustrated on the Traveling Salesman Problem. Since then, various improvements have been made to the initial algorithm, giving rise to different variants of Ant System, such as ACS (Ant Colony System) and MMAS (MAX-MIN Ant System) [23,24], which obtain competitive results in practice.
Many works on ant colony optimization have been inspired by the MMAS algorithmic scheme. According to the ACO meta-heuristic, at each cycle of the algorithm, each ant builds a solution. These solutions can be improved by applying a local search procedure. The pheromone traces are then updated. Each trace is "evaporated" by multiplying it by a persistence factor ρ between 0 and 1. A certain quantity of pheromone, proportional to the quality of the solution, is then added to the components of the best solutions (the best solutions built during the last cycle or the best solutions built since the start of the execution).
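The update described above can be written compactly as follows. This is a sketch in the usual MMAS style, not a formula taken from this paper: the symbols $\tau_{ij}$ (pheromone on arc $(i,j)$), $s^{\mathrm{best}}$ (the retained best solution) and $f$ (its cost, so that $1/f$ plays the role of "quality") are assumptions introduced for illustration.

```latex
\tau_{ij}(t+1) = \rho \, \tau_{ij}(t) + \Delta\tau_{ij}^{\mathrm{best}}(t),
\qquad
\Delta\tau_{ij}^{\mathrm{best}}(t) =
\begin{cases}
  1 / f(s^{\mathrm{best}}) & \text{if arc } (i,j) \text{ belongs to } s^{\mathrm{best}},\\
  0 & \text{otherwise.}
\end{cases}
```

For example, with $\rho = 0.9$, $\tau_{ij}(t) = 1$ and $f(s^{\mathrm{best}}) = 10$, an arc of the best solution keeps $0.9 + 0.1 = 1.0$, while any other arc drops to $0.9$.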
Among the problems strictly related to the one considered in this paper, the first one to which this method has been applied is the Traveling Salesman Problem (TSP) [25]. Then several other algorithms have been proposed for VRP [26] and VRPTW [27].

Ant Colony Optimization-based Tabu List
The proposed approach enhances the basic Ant Colony Optimization with an additional feature: the Tabu List. The main idea behind the approach is the following: Each time an ant m needs to move to the next city, a random search function is called to select a new city. Then the total traveling time is computed to check whether it is possible to move to that city and come back to the depot.
• If the move is possible then the city is visited, and the Tabu List is reset.
• Otherwise, it is considered a prohibited city and stored in the Tabu List, so that the search function does not select it again at the next call during the same iteration. After multiple unsuccessful tries (the time window constraint is not satisfied), the ant returns to the depot (node 0).
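The selection loop described above can be sketched in Python as follows. This is a simplified illustration, assuming a uniform random choice among the unvisited nodes and a caller-supplied feasibility test; the names `pick_next_node`, `is_feasible` and `max_tries` are hypothetical and do not come from the original algorithm listing.

```python
import random


def pick_next_node(unvisited, is_feasible, max_tries, rng=random):
    """Pick the next city for one ant, using a Tabu List of rejected cities.

    Returns a feasible city, or 0 (the depot) after `max_tries` failed tries
    or when every unvisited city has been rejected.
    """
    tabu = set()                           # Tabu List, empty at each new call
    for _ in range(max_tries):
        candidates = [n for n in unvisited if n not in tabu]
        if not candidates:
            break
        node = rng.choice(candidates)      # random search among allowed cities
        if is_feasible(node):
            return node                    # move accepted; the Tabu List is dropped
        tabu.add(node)                     # prohibited for the remaining tries
    return 0                               # no feasible move: go back to the depot


# Only city 3 is reachable in this toy example.
chosen = pick_next_node([1, 2, 3], lambda n: n == 3, max_tries=3)
```

Because rejected cities are excluded from later tries, three tries are enough here to reach city 3 whatever the random draws are.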
The number of tries is bounded by a parameter that is reset at the beginning of an iteration and each time a selected node can be visited. The approach also relies on two auxiliary variables: a logical row matrix containing 0 for visited nodes and 1 for unvisited nodes, and the indexes of the randomly selected target nodes.
The flowchart of the proposed approach is provided in Figure 2. It can be decomposed into five main steps. The complete algorithm is provided in Algorithm 1.
Step 1: Initialization (Algorithm 1)

Algorithm Explanation
1. Lines 2 to 6: Parameter values are defined in Section 4.2, with D the distance matrix between nodes and N the number of customers, as presented in Section 3.
• ACO is divided into two main phases: the ants' route construction and the pheromone update [28].
2. Before route construction (Algorithm 2), at each iteration, all ants are located at the depot. The demand of each city is known beforehand. All cities are set as unvisited.
Step 2: route construction (Algorithm 2)
3. Lines 6 to 15: At each construction step of the route, each ant $m$ located at node $i$ applies a probabilistic rule to choose the next node $j$ to visit. The choice of moving to a node depends on two values: the heuristic function Heu_F and the level (or rate) of pheromone on the arc $(i, j)$. The moving rule (11) is called the "probabilistic random proportional rule". It gives the probability that ant $m$ moves from node $i$ to node $j$, where $j$ belongs to the set of nodes not yet visited by ant $m$.
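A random proportional rule of this kind is commonly written as a weight $\tau_{ij}^{\alpha} \cdot \eta_{ij}^{\beta}$ normalized over the unvisited nodes. The Python sketch below follows that standard Ant System form as an assumption, since the exact expression (11) is not reproduced here; the exponents `alpha` and `beta` and the function names are hypothetical, and all weights are assumed to be strictly positive.

```python
import random


def transition_probabilities(i, unvisited, tau, heu_f, alpha=1.0, beta=2.0):
    """Probability of moving from node i to each unvisited node j.

    tau[i][j] is the pheromone level and heu_f[i][j] the heuristic value
    (for instance an inverse distance); both are assumed positive.
    """
    weights = {j: (tau[i][j] ** alpha) * (heu_f[i][j] ** beta) for j in unvisited}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def choose_next(i, unvisited, tau, heu_f, rng=random, alpha=1.0, beta=2.0):
    """Draw the next node according to the probabilities above."""
    probs = transition_probabilities(i, unvisited, tau, heu_f, alpha, beta)
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[j] for j in nodes], k=1)[0]


tau = [[1.0, 1.0, 1.0] for _ in range(3)]
heu_f = [[0.0, 1.0, 3.0], [1.0, 0.0, 1.0], [3.0, 1.0, 0.0]]
probs = transition_probabilities(0, [1, 2], tau, heu_f, alpha=1.0, beta=1.0)
```

With equal pheromone levels, node 2 is three times more likely than node 1 in this example because its heuristic value is three times larger.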

Lines 16 to 19: When a city $j$ is chosen according to the moving rule, some computations are performed: the travel time of the ant from node $i$ to the randomly chosen node $j$, the waiting and service times, and the travel time of ant $m$ from node $j$ to the depot. In case the new vehicle load equals zero, the vehicle returns to the depot.
9. Lines 13 to 15: If the demand of a node is higher than the current vehicle load, this node is partially served, and the vehicle returns to the depot.
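The time computations listed above amount to a feasibility test for a single move. A minimal Python sketch of such a test is given below, under the assumption that a move is accepted when service can start within the time window of the candidate and the vehicle can still return to the depot before the end of the schedule horizon; `can_visit` and its arguments are illustrative names, and the load handling is not modeled.

```python
def can_visit(current, candidate, time_now, travel_time,
              earliest, latest, service, depot=0):
    """Check whether moving from `current` to `candidate` is time-feasible.

    `time_now` is the departure time from `current`. The move is accepted if
    the vehicle arrives before the window of `candidate` closes and can be
    back at the depot before the depot's latest return time.
    """
    arrival = time_now + travel_time[current][candidate]
    if arrival > latest[candidate]:
        return False
    start = max(arrival, earliest[candidate])          # includes waiting time
    back = start + service[candidate] + travel_time[candidate][depot]
    return back <= latest[depot]


# Made-up data: depot 0 and two customers.
travel_time = [[0, 5, 8], [5, 0, 4], [8, 4, 0]]
earliest = [0, 10, 12]
latest = [30, 20, 30]
service = [0, 3, 2]
ok_first = can_visit(0, 1, 0, travel_time, earliest, latest, service)
ok_second = can_visit(1, 2, 13, travel_time, earliest, latest, service)
```

A test of this kind could serve as the feasibility predicate that decides whether a selected city is visited or placed in the Tabu List.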

This phase is repeated for every ant at every iteration, with the conditions that each ant $m$ represents a solution, each solution encompasses $k$ tours, each tour starts and ends at the depot, and each node is visited only once, in accordance with the time window constraints.
Step 4: minimal vehicle number (Algorithms 1 and 4)
10. Lines 7 to 9 (Algorithm 1): At each iteration, once all the ants have built their solutions, the number of vehicles of each solution is determined using Algorithm 4, and the result is saved in the set NV.
11. Lines 10 to 23 (Algorithm 1): At the first iteration, the minimal value of NV is determined and saved in BestNV. At the following iterations, the minimal NV value of the current solutions is compared to BestNV. If it is smaller than BestNV, the latter is updated accordingly.
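The bookkeeping of Step 4 can be sketched in Python as follows. The sketch assumes that each solution is stored as a flat node sequence in which the depot (node 0) separates consecutive tours; this encoding and the function names are assumptions for illustration and may differ from Algorithm 4.

```python
def count_vehicles(tour, depot=0):
    """Count the tours in a flat sequence such as [0, 1, 2, 0, 3, 0].

    Each departure from the depot towards a customer starts a new vehicle.
    """
    return sum(1 for a, b in zip(tour, tour[1:]) if a == depot and b != depot)


def best_nv_over_iterations(solutions_per_iteration):
    """Track the smallest number of vehicles found across all iterations."""
    best_nv = None
    for solutions in solutions_per_iteration:
        nv = min(count_vehicles(s) for s in solutions)   # best ant of this iteration
        if best_nv is None or nv < best_nv:
            best_nv = nv                                 # update BestNV
    return best_nv


iterations = [
    [[0, 1, 0, 2, 0, 3, 0], [0, 1, 2, 0, 3, 0]],   # iteration 1: 3 and 2 vehicles
    [[0, 1, 2, 3, 0]],                             # iteration 2: 1 vehicle
]
best = best_nv_over_iterations(iterations)
```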
Step 5: pheromone update (Algorithms 1 and 5)
12. Line 24 (Algorithm 1): At each iteration, after each solution has been constructed and the minimal NV value has been found, each ant deposits a quantity of pheromone on its path, depending on delta, which is computed using Algorithm 5. If the edge $(i, j)$ is included in the route of ant $m$, the quantity of pheromone deposited on this path is given by (12).
13. At the next iteration $t + 1$, the quantity of pheromone on the route of each ant after evaporation is given by (13).
To discard the bad solutions obtained, and thus avoid convergence towards a local optimum, the evaporation of the pheromone trails is simulated through a parameter called the evaporation rate $\rho$ ($0 < \rho < 1$).
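A deposit-then-evaporate step of this kind is often implemented as $\tau_{ij} \leftarrow (1 - \rho)\,\tau_{ij} + \sum_m \Delta\tau_{ij}^{m}$. The Python sketch below uses that standard Ant System form as an assumption, because expressions (12) and (13) and Algorithm 5 are not reproduced here; the per-ant `deltas` are passed in as plain numbers and the function name is hypothetical.

```python
def deposit_and_evaporate(tau, ant_routes, deltas, rho=0.1):
    """Evaporate all trails, then let each ant deposit on the edges it used.

    tau is a symmetric matrix (list of lists); each route is a node sequence;
    deltas[m] is the quantity deposited by ant m on each distinct edge.
    """
    n = len(tau)
    for i in range(n):
        for j in range(n):
            tau[i][j] *= (1.0 - rho)               # evaporation
    for route, delta in zip(ant_routes, deltas):
        edges = {frozenset(e) for e in zip(route, route[1:])}
        for edge in edges:
            if len(edge) != 2:                     # ignore degenerate (i, i) steps
                continue
            i, j = tuple(edge)
            tau[i][j] += delta                     # undirected graph: update
            tau[j][i] += delta                     # both directions
    return tau


tau = [[1.0] * 3 for _ in range(3)]
tau = deposit_and_evaporate(tau, [[0, 1, 0], [0, 2, 0]], [0.25, 0.5], rho=0.5)
```

In this toy run, the unused edge between nodes 1 and 2 keeps only the evaporated value 0.5, whereas the edges used by the two ants rise to 0.75 and 1.0.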

Parameters and Datasets
The algorithm has been implemented in MATLAB 2018 and tested on the well-known Solomon benchmark, composed of six datasets (C1, C2, R1, R2, RC1, RC2) [29]. Each dataset contains 8 to 12 instances of 100 customers with their respective locations and demands. Customers are grouped into clusters in C1 and C2, while they are randomly distributed geographically in R1 and R2. RC1 and RC2 combine both distributions.
The results of the ACOTL algorithm are compared with those of Dynamic Programming (DP) and with the best-known results from the literature. Table 1 presents the important information on the datasets used in this study, and Table 2 shows the parameter settings.

Results and discussions
The comparison of results is performed in two phases. We first compare the performance of ACOTL with that of DP on small datasets. We then compare the results of ACOTL with the best-known results.

Comparison of performance ACOTL vs DP
In this phase, the performance of ACOTL on instances of size 15 is compared with that of DP on the same reduced instances.
DP is run once, while ACOTL is run twenty times per instance. Table 3 shows the results of ACOTL and DP on five of the Solomon datasets. Both algorithms provide the same number of vehicles, except on the 201 instance, where ACOTL provides the smallest number of vehicles. This result shows the efficiency of ACOTL and indicates that DP cannot optimally solve some problems. In fact, it has been proven that DP cannot optimally solve the longest path problem, and in this paper we assume that determining the longest path of each vehicle can reduce the number of vehicles. This explains why DP cannot always solve the problem to optimality.

Comparison of performance ACOTL vs Best-known results
In this phase, the performance of ACOTL on instances of size 100 is compared with the best-known results [30].
ACOTL is run independently 30 times on each instance of the datasets. For the rest of the instances, there is a difference of one vehicle.
Table 5 reports a type-2 dataset on which ACOTL matches the best-known result only for instance 208; on the other instances of that dataset, it uses more vehicles than the best-known results. Table 6 compares the results of ACOTL with the best-known results on another type-2 dataset, where ACOTL uses the same number of vehicles as the best-known results on most of the instances.
Best-known results (table row): 3, 3, 3, 3, 3, 3, 3, 3
Best-known results (table row): 14, 12, 11, 10, 13, 11, 11, 10
Table 7 shows that ACOTL outperforms the previous best-known results on all the instances of dataset RC1. In fact, ACOTL can reduce the number of vehicles by up to three units compared with the best-known result.
Finally, Table 8 compares the results of ACOTL with the best-known results on a third type-2 dataset. ACOTL provides the same number of vehicles as the best-known results on three instances and improves the known result for instance 208. However, its number of vehicles is greater on half of the instances. Figure 3 summarizes the best results of ACOTL. According to this figure, the proposed approach is suitable for scenarios in which customers are grouped in randomly distributed geographical clusters and vehicles have low or medium loads.

Conclusion
This paper has tackled the Vehicle Routing Problem with Time Window constraints from an economic point of view, in which the CAPEX (in terms of number of vehicles) should be minimized. We proposed a new meta-heuristic called Ant Colony Optimization-based Tabu List to minimize this number of vehicles. Experimental results showed that the proposed approach can reduce the number of vehicles by up to three in some instances of the Solomon benchmark, especially in the RC1 dataset. In general, ACOTL performs well in scenarios where customers are grouped in randomly distributed geographical clusters, especially with vehicles bearing low or medium loads.
However, the approach still needs improvement for some instances in the C and R datasets. A possible enhancement is the introduction of some operators in the combination of ACO and the Tabu List, to extend the number of cities an ant can visit. For instance, a swap operator can be used to replace a city in the current solution with one or several cities in the Tabu List. However, this approach will require a customization of the inner features of the ACO approach.