Application of the multi-objective cross-entropy method to the vehicle routing problem with soft time windows

The vehicle routing problem with time windows is a widely studied problem with many realworld applications. The problem considered here entails the construction of routes that a number of identical vehicles travel to service different nodes within a certain time window. New benchmark problems with multi-objective features were recently suggested in the literature and the multi-objective optimisation cross-entropy method is applied to these problems to investigate the feasibility of the method and to determine and propose reference solutions for the benchmark problems. The application of the cross-entropy method to the multiobjective vehicle routing problem with soft time windows is investigated. The objectives that are evaluated include the minimisation of the total distance travelled, the number of vehicles and/or routes, the total waiting time and delay time of the vehicles and the makespan of a route.


Introduction
The vehicle routing problem (VRP) remains one of the most studied problems in the field of Operations Research. It has many real-world applications but exact methods require a considerable amount of computational time. In this paper, we consider the multiobjective optimisation (MOO) version of the vehicle routing problem with time windows (VRPTW), and focus on the VRP with soft time windows variant (VRPSTW). In this problem type, a number of vehicles have to provide a service to customers at different locations while adhering to constraints with regard to the capacity of the vehicle and the time window in which service should start. Although the problem has been considered as a multi-objective problem by a number of authors, the focus has primarily been on minimising the number of vehicles and the total travel distances. This paper considers these and other pairs of conflicting objectives. Our research aims were twofold: 1) we show that the multi-objective cross-entropy method (MOO CEM) can be applied to the VRPTW and 2) we provide reference solutions to a new set of benchmark problems recently developed by Castro-Gutierrez et al. (2011). Since the benchmark set is new, and we could not find reference solutions at the time of writing, we claim that the solution sets presented in this paper serve as a first reference set for OR practitioners.
We assume the reader is acquainted with MOO; more detail can be found in Coello Coello (2006). MOO almost always returns a set of two or more good solutions as opposed to single-objective optimisation, in which the optimum is a single solution. The decision maker still has to choose a specific solution from a set of MOO solutions. This set is often referred to as the "Pareto optimal set".
The paper is structured into the following sections: it starts with a brief overview of the VRPTW and its formulation followed in the next two sections by an overview of the field of MOO in vehicle routing, and the nature of the CEM and its application to combinatorial optimisation. The formulation of the multi-objective problem model and the basic structure of the algorithm are then explained, followed by the presentation of the results.

The Vehicle Routing Problem with Time Windows
In logistics and distribution, decision makers are often faced with the problem of developing optimal routes for vehicles that service different customers. The vehicle routing problem is considered to be a variation of the travelling salesperson problem where one salesperson has to visit a certain number of cities before returning to the home city. Also termed the "truck dispatching problem", Dantzig & Ramser (1959) considered it a generalisation of the travelling salesperson problem. In the vehicle routing problem, a number of vehicles need to be routed to geographically dispersed nodes or customers. In addition, vehicles have limited capacity, which places a restriction on the number of nodes that one vehicle can visit. The vehicles also perform a service at the different nodes. The VRP has evolved into different subproblems, for example the VRP with stochastic demand and the VRPTW, which are more realistic representations of realworld problems. The VRPTW has many applications, such as the routing of buses and trains, bank deliveries and postal deliveries.
The problem under consideration in this study is the VRPTW. In this problem, the time in which a vehicle may arrive to begin service at a certain node is limited to a certain time window. The VRP with soft time windows further implies that vehicles can arrive after the time window has closed, although this is often associated with a penalty cost for late arrival.
In the VRPTW, a set of vehicles with limited capacity is to be routed from a central depot to a set of geographically dispersed customers with known demands and predefined time windows (Tan et al., 2006). Toth & Vigo (2002) summarise the concept of the VRPTW, as applicable to this article, as follows: 1. Each route visits the depot vertex.
2. Each customer vertex is visited by exactly one route (within the specified time window).
3. The total demand of the customers visited by a route does not exceed the capacity of the separate vehicles.
The addition of time windows increases the complexity and computational intensity of the problem. The VRP is classified as an NP-hard problem and consequently the VRPTW is a constrained NP-hard problem. Taillard et al. (1997) claim that in the relaxation of soft time windows feasible solutions are easier to find as there are fewer hard constraints, but further state that this is countered by the way hard time windows in turn allow for infeasible solutions to be filtered out fairly quickly. Kallehauge et al. (2005) define the VRPTW in mathematical terms with a fleet of vehicles, V, a set of customers C and a directed graph G. N is the set of vertices, 0, 1, ..., n+1 with 0 and n + 1 representing the depot (respectively the starting and returning depot). Define x ijk as Further definitions are the capacity of each vehicle (C k ), the demand of each customer i (D i ), the cost (or distance) c ij and time (t ij ) associated with each arc (i, j) where i = j. The time window [a i , b i ] is associated with each customer. In the case of hard time windows the vehicle must arrive at the customer before b i and in the case of soft time windows a delay time is logged. If a vehicle arrives before the time window starts, it incurs a waiting time until a i when the service can start. The variable s ik denotes the time at which vehicle k starts service at customer i. This is defined for every vehicle k and customer i, but becomes irrelevant if vehicle k does not service customer i. It is assumed that the time window of the depot always starts at zero and no service is required, i.e., s 0k = 0, and the time back at the depot (although no service is required) is defined as s (n+1)k .
We adapted the mathematical model of Kallehauge et al. (2005) for the multi-objective VRP with soft time windows as shown in (2a) to (2e).
The optimisation model, using (1), is Five conflicting objectives are defined in (2a) to (2e) and optimised in pairs. The five objectives are shown in Table 1 with the labels defined by Castro-Gutierrez et al. (2011). The model differs from that of Kallehauge et al. (2005) in that the adaptation for soft time windows in (3g) calculates the total time a vehicle waits for a time window to start on a route (t k w ), and (3h) calculates the total delay time of customers on a route waiting for vehicles that arrive after the close of a time window, denoted by t k d . The other constraints follow the original model. Constraint (3a) ensures each customer is visited exactly once, (3b) is the capacity constraint and (3i) the integrality constraint. Constraints (3c), (3d) and (3e) ensure that each vehicle leaves the depot, arrives at a customer and then proceeds to the next customer, and that all vehicles end at the depot. We excluded sub-tour elimination constraints for brevity. This model with the soft time windows will be used as platform for further analysis.

Multi-objective Optimisation and the VRP
In this section, we mention the work of a few researchers in the field of MOO and the VRP, followed by a discussion of benchmark problems in the research field.
The VRP is generally viewed as having the single objective of minimising the distance travelled by the different vehicles. However, over the past few years the problem has been considered as being multi-objective, with objectives including the distance travelled, average lateness and the number of vehicles. Jozefowiez et al. (2008) review the field of multi-objective vehicle routing problems. They identify three uses of multi-objective vehicle routing problems: an extension of classic academic problems in order to improve their practical applications, general classic problems and the study of real-life cases where the decision maker identified the objectives. Jozefowiez et al. (2008) further group the methods used to solve the multi-objective problems into scalar methods, Pareto methods and a third category that considers different objectives separately. While it is possible to generate a weighted cost function with regard to two of the objectives in order to use scalar methods, this would generally result in a bias towards one of the objectives. The field of MOO has been developed to display the trade-off between objectives in an objective way. When considering the variation of the VRP with time windows, it appears that in the field of Pareto methods, evolutionary algorithms are used in most cases. Tan et al. (2006) propose a hybrid multi-objective evolutionary algorithm and Ombuki et al. (2006) use a multi-objective genetic algorithm. Geiger (2008) states that the relaxation of the time window restriction (VRPSTW) allows for a more practical multi-objective formulation and investigates the influence of this relaxation and other problem characteristics on genetic operators in evolutionary algorithms. Recently Garcia-Najera & Bullinaria (2011) proposed an improved multi-objective evolutionary algorithm which uses a similarity measurement to enhance the diversity of solutions. When compared to general evolutionary methods such as the popular NSGA-II (Deb, 2001) this method shows improvements in particular in preserving high diversity before settling on a solution. Garcia-Najera & Bullinaria (2011) further demonstrated that 26 instances of Solomon's benchmark set exhibit conflicting objectives, and emphasized the use of a Pareto front.
Turning to benchmark problems, we found that, in addition to providing algorithms for solving the VRPTW, Solomon (1987) also developed six sets of benchmark problems that have since been used in comparing different methods. Although the Solomon benchmark set has been extended and used in most literature on multi-objective vehicle routing, Castro-Gutierrez et al. (2011) found that classic test instances such as these problems developed by Solomon are not entirely suitable for MOO. The objectives used by Garcia-Najera & Bullinaria (2011) (the number of vehicles and the total travel distance) were in fact found to be in harmony for most of Solomon's problems. The need for specific multi-objective test cases as opposed to the extension of traditional single-objective cases came to light and a set of problem instances were generated by Castro-Gutierrez et al. (2011) to address this need. Initial experiments with the evolutionary algorithm NSGA-II (Deb, 2001) showed evidence of multi-objective features such as a correlation value close to −1 or 1 for a given pair of objective values. In the case of minimisation, a correlation value close to −1 indicates a pair of objectives with a conflicting nature, i.e., the minimisation of one objective leads to an increase in the other objective. We studied the newly proposed MOO test set and thus provide some reference solution sets for researchers wishing to do further work in MOO of the VRPSTW. The MOO CEM was used, as it was recently proposed (Bekker & Aldrich, 2010). Castro-Gutierrez et al. (2011) identified five objectives to be used in vehicle routing (see Table 1): the number of vehicles (Z1), the total travel distance (Z2), the makespan (travel time of longest route) (Z3), the total waiting time when vehicles arrive before the time window (Z4) and the total delay time when vehicles arrive after the time window (Z5).

Combinatorial Optimisation and the Cross-entropy Method
Combinatorial optimisation refers to a problem where the decision maker seeks the combination of integer variable values that will optimise the objective function. Due to the combinatorial nature of the VRP, it is clear that the computational complexity of the problem increases as the number of vertices increases. The CEM is presented as being able to find a good local minimum for such a combinatorial optimisation problem. Rubinstein & Kroese (2004) devotes a chapter to the travelling salesperson problem (TSP). De Boer et al. (2005) also use the TSP as an example when illustrating the application of importance sampling to combinatorial optimisation. When comparing the CEM to other algorithms for combinatorial optimisation (simulated annealing, nested partitioning, tabu search, genetic algorithms), Rubinstein (1999) states that the CEM employs a global rather than a local search procedure. The interested reader is referred to Rubinstein & Kroese (2004) and Bekker & Aldrich (2010) for the mathematics supporting the CEM. A few basic concepts from the discrete case are repeated here for convenience (Rubinstein & Kroese, 2004).
Let X ∈ X be a random vector with probability mass function f (·, u) and distribution parameter vector u. Suppose a problem has a performance function S(x) with x ∈ X then the optimisation problem is maximise S(x) over x ∈ X (4) while its estimation formulation is Suppose the maximum of S over X is γ * , then The CEM requires that an estimation problem be associated with the optimisation problem of (6), and a collection of indicator functions , v ∈ V} be a family of probability mass functions (pmfs) on X that are parameterised by a real-value vector v.
To solve the problem associated with (6), assume u ∈ V and estimate the probability with f (x; u) being the probability mass function (pmf) on X and γ some chosen level. Suppose now γ is equal to γ * , then l = f (x * ; u), which is a very small probability, and with this is associated a rare event. The probability can be estimated with the variance reduction technique of Importance Sampling (Rubinstein & Kroese, 2004) by taking a random sample X 1 , . . . , X N from a different pmf g and estimate l viâ which is the unbiased importance sampling estimator of l. The distribution g(·) is introduced to make the probability of event l occurring "less rare". The optimal way to estimate l is to use the change of measure with a different pmf Since this optimal pmf is generally difficult to obtain and depends on the unknown l, one approximates it by choosing g such that the cross-entropy or Kullback-Leibler distance between g and g * is minimal. The Kullback-Leibler distance between two pmfs g and h is defined as Since I {S(x)≥γ} is non-negative, and the pmf f (of X) is parameterised by a finite di- Kroese, 2010). To estimate l with (8), v is chosen such that D(g * , f (·;ṽ)) is minimal.
For discrete random vectors X the components ofṽ will always be of the form (Alon et al., 2005) with A ⊂ B ⊂ X . This number can be estimated by taking a random sample X 1 , . . . , X N from the pmf f (·, v) and evaluate In the case of combinatorial optimisation problems such as the TSP and VRP, the probability distribution used in the importance sampling step is supported by a transition probability matrix. A matrix P is generated so that the probability of going from city i to city j is represented as p ij . The method starts with equal probabilities in the matrix, which are updated according to the routes in the current best solution set. This is defined by De Boer et al. (2005), in the case of the travelling salesperson problem, as The estimated value for γ is obtained by creating a random sample of possible tours with evaluated objective function values. These values are then ranked to estimate a sample upper quantile, for exampleγ = S N (X i ) and typically, 0.70 ≤ ≤ 0.95. The "target event" to be estimated by the method is denoted by S(X l ) ≤ γ in the case of minimisation (De Boer et al., 2005). In (14), the probability of visiting city j (the jth success) is updated by counting the solutions in the current set of good solutions (X l ≤ γ) with X l ∈ X ij , where X ij is the set of matrices where the transition from i to j is made (x ij = 1), in all the tours (l) generated as a population of size N in an iteration.
In (14), only routes with a performance value less than γ are considered. Rubinstein & Kroese (2004) recommend that the updated matrix P be smoothed to help prevent premature convergence, using The CE algorithm for optimisation of the VRP is based on Rubinstein & Kroese (2004), and is as follows.
1. Initialise the transition matrix P 0 . Typically, off-diagonal elements are assigned the value 1/(n + 1) for n vertices. Initialise the iteration counter t ← 1.
2. Generate a sample of tours by some method using P t−1 .
3. Compute the estimationγ t = S N (X i ) from the sample performances.
4. Update P t using (14) and the results of the previous step.
6. If γ t has converged, stop, otherwise t ← t = 1 and go to Step 2.
Bekker & Aldrich (2010) adapted the CEM for MOO and tests on benchmark problems showed satisfactory results as approximations of the true Pareto sets were obtained with a relatively low number of simulations. The method was applied to deterministic, continuous problems and discrete, stochastic problems. In this study, the applicability of the method on discrete combinatorial problems is considered by maintaining the core mechanism of the CEM. The multi-objective model formulation is explained in Section 5.
The expression in (14) is applicable to the singe-objective optimisation as applied by Rubinstein & Kroese (2004) and De Boer et al. (2005) with the set of best solutions (the elite set) defined as instances with a objective function value (S(X l )) smaller than γ. The ranking of the elite set in the case of MOO as proposed by Bekker & Aldrich (2010) and the subsequent construction of the probability matrix as used in this study are explained in Section 5 with the formulation of the optimisation model. Ma (2011) succesfully applied the CEM for single-objective optimisation to the VRPTW. A multi-agent environment was introduced where a vehicle-specific transition matrix is associated with each node of the network that is used to construct a feasible route for every vehicle (the agents in this case). As explained previously, it is this transition matrix that is constructed using a random mechanism. It is then updated according to the performance of the routes travelled by the vehicles, in effect increasing the probability of estimating the rare event of an optimum solution. Our proposed method employs the same principle, but uses a general transition matrix to construct routes for all the vehicles.
Ma (2011) further proposed a local search procedure on a subset of good solutions to avoid premature convergence by the CEM. The results on classic MOO test-instances by Bekker & Aldrich (2010) suggest that this is not necessary, but further research can investigate the influence of a local search procedure on the proposed multi-objective method in vehicle routing.
Chepuri & Homem-de-Mello (2005) applied the CEM with Monte Carlo sampling to the VRP with stochastic demand, using one vehicle. Generally, in this type of problem, there is variation in the set of customers visited, the demands and the travel times. Their application does not completely depend on specific problem formulations and can be extended to using multiple vehicles.
The applications of the CEM to the VRP are all based on single-objective optimisation, and we present a MOO application using the CEM next.

Model Formulation
The VRPSTW consists of a network of customers at different locations indicated by coordinates, distances and travel times between customers and a central depot from which a homogenous fleet of vehicles depart. The solution consists of a list of the routes travelled by the different vehicles, usually illustrated by the order in which vertices are visited. The problem was modelled with the following data structures: customer detail, distance and travel time between pairs of customers, and the proposed routes (candidate solutions).
The pseudo-code for the MOO CEM for solving the VRPSTW is shown in Algorithm 1, and the structure of the optimisation model is shown in Figure 1. An initial probability matrix (P ) with equal probabilities is defined to start the process. The probabilities are then used to construct routes in order to obtain a set of N solutions, with q the identifier (index) of the particular set of routes. This construction of routes and normalisation of the probability matrix is summarised in Algorithm 2. The solutions obtained in this manner are then evaluated and values are assigned to the pair of objectives being studied (in Figure 1, Z 1 and Z 2 are used). We ranked the solutions using the algorithm of Goldberg (1989), and the ranking value is stored in ρ q . The degree of domination of one vector relative to the other vectors in the solution set is indicated by this value. All vectors with ρ q ≤ t h after ranking of an iteration are weakly dominated and form part of the elite set of that iteration. The value of t h is a preset threshold value, typically 0 ≤ t h ≤ 2. If t h > 0, then weakly and non-dominated vectors are returned in the elite set. If t h = 0, then only non-dominated vectors are added to the elite set. This elite set of solutions is used to obtain the updated probability matrix as defined by (14) and explained in Section 4. The expression in (14) is adapted to allow for the multi-objective case shown in (15), where indicator values for every solution (q) with a number of routes (k) are summed in the population (N ) as opposed to the TSP (see (14)) where a solution consisted of one tour. This process is iterated until the stopping criterion is met.
The expression in (15) can be explained with reference to Figure 1. Suppose the solutions with indices 1, 2 and N end up as the three solutions of the elite. The denominator will be equal to 3, while the various x ij must be considered in the numerator. It can be seen that x 09 occurs in all three solutions, so p 09 = 3/3 = 1, but p 02 occurs in two of the three solutions, so p 02 = 2/3.
The algorithm applied to the vehicle routing problem with time windows is illustrated in Algorithm 1, and is based on Bekker & Aldrich (2010).
For the purpose of Algorithm 1 it is important to note that a rank of 0 denotes nondomination and accordingly the best solutions of the current set. The loop of steps 6 Probability Matrix Solutions Objectives j q = 1 i p 00 p 01 · · · p 0n p 10 p 11 · · · p 1n . . . . . . . . . . . .  Construct routes using Algorithm 2 and P t 9: Evaluate routes 10: Rank the solutions using the threshold t h = 2

11:
Update P t using (15) 12: Rank with t h = 1 15: L c ← L c + 1 16: until L c > N m 17: Rank with t h = 0 18: return elite set to 13 retains solutions with a rank of 2 to maintain solution diversity and to prohibit premature convergence. The non-dominated solutions are only isolated at the end of the algorithm, in step 17.
In principle, routes are constructed with Markov-chain transition probabilities but in reality the VRP is highly constrained and routes are dependent on the feasibility of said transition. Algorithm 2 ensures that the transition to non-feasible vertices is not possible by setting the transition probability to 0 and normalising the remaining probabilities. The matrix with transition probabilities is updated according to feasible vertices to be visited. From the array of feasible cities the next city to be visited is visited according to the row of P .

Methods and Results
The algorithm for the CEM was coded in Matlab and applied to the different benchmark instances of Castro-Gutierrez et al.

Benchmark problems
Castro-Gutierrez et al. (2011) generated a benchmark problem set that can be used for the multi-objective VRPSTW. The time windows and the demands of the customers are characterised in a certain way. The problems are characterised with the number of customers, the different time window profiles (0-4, 4 being the tightest) and capacity and/or demand constraints (0-2, 2 being the tightest). Each problem has a specific label, for example, "50 d0 tw1" denotes the benchmark problem with 50 customers, a demand/capacity profile number 0 and time window profile 1. Six pairs of objectives (from the five defined in Table 1)

Performance measures
The hypervolume comparison method yields a recognised unary indicator used in comparing two different Pareto sets in order to assess the difference in quality of two algorithms. A unary indicator in this context is a function that returns a single, real value from the set R. The hypervolume indicator (I H ) is also the only unary indicator capable of detecting that a set of solutions is not worse than another (Zitzler et al., 2003;Raad et al., 2011). This indicator is used to isolate the best result of 10 pseudo-independent tests of the MOO CEM for the VRP. To show the general performance of the method, the highest, lowest and average values for I H are documented for the 10 runs. Since we deal with pairs of objectives in this study, we shall further refer to the hyperarea indicator instead of the hypervolume indicator.
The indicator values of all problem sets on which the algorithm was tested are shown in Table 2. It is important to note that due to the difference in the order of the objectives, the hyperarea is a relational indicator, i.e., a high value in one column is incomparable to the seemingly lower value in another column. It is also interesting to note that in some cases (especially in the Z1 and Z5 pairing) the hyperarea indicator is equal to 0 for all the documented indicator values of particular problem cases. This shows that the algorithm performed the same for all 10 runs as there was no difference in the hyperarea of the final approximate Pareto fronts. This is applicable to discrete cases such as the number of vehicles (Z1) and in the case of Z4 and Z5 when an optimal solution (such as delay time = 0 or waiting time = 0) is found in all 10 iterations. In these cases the movement of the second objective is irrelevant as the hyperarea indicator depends on an area, and here the extreme point of this objective forms a line with the secondary objective, thus resulting in an indicator value of 0. Variances higher than 0.5 are indicative of answers that did not exhibit a final multi-objective front (i.e., one optimal point) but the "best" iteration provided a solution better than the other iterations, resulting in a large variation in the hyperarea indicator value.

Parameter setting
One objective of the research is to determine if the MOO CEM can in general be used for the VRPSTW, so the fine-tuning of parameters is not investigated in detail. However, before tests were performed on the benchmark problems, experiments were conducted to get an indication of good parameter values. The main parameters that influence results include the population size N , the smoothing parameter value of α, and lastly the maximum allowed number of iterations (τ and N m ), as explained in Algorithm 1.
A number of experimental tests were conducted and the averages of the I H indicator of five runs were computed from a common reference. In Figure 2 the population size was set at N = 2000 and the average I H of five runs at different values of α showed α = 0.9 performed the best. In Figure 3, the value of α = 0.8 was set while different values of N were tested. The difference in the indicator values for N = 2000 and N = 2500 was deemed not large enough to warrant the increase in computational time associated  best  2640  0  713970  3056874  66567600  284400  average  2484  0  630510  2398305  59207760  145440  worst  2220  0  551298  1336314  44132400  0  50 d2 tw3  best  420  87360  490368  57652992  7732800  25596000  average  342  87096  458721  49416319  5318280  15331680  worst  240  86040  376998  51309015  2350800  0  50 d2 tw4  best  2400  0  963108  2949780  82825200  14742000  average  2226  0  900396  1458745  50275080  5660280  worst  2100  0  821442  600840  29865600  0  250 d2 tw1  best  600  1033920   with N = 2500. Following these experiments and due to time considerations, tests were conducted with parameters set to N = 2000, α = 0.9, τ = 25 and N m = 10.

Results
Tests were performed on all cases of the 50 customer problems with conflicting objectives, but we discuss only one problem instance in detail in this section, namely 50 d1 tw4. This problem instance is chosen as a fair representation of results found across the different demand and time windows, and serves as a good platform to discuss the performance of the algorithm and the findings of the study.
The results are shown in subsequent figures, as follows: for each of the conflicting pairs of objectives, a figure showing the progression of the approximation front through the iterations (or loops) as explained in Algorithm 1 is presented (see e.g. Figure 4). This serves as indicator of the worth of the CEM, as results improve from the completely random construction of routes in iteration 1 to the final approximation front formed in iteration 10. The adjacent figure shows the final approximation front in the objective domain as presented to the decision maker (e.g. Figure 5). Lastly, a table showing the routes for the set of vehicles (V k ) of a particular solution point from the final approximation front, is presented. The tables (Table 3 to Table 8) list the reference numbers of the customers in the order that they were visited by each vehicle. These numbers follow from Castro-Gutierrez et al. (2011). In Table 3, for example, it is shown that six vehicles were used, and vehicle number 1 (V 1) visited eight customers. Also, there are exactly 50 non-zero labels in the

Discussion of results
For all six cases the progression of the approximation front is evident and the worth of the CEM of estimating good solutions is illustrated. In the case of the discrete objective Z1, the number of vehicles, Figures 5 and 7 show approximate Pareto fronts that are limited in their multi-objective nature and exhibit seemingly redundant points at the lowest number of vehicles. As this is only evident in the case of the discrete objective, it can be deduced that the algorithm found a number of solutions for Z3 for the same value of Z1, and cannot justify fewer or more values for Z1, as no points between discrete values can be obtained. In Figure 5 for example, one can see that using six vehicles in this problem yield a number of makespan values, of which 27 660 time units is the least    (Point A). The makespan can be reduced to 27 480 time units if the decision maker is willing to buy seven vehicles. In Figure 7, the best solution is shown by Point B, where the delay time is zero when using five vehicles.
Although Figure 15 exhibits a single-point solution (which is strictly not Pareto anymore), Figure 14 illustrates that the non-optimal solutions are inherently multi-objective. The progression of the front to the absolute optimal (the best combination for Z4 and Z5 is 0) is good, and the multi-objective nature of the objective pair is not refuted.
It is also valuable to note that in Figure 10 and to a lesser extent Figures 8, 12 and 14, the spread of the front becomes smaller as the algorithm progresses, and tends to converge to a limited region. This supports the finding in Section 6.3 that the number of 1,000 1,500 2,000 2,500  In the 250-customer problem, the set of problems with the tightest capacity profile ("250 d2 tw1") was computed to investigate the feasibility of the CEM for multi-objective vehicle routing on larger scale problems. The method proved to be able to solve these problems and results showed similiar characteristics to that of the 50-customer problems. The biggest difference can be seen in Figure 16 and Figure 17: in the 250 d2 problem objective Z3 exhibited the same behaviour as that of the discrete objective Z1,    the number of vehicles. This is due to the nature of the problem and time windows, limiting the makespan of the longest route to a fixed number of possibilities. Figures  18 and 19 serve as an example of the Z4 and Z5 pairing, exhibiting a multi-objective approximation front as also found in the other non-documented 50-customer problems.
MOO is generally descriptive and not prescriptive, and the decision maker still has to choose a solution for implementation from the approximation set. In the results set presented it is in some cases easy to choose a solution (Figures 7 and 15), but in other cases business constraints will be required to support a choice. In Figure 7, six or seven vehicles can be bought; if finances are limited, then six vehicles are indicated, but the makespan is penalised. In Figure 11, suppose the business wants to limit travel distance

Summary and Conclusions
The first aim of the research was to assess the possible application of the CEM for MOO to the multi-objective vehicle routing problem with soft time windows. It was found that the MOO CEM adequately obtains an approximation set that progresses towards the Pareto front of solutions for conflicting objectives of a given problem. Some of the benchmark problems converged to a final solution consisting of a single point in the objective space even though it initially consisted of a front of solutions.
The speed performance of the MOO CEM itself proved good but the route-construction algorithm (as explained in Section 2) is a definite bottleneck. Future work should look at streamlining the construction of routes with a transition probability matrix as required by the CEM. We also provide references sets of solutions for MOO of VRPs with soft time windows that OR practitioners can use and improve upon.
Our future work is to assess the applicability of other metaheuristics to the Castro-Gutierrez et al. (2011) problem set, and to compare the performance of the metaheuristics relative to another.