Heuristic approach to the passive optical network with fibre duct sharing planning problem SP

Similar to the constrained facility location problem, the passive optical network (PON) planning problem necessitates the search for a subset of deployed facilities (splitters) and their allocated demand points (optical network units) to minimise the overall deployment cost. A mixed integer linear programming formulation stemming from network flow optimisation is used to construct a heuristic based on limiting the total number of interconnecting paths when implementing fibre duct sharing. A disintegration heuristic is proposed based on the output of a centroid, density-based and a hybrid clustering algorithm to reduce the time complexity while ensuring close to optimal results. The proposed heuristics are then evaluated using a large real-world dataset, showing favourable performance.


Introduction
In accordance with the current exponential growth in telecommunication network bandwidth requirements, service providers are opting for optical fibre-based solutions for last-mile deployment. With fibre interconnects moving from the backbone to the access networks and the accompanying large capital expenditure, optimisation of these networks has become paramount. The main contender for these fibre-based access networks is the Passive Optical Network (PON) [5]. A single optical fibre cable runs from a Central Office (CO) to a cabinet housing a passive distribution unit called an optical splitter. In the case of Fibre to the Home (FTTH), the optical signal is then distributed to a number of smaller fibres running directly to termination points known as Optical Network Units (ONUs) at customer premises. The PON topology is illustrated in Figure 1. Since these fibres are installed in subterranean ducts, expensive trenches need to be dug all the way from the CO to the customer. The problem is then to minimise the cost of connecting customer premises to the CO by choosing appropriate locations for these splitters and deciding which customers to connect to them. A number of papers address this problem using both discrete optimisation and metaheuristic techniques. Although Li and Shen [17] incorporated fibre duct sharing into their Random Allocation and Reallocation Algorithm (RARA) through the use of constrained minimum spanning trees, most authors ignore this fundamental part of the deployment phase of real-world networks. Since it would be impractical and expensive to dig a trench for each individual fibre, a single duct usually contains a number of fibres that share a part of their route to customer premises.
This paper uses concepts from network flow optimisation to incorporate fibre duct sharing into a mixed integer formulation of the Passive Optical Network Planning Problem (PONPP). The rest of the paper is organised as follows. Work related to PONPP is discussed in §2, before PONPP is defined more formally in §3. A mixed integer model is given in §4. The path and disintegration heuristics are given in §5 and §6. Results for the model are discussed in §7 before concluding the paper in §8.

Related work
PONPP has been studied for a number of years and authors typically take one of two approaches: one based on exact models combined with valid inequalities or heuristics, and another based on meta-heuristics. Khan [12] developed a greedy algorithm, which is based on the transformation of a population density graph to one proportional to some Population Density Function. Minimum distance allocation is then performed on the transformed graph iteratively. Mitcsenkov et al. [23] provide a Capital Expenditure (CAPEX) model for PONPP, which they solve for very large problem instances using a tree-based heuristic algorithm known as the Branch Contracting Algorithm (BCA). Mitcsenkov et al. [22] provide a general methodology to design broadband infrastructure. Li and Shen [17] did a comprehensive study on greenfield PON planning, introducing a heuristic called RARA which sequentially refines a Mixed Integer Linear Programming (MILP) model solution through the use of simulated annealing. The model in question also takes into account the PON specifications in terms of network reach and differential distance. The survivable PONPP has been studied thoroughly by Kantarci et al. [10]. They introduce three MILP models based on different service availability. These models were solved using a standard branch-and-bound algorithm before providing a greedy planning heuristic [11]. The multilevel PONPP is studied by Kim et al. [13], providing bounds through linear relaxation as well as a two-stage incremental improvement heuristic. Ouali and Poon [24] developed a basic PONPP model and solved small test instances using branch-and-bound.
A wide range of meta-heuristics have also been employed to solve PONPP, with genetic algorithms (GA) being the most popular. Poon et al. [25] used a GA to identify splitter locations, with a clustering heuristic to form PONs. Lv and Chen [20] and Kokangul and Ari [14] address the multi-level PONPP using a GA, while Villalba et al. [30] studied modified versions of PONPP with ring and bus topologies. Ahmad et al. [1] use a GA to solve PONPP, but optimise for minimum power consumption instead of minimum deployment cost. Lakic and Hajduczenia [15] studied PONPP with the inclusion of non-traversable obstacles through the use of convex hull mapping, which is then again solved using evolutionary computing techniques. Finally, Xiong et al. [29] provide an algorithm that is less vulnerable to local optima using ant colony optimisation.
PONPP is conceptually similar to the Connected Facility Location Problem (ConFL), first introduced by Gupta et al. [9] and studied by a number of authors since. Starting from ConFL, including trenching cost as an additional fixed cost assigned to every edge and substituting facilities and demand points with splitters and ONUs respectively results in the PONPP. Swamy et al. [26] provided a primal-dual 8.55-approximation algorithm for ConFL while Gollowitzer and Ljubić [8] did a polyhedral and computational study on a large number of formulations. Arulselvan et al. [2] introduced a multi-period incremental formulation of the ConFL along with cover and cut-set inequalities solved using branch-and-cut. The hop constrained ConFL was addressed by Ljubić and Gollowitzer [18] using a cut formulation on a layered graph, while Leitner et al. [16] studied the two-architecture ConFL and provided a cut formulation Integer Linear Program (ILP). Finally, Bley et al. [4] studied the survivable constrained ConFL problem and solved a number of small instances using Benders decomposition in a branch-and-cut framework.

Problem definition
Assume an undirected graph $G = \{V, E\}$ is given with edge costs $c_e \geq 0$, $e \in E$, a disjoint partition $J = \{S, U\}$ with $S \subset V$ the possible splitter locations and $U \subset V$ the set of fixed ONU locations. The CO location is denoted by $c \in V \setminus J$. Assume fixed deployment costs $c^{co} \geq 0$, $c^{s} \geq 0$ and $c^{onu} \geq 0$ for the CO, splitters and ONUs respectively and a splitter capacity of $\kappa$. Furthermore, assume $c^{t}$ and $c^{f}$ are the costs per unit length of trenching and fibre respectively.
In the case of PONPP, the objective is to find a subset of open facilities $F \subseteq S$, with every element in $U$ assigned to a single facility $f \in F$ by distributing $U$ into subsets $U_f \subseteq U$, all vertices in $U_f$ connected via a Steiner tree $T_f$ rooted in $f$ and all vertices in $F$ connected via a Steiner tree $T$ rooted in $c$, so as to minimise the overall deployment cost
$$\min \; c^{s}|F| + c^{f}\sum_{f \in F} \ell^{c}_{f} + c^{f}\sum_{f \in F}\sum_{u \in U_f} \ell^{f}_{u} + c^{t}\sum_{e \in E_U} \ell_e, \tag{1}$$
where $\ell^{v}_{u}$ is the length of the shortest path between vertices $u$ and $v$ in the Steiner tree with $v$ as root, the set $E_U$ contains every edge used by a Steiner tree and $\ell_e$ is the length of the edge $e$. The first term describes the splitter cost while the second and third terms describe the fibre cost between CO and splitter, and splitter and Optical Network Unit (ONU) respectively. The final term is the total trench cost.
Typically, the problem is formulated using directed arc-based flows. This approach is tractable for small data sets and for medium-sized data sets if combined with strong valid inequalities. These valid inequalities are the focus of many papers [2,4,16,18] and usually allow fast convergence. When moving to larger data sets, however, the arc-based formulation suffers from memory problems due to the large number of constraints and becomes intractable.
The above formulation can be transformed into a path-based formulation. Using constructed paths, this formulation automatically ensures that flow conservation holds and therefore most of the constraints can be removed, resulting in a compact formulation. PONPP can then be redefined using paths.
Define a commodity pair $k \in K$. The set $K$ consists of all possible pairs of the CO and splitters as well as all possible pairs of splitters and ONUs. For each commodity pair $k = \{i, j\} \in K$, define a set $P(k) \subseteq P$ of all non-cyclic paths between $i \in V$ and $j \in V$. Next, define a set $E_P \subseteq E$ of all edges traversed in paths $p \in P$ and a set $P(e) \subseteq P$ containing all paths that traverse edge $e \in E_P$. Conversely, $E(p) \subseteq E_P$ is the set of all edges contained within path $p \in P$.
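The sets $E(p)$ and $P(e)$ are simply two indexes over the same collection of paths. A minimal Python sketch of how they could be built is given below; the function names, the representation of a path as a list of vertices and the path identifier `(commodity, n)` are illustrative assumptions and are not taken from the implementation described in this paper.

```python
from collections import defaultdict


def edge_key(u, v):
    """Represent an undirected edge so that (u, v) and (v, u) coincide."""
    return (u, v) if u <= v else (v, u)


def index_paths(paths_by_commodity):
    """Build E(p) and P(e) from {commodity: [path, ...]}.

    Each path is a list of vertices. A path is identified by the
    (hypothetical) id (commodity, n), with n its position in the list.
    """
    edges_of_path = {}                # E(p): path id -> list of edges
    paths_of_edge = defaultdict(set)  # P(e): edge -> set of path ids
    for commodity, paths in paths_by_commodity.items():
        for n, path in enumerate(paths):
            pid = (commodity, n)
            edges = [edge_key(a, b) for a, b in zip(path, path[1:])]
            edges_of_path[pid] = edges
            for e in edges:
                paths_of_edge[e].add(pid)
    return edges_of_path, dict(paths_of_edge)


# Toy example: a CO-to-splitter path and a splitter-to-ONU path that
# both traverse the edge between vertices "a" and "s1".
paths = {
    ("c", "s1"): [["c", "a", "s1"]],
    ("s1", "o1"): [["s1", "a", "o1"]],
}
E_of_p, P_of_e = index_paths(paths)
```

The keys of `P_of_e` form the set $E_P$; an edge whose entry contains more than one path id is a candidate for fibre duct sharing.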
Two additional constraints are applicable to PONs: maximum and differential network reach. The total length of fibre connecting the CO with an ONU $j \in U$ through a splitter $i \in S$, i.e. the network reach, may not exceed a threshold $\ell^{total}_{max}$ due to optical loss. To avoid synchronisation issues between ONUs, the difference between the maximum and minimum network reach for a splitter $i \in S$ may not exceed $\ell^{diff}_{max}$ [5]. With paths representing fibre cables and edges representing trenches, PONPP then becomes the search for a subset of deployed splitters such that
1. each ONU connects to one and only one splitter via a single path,
2. each splitter connects to the CO via a single path,
3. a maximum of $\kappa$ ONUs can connect to a single splitter,
4. the maximum and differential network reach constraints are satisfied, and
5. the sum of the deployment, path and edge costs is minimised.
The non-cyclic paths in set $P$ would typically be calculated by forming Steiner trees rooted at $c$ and $i \in S$ with Steiner nodes $i \in S$ and $j \in U$ respectively. Another approach would be to start with a reduced subset of paths and use column generation to generate additional columns (paths) until no columns exist with negative reduced costs. In our approach, the paths will be generated once at the onset using a heuristic specific to PONPP. Hence, the set $P$ will be treated as a parameter in the formulation.
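The two reach conditions can be checked per splitter once the fibre lengths are known. The sketch below is a minimal illustration; the function name, argument names and the numeric thresholds in the example are assumptions chosen for illustration only.

```python
def reach_ok(feeder_len, drop_lens, total_max, diff_max):
    """Check the maximum and differential reach for one splitter.

    feeder_len : fibre length from the CO to the splitter
    drop_lens  : fibre lengths from the splitter to each of its ONUs
    total_max  : maximum allowed CO-to-ONU fibre length
    diff_max   : maximum allowed difference between the longest and
                 shortest reach among the ONUs of this splitter
    """
    if not drop_lens:
        return True  # a splitter without ONUs violates nothing
    longest, shortest = max(drop_lens), min(drop_lens)
    within_total = feeder_len + longest <= total_max
    # The feeder is common to all ONUs of the splitter, so it cancels
    # out of the differential reach.
    within_diff = longest - shortest <= diff_max
    return within_total and within_diff


# Illustrative values only (units are arbitrary but consistent).
print(reach_ok(10.0, [1.0, 3.0], total_max=20.0, diff_max=5.0))
```

Because the feeder fibre is shared by all ONUs on a splitter, only the splitter-to-ONU lengths matter for the differential condition.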

Mixed integer formulation
From the above problem, a MILP model can now be formulated. Let $y_p$ indicate the usage of the path $p \in P$ and $x_e$ the usage of edge $e \in E_P$. Let $\psi_i$ indicate the deployment of splitter $i \in S$. Let $k^{so}_{ij} \in K$ denote the commodity pair of splitter $i \in S$ and ONU $j \in U$, and let $k^{cs}_{i} \in K$ denote the commodity pair of the CO and splitter $i \in S$. Parameter $\ell_p$ denotes the total length of path $p \in P$ while the variables $\ell^{min}_{i}$ and $\ell^{max}_{i}$ denote the minimum and maximum network reach for splitter $i \in S$ respectively. Let $\ell_e$ be the length of edge $e \in E_P$. As done by Li and Shen [17], the introduction of a binary if-then variable $d_{ij}$, $i \in S$, $j \in U$, and a large value, $\Delta$, allows PONPP to be formulated as the MILP model given by (2)-(12).
Constraint set (3) ensures that each ONU connects to a splitter via a path while constraint set (5) limits the maximum number of ONUs per splitter and sets the splitter usage variable $\psi_i$. Constraint set (4) ensures a path exists between each used splitter and the CO while the inequalities in (6) ensure that all edges of used paths are marked as used as well. To ensure numerical stability, the values of $\Delta$ are set as small as possible, in this case $\Delta = |P(e)|$. Constraint sets (7) and (8) limit the minimum and maximum network reach variables for each splitter while constraint set (9) activates the previous two constraints only when paths between commodities are used. Finally, the inequality sets (10) and (11) implement the global PON fibre distance constraints.
Constraint set (6) can be substituted with $0 \leq y_p \leq x_e$, $p \in P$, $e \in E(p)$, to strengthen the model relaxation, but this dramatically increases the number of constraints in the model and the memory required to solve it. Therefore, constraint set (6) will be left as is to ensure model tractability for large instances.
Due to space limitations, the model in equations (2)-(12) incorporates only the fundamental constraints inherent to PONPP. For a more complete model that includes refinements such as economies of scale, network coverage and non-symmetrical fibre costs, refer to Van Loggerenberg [27].
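For orientation, the objective and the assignment, capacity and edge-linking constraints described above can be written in the following way. This is a sketch assembled from the verbal descriptions of (2)-(6) only, using $\ell_p$ and $\ell_e$ for path and edge lengths; the exact form, ordering and any constant terms (such as the fixed CO and ONU costs) in the original model are assumptions, and the reach constraints (7)-(11) are not reproduced here.

```latex
\begin{align*}
\min \quad & \sum_{i \in S} c^{s}\,\psi_i
   + c^{f} \sum_{p \in P} \ell_p\, y_p
   + c^{t} \sum_{e \in E_P} \ell_e\, x_e \\
\text{s.t.} \quad
 & \sum_{i \in S} \sum_{p \in P(k^{so}_{ij})} y_p = 1,
   && j \in U, \\
 & \sum_{p \in P(k^{cs}_{i})} y_p = \psi_i,
   && i \in S, \\
 & \sum_{j \in U} \sum_{p \in P(k^{so}_{ij})} y_p \le \kappa\, \psi_i,
   && i \in S, \\
 & \sum_{p \in P(e)} y_p \le \Delta\, x_e,
   && e \in E_P, \\
 & y_p,\; x_e,\; \psi_i \in \{0, 1\}.
\end{align*}
```

With $\Delta = |P(e)|$ the edge-linking inequality is exactly tight when every path through $e$ is used, which is why no smaller constant is valid in this aggregated form.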

Paths heuristic
It is evident that the set $P$ will include an impractically large number of paths for large graphs. Consider now how the paths are generated as a preprocessing step for each commodity $k \in K$. As a first step, $P(k)$ contains the shortest path between commodity pair $k$. In this formulation of PONPP, the aim of generating additional longer paths to include in $P(k)$ is to increase the possibility of edges being shared between paths of different commodities. However, it should be evident that during the generation procedure, a point exists where generating a longer path will not result in a decrease in the objective function value. This point is reached when the total additional fibre cost exceeds the cost saved if an additional trench segment is shared. As this path generation process is independent of the fixed costs $c^{s}$, $c^{onu}$ and $c^{co}$, the only relevant costs involved are $c^{f}$ and $c^{t}$. This reasoning is stated formally in Proposition 1.

Proposition 1
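Let $p^* \in P(k)$ be the shortest non-cyclic path between commodity pair $k \in K$ with length $\ell_{p^*}$. For PONPP as defined by (2)-(12), the set of paths $Q(k) \subset P(k)$ will not be found in the minimal solution, where
$$\ell_q > \left(1 + \frac{c^{t}}{c^{f}}\right) \ell_{p^*}, \quad q \in Q(k).$$

Proof: The cost of a shortest path $p^* \in P(k)$ with no fibre duct sharing is given by its trenching and fibre components, i.e.
$$c_{p^*} = \sum_{e \in E(p^*)} c^{t} \ell_e + c^{f} \ell_{p^*}.$$
It follows that any fibre duct sharing will result in a longer path $p^+$ with length $\ell_{p^+}$. Let $E_s \subseteq E(p^*)$ be the edges path $p^+$ shares with other paths. Therefore, the total cost of path $p^+$ can be given by
$$c_{p^+} = \sum_{e \in E(p^*)} c^{t} \ell_e - \sum_{e \in E_s} c^{t} \ell_e + c^{f} \ell_{p^+}.$$
It is evident that if $c_{p^+} > c_{p^*}$, path $p^+$ will not be used in the minimal solution. Substituting and simplifying,
$$c^{f} \ell_{p^+} - \sum_{e \in E_s} c^{t} \ell_e > c^{f} \ell_{p^*}. \tag{14}$$
The maximum sharing that can occur is if all edges are shared, i.e. $E_s = E(p^*)$. Furthermore, from the definition of a path, it is evident that $\ell_{p^*} = \sum_{e \in E(p^*)} \ell_e$. Substituting into equation (14), it is found that $c^{f} \ell_{p^+} - c^{t} \ell_{p^*} > c^{f} \ell_{p^*}$, which can be rearranged to
$$\ell_{p^+} > \left(1 + \frac{c^{t}}{c^{f}}\right) \ell_{p^*}.$$

Figure 2 illustrates the significance of Proposition 1 by means of an example. A path between $s$ and $o_1$ can share at most the edges $(s, d)$, $(d, e)$ and $(e, f)$ with the existing path between $s$ and $o_2$. However, if the extra fibre cost incurred by using the longer path exceeds the trenching cost saved by sharing those edges, the path will not be selected in the minimal solution. In fact, since no two commodities can have both the same source and destination nodes, Proposition 1 tends to be conservative. Using this proposition, paths that are more than $(1 + c^{t}/c^{f})$ times as long as the shortest path $p^*$ need not be calculated since they will not be used in the minimal solution. In practice, civil restrictions ensure even greater diminishing returns when deviating from the shortest path. This is due to the fact that trenches are made alongside roads, which are usually the only access to customer premises. In these cases, the only opportunities for fibre duct sharing exist at road junctions where fibres can be routed together on one side of the road if possible.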
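As a small worked illustration of Proposition 1, the length cutoff can be computed directly from the two per-unit costs. The function names and the cost values below are illustrative assumptions, not values from the data sets used in this paper.

```python
def length_cutoff(shortest_len, c_trench, c_fibre):
    """Longest path length that can still appear in a minimal solution:
    (1 + c_t / c_f) times the shortest path length."""
    if c_fibre <= 0:
        raise ValueError("fibre cost per unit length must be positive")
    return (1.0 + c_trench / c_fibre) * shortest_len


def prune_candidates(lengths, c_trench, c_fibre):
    """Keep only candidate path lengths not excluded by the cutoff."""
    cutoff = length_cutoff(min(lengths), c_trench, c_fibre)
    return [length for length in lengths if length <= cutoff]


# Illustrative costs: trenching is three times as expensive as fibre,
# so paths up to four times the shortest length remain candidates.
print(length_cutoff(100.0, c_trench=30.0, c_fibre=10.0))
print(prune_candidates([100.0, 250.0, 400.0, 401.0], 30.0, 10.0))
```

The more expensive trenching is relative to fibre, the larger the cutoff and the more candidate paths survive, which matches the intuition that long detours only pay off when digging dominates the cost.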
For smaller data sets, the number of paths generated for $P(k)$ can be adjusted through the use of a $k$ shortest simple path algorithm such as Yen's algorithm [6,31]. These algorithms typically start from a shortest path and sequentially add additional edge segments with least weight to form longer paths. Unfortunately, they usually have time complexities that increase linearly with both $k$ and $|V|$, adding substantial preprocessing time as more paths are generated. It is therefore necessary to minimise the paths generated to increase computational performance.
In the case of larger data sets, the performance when including only the shortest path ($k = 1$) in the model is also determined, effectively using opportunistic fibre duct sharing when shortest paths take a similar route. With $k = 1$, the much faster Dijkstra's algorithm can be used, increasing preprocessing performance. This special case, where $k = 1$, will henceforth be called the shortest path heuristic (SPATH).
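To make the path generation step concrete, the sketch below enumerates simple paths in order of increasing length and stops at either $k$ paths or a length cutoff such as the one from Proposition 1. It is a naive best-first enumeration, not Yen's algorithm, and its worst-case running time is exponential; the function name and the adjacency-dictionary graph format are assumptions made for illustration.

```python
import heapq
from itertools import count


def k_shortest_simple_paths(adj, source, target, k, max_len=float("inf")):
    """Return up to k shortest simple paths as (length, path) tuples.

    adj is an undirected graph given symmetrically as
    {u: {v: length}} with non-negative lengths. Partial paths are
    expanded in order of length, so complete paths are found in
    non-decreasing order of length.
    """
    tie = count()  # tie-breaker so the heap never compares paths
    heap = [(0.0, next(tie), [source])]
    found = []
    while heap and len(found) < k:
        length, _, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append((length, path))
            continue
        for nxt, weight in adj.get(node, {}).items():
            if nxt in path:
                continue  # keep the path simple (non-cyclic)
            new_len = length + weight
            if new_len <= max_len:
                heapq.heappush(heap, (new_len, next(tie), path + [nxt]))
    return found


adj = {
    "s": {"a": 1.0, "t": 3.0, "b": 2.0},
    "a": {"s": 1.0, "t": 1.0},
    "b": {"s": 2.0, "t": 2.0},
    "t": {"a": 1.0, "s": 3.0, "b": 2.0},
}
all_paths = k_shortest_simple_paths(adj, "s", "t", k=5)
# Cutoff for a trench-to-fibre cost ratio of 0.5: (1 + 0.5) * 2.0 = 3.0
bounded = k_shortest_simple_paths(adj, "s", "t", k=5, max_len=3.0)
```

Passing the Proposition 1 cutoff as `max_len` prunes partial paths as soon as they become too long, so the bound limits the search itself rather than only filtering its output.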
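The opportunistic sharing that arises with $k = 1$ can be illustrated with a short sketch: each commodity is routed along its shortest path, fibre length is counted per path, and trench length is counted once per distinct edge. The function names and graph format are illustrative assumptions; this is not the C++ implementation used for the results in this paper.

```python
import heapq


def dijkstra_path(adj, source, target):
    """Shortest path from source to target as a list of vertices."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        raise ValueError("target is not reachable from source")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def trench_and_fibre(adj, pairs):
    """Total trench and fibre length when every pair uses its shortest path."""
    used_edges = set()
    fibre = 0.0
    for s, t in pairs:
        path = dijkstra_path(adj, s, t)
        for a, b in zip(path, path[1:]):
            fibre += adj[a][b]                # every path needs its own fibre
            used_edges.add(frozenset((a, b)))  # a trench is dug only once
    trench = sum(adj[a][b] for a, b in map(tuple, used_edges))
    return trench, fibre


# Two ONUs behind a common junction d: the trench (s, d) is shared.
adj = {
    "s": {"d": 1.0},
    "d": {"s": 1.0, "o1": 1.0, "o2": 1.0},
    "o1": {"d": 1.0},
    "o2": {"d": 1.0},
}
trench, fibre = trench_and_fibre(adj, [("s", "o1"), ("s", "o2")])
```

In this toy graph the fibre length is 4 while the trench length is only 3, since both shortest paths run through the same first edge.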

Disintegration heuristic
Since real-world PON data are usually grouped into interconnected neighbourhoods, a disintegration of the input data into clusters should give good computational performance while staying close to the global minimal solution. As the central office is global to all clusters, only splitter and ONU nodes are clustered, i.e. the set $D = U \cup S$. A number of methods exist to cluster the PON data sets, including centroid, density and hybrid clustering. Each method is implemented and compared with respect to efficacy of computational effort distribution as well as numerical performance.

Centroid clustering
Firstly, to test centroid clustering, the common $k$ means algorithm is used [19]. This simple algorithm minimises intra-cluster distances by minimising the sum of squared Euclidean distances between each point assigned to cluster $i$ and the cluster mean $\mu_i$. Generically, the technique can be stated as follows:
$$\min_{L_1, \ldots, L_k} \; \sum_{i=1}^{k} \sum_{x \in L_i} \| x - \mu_i \|^2.$$
The $k$ means algorithm provides the $k$ output sets $L_1, L_2, \ldots, L_k$, where
$$\bigcup_{i=1}^{k} L_i = D \quad \text{and} \quad L_i \cap L_j = \emptyset, \; i \neq j.$$
The algorithm is promising since it provides roughly equi-sized clusters, which is useful for effective division of computational complexity.
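A minimal sketch of the standard Lloyd iteration for $k$ means in two dimensions is given below. The choice of initial means (the first $k$ points unless `init` is supplied) is an assumption made to keep the example deterministic; practical implementations usually use randomised or more careful initialisation.

```python
def kmeans(points, k, init=None, iterations=100):
    """Lloyd's algorithm for 2-D points.

    Returns (clusters, means), where clusters[i] is the list of points
    assigned to means[i].
    """
    means = list(init) if init is not None else list(points[:k])
    assign = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest mean.
        new_assign = [
            min(range(k),
                key=lambda i: (p[0] - means[i][0]) ** 2
                + (p[1] - means[i][1]) ** 2)
            for p in points
        ]
        if new_assign == assign:
            break  # converged: no point changed cluster
        assign = new_assign
        # Update step: each mean moves to the centroid of its cluster.
        for i in range(k):
            members = [p for p, a in zip(points, assign) if a == i]
            if members:  # an empty cluster keeps its previous mean
                means[i] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    clusters = [[p for p, a in zip(points, assign) if a == i]
                for i in range(k)]
    return clusters, means


pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
       (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
clusters, means = kmeans(pts, 2, init=[(0.0, 0.0), (10.0, 10.0)])
```

Every point is assigned to exactly one cluster, so the output is a partition of the input set.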

Density clustering
The Density Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm was introduced in 1996 by Ester et al. [7] and is by far the most commonly cited density-based clustering algorithm in literature [21]. The algorithm incrementally creates clusters by adding points within a distance $\varepsilon$ from any point already in the cluster. Another parameter $M_p$ determines the minimum number of points a cluster should consist of, discarding all isolated points as noise. In our implementation, this parameter is set to zero to avoid noise classification. The algorithm provides $K$ output sets $L_1, L_2, \ldots, L_K$ such that
$$\bigcup_{i=1}^{K} L_i = D \quad \text{and} \quad L_i \cap L_j = \emptyset, \; i \neq j.$$
Since density estimates are used to cluster the data, these techniques are good at capturing neighbourhoods of roughly equal density and should therefore ensure that equi-dense clusters are not bifurcated, ensuring improved numerical performance.
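The expansion step of DBSCAN can be sketched in a few lines. The version below uses a brute-force neighbourhood search and is intended only to illustrate the roles of the distance and minimum-points parameters; the function and parameter names are assumptions. Note that each point counts as its own neighbour here, so any `min_pts` of 1 or less disables noise classification.

```python
def dbscan(points, eps, min_pts=1):
    """Label 2-D points with cluster ids 0, 1, ...; noise is labelled -1."""
    n = len(points)

    def neighbours(i):
        xi, yi = points[i]
        return [j for j in range(n)
                if (points[j][0] - xi) ** 2 + (points[j][1] - yi) ** 2
                <= eps * eps]

    labels = [None] * n  # None means not yet visited
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbours(i)
        if len(nb) < min_pts:
            labels[i] = -1  # provisionally noise
            continue
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbj = neighbours(j)
            if len(nbj) >= min_pts:
                queue.extend(nbj)  # j is a core point: keep expanding
        cluster += 1
    return labels


pts = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0), (10.0, 0.0), (10.0, 1.0)]
labels = dbscan(pts, eps=1.5)
with_noise = dbscan(pts + [(50.0, 50.0)], eps=1.5, min_pts=2)
```

Unlike $k$ means, the number of clusters is an outcome of the distance parameter rather than an input, which is why cluster sizes can vary widely.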

Hybrid clustering
In 1982, Wong et al. [28] introduced a hybrid clustering approach that combines the computational performance of $k$ means with density-based clustering advantages. Given a required number of clusters $K$, the algorithm executes two steps: a preliminary $k$ means clustering with $k > K$, followed by an iterative step that analyses and combines the output clusters at each iteration according to a density measure function $\phi$ until $K$ clusters remain. The number of clusters $k$ is usually proportional to the number of observations $N$ and can be approximated as $k \approx 7(N/\log(N))$. In the algorithm, two preliminary clusters, $L_u$ and $L_v$, are considered adjacent if the midpoint between the centroids $\mu_u$ and $\mu_v$ is closer to either $\mu_u$ or $\mu_v$ than to any other cluster mean using Euclidean distance. Define the intra-cluster sum of squares as
$$SS_u = \sum_{x \in L_u} \| x - \mu_u \|^2.$$
Then the density measure function $\phi$ in two dimensions is given by
$$\phi(L_u, L_v) = \frac{SS_u + SS_v + \frac{1}{4}\left(|L_u| + |L_v|\right) \| \mu_u - \mu_v \|^2}{\left(|L_u| + |L_v|\right)^2}. \tag{18}$$
At each iteration of the algorithm, the cluster pair for which (18) is a minimum is combined until the required number of clusters remains.
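The merge criterion can be sketched as follows, assuming the two-dimensional form of the density measure commonly quoted for Wong's hybrid method: the two intra-cluster sums of squares plus a quarter of the combined size times the squared centroid distance, divided by the squared combined size. Whether this matches the exact expression used in the implementation behind the reported results is an assumption, as are the function names. The adjacency test is taken as given: the caller supplies the list of adjacent pairs.

```python
def centroid(cluster):
    n = len(cluster)
    return (sum(x for x, _ in cluster) / n, sum(y for _, y in cluster) / n)


def sum_of_squares(cluster):
    """Intra-cluster sum of squared distances to the centroid."""
    mx, my = centroid(cluster)
    return sum((x - mx) ** 2 + (y - my) ** 2 for x, y in cluster)


def phi(cluster_u, cluster_v):
    """Assumed 2-D density measure between two adjacent clusters."""
    n = len(cluster_u) + len(cluster_v)
    (ux, uy), (vx, vy) = centroid(cluster_u), centroid(cluster_v)
    d2 = (ux - vx) ** 2 + (uy - vy) ** 2
    numerator = (sum_of_squares(cluster_u) + sum_of_squares(cluster_v)
                 + 0.25 * n * d2)
    return numerator / n ** 2


def closest_pair(clusters, adjacent_pairs):
    """Index pair (u, v) with the smallest density measure."""
    return min(adjacent_pairs,
               key=lambda uv: phi(clusters[uv[0]], clusters[uv[1]]))


A = [(0.0, 0.0), (2.0, 0.0)]
B = [(4.0, 0.0), (6.0, 0.0)]
C = [(100.0, 0.0), (102.0, 0.0)]
best = closest_pair([A, B, C], [(0, 1), (1, 2)])
```

Merging the pair returned by `closest_pair` and recomputing the measure for the affected neighbours, repeated until the required number of clusters remains, gives the iterative second step described above.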

Cluster post-processing
Valid clusters are built from the output of each of the clustering algorithms to ensure a feasible solution exists for each cluster.These clusters, defined in Definition 1, contain both splitters and ONUs and have enough splitters to serve all ONUs contained within the cluster.The minimum split ratio is defined as the capacity of a splitter, κ.

Definition 1 (Valid cluster)
Given a set of splitters $S \neq \emptyset$, a set of ONUs $U \neq \emptyset$ and a minimum split ratio $\kappa$, a set $L \subseteq U \cup S$, $L \neq \emptyset$, is said to be a valid cluster iff
$$L \cap S \neq \emptyset, \quad L \cap U \neq \emptyset \quad \text{and} \quad \kappa\,|L \cap S| \geq |L \cap U|.$$

A number of possibilities exist when encountering invalid clusters, the most basic of which is to simply combine the invalid cluster with another cluster until all clusters are valid. A more sophisticated approach is to combine an invalid cluster with its nearest neighbour in terms of inter-cluster centroid distance, i.e. at every iteration, an invalid cluster $L_i$ is combined with its nearest neighbour $L_i^{NN}$ as
$$L_i \leftarrow L_i \cup L_i^{NN}, \quad L_i^{NN} = \arg\min_{L_j,\, j \neq i} \| \mu_i - \mu_j \|.$$
This process continues until all clusters are valid or only a single cluster remains. This will ensure that if the original set $D$ is valid, which needs to be true for a feasible solution to be found to the original problem, the resulting clusters will also be valid.
It should be mentioned that the efficacy of the clustering process depends not only on the underlying structure of the data, but also on the splitter capacity $\kappa$. When moving to cluster sizes smaller than the splitter capacity, the cluster bounds will intersect the splitter reach, resulting in potentially inflated deployment costs. Therefore the cluster sizes should ideally be some factor $M > 1$ larger than the splitter capacity $\kappa$. The clustering process is also influenced by the amount of excess capacity available. If the original dataset contains just enough splitter capacity to connect all ONUs, the probability of a cluster being invalid is high, decreasing the efficacy of the process. Therefore, ideally, $\kappa |S| \gg |U|$ should hold for the dataset to ensure efficient operation.
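The validity test and the nearest-neighbour merging can be sketched as below, assuming a cluster is valid when it contains at least one splitter, at least one ONU and enough splitter capacity for its ONUs. Function names, the node naming and the coordinate dictionary are illustrative assumptions.

```python
def is_valid(cluster, splitters, onus, kappa):
    """A cluster needs splitters, ONUs and enough splitter capacity."""
    n_s = sum(1 for v in cluster if v in splitters)
    n_u = sum(1 for v in cluster if v in onus)
    return n_s > 0 and n_u > 0 and kappa * n_s >= n_u


def centroid(cluster, coords):
    xs = [coords[v][0] for v in cluster]
    ys = [coords[v][1] for v in cluster]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def merge_invalid(clusters, coords, splitters, onus, kappa):
    """Merge invalid clusters into their nearest neighbour (by centroid
    distance) until all clusters are valid or only one remains."""
    clusters = [set(c) for c in clusters]
    while len(clusters) > 1:
        bad = next((c for c in clusters
                    if not is_valid(c, splitters, onus, kappa)), None)
        if bad is None:
            break
        others = [c for c in clusters if c is not bad]
        bx, by = centroid(bad, coords)

        def dist2(c):
            cx, cy = centroid(c, coords)
            return (cx - bx) ** 2 + (cy - by) ** 2

        nearest = min(others, key=dist2)
        clusters = [c for c in others if c is not nearest] + [bad | nearest]
    return clusters


coords = {"s1": (0.0, 0.0), "o1": (0.0, 1.0), "o2": (1.0, 0.0),
          "o3": (10.0, 10.0), "o4": (10.0, 11.0),
          "s2": (30.0, 30.0), "o5": (30.0, 31.0)}
splitters = {"s1", "s2"}
onus = {"o1", "o2", "o3", "o4", "o5"}
start = [{"s1", "o1", "o2"}, {"o3", "o4"}, {"s2", "o5"}]
merged = merge_invalid(start, coords, splitters, onus, kappa=4)
collapsed = merge_invalid(start, coords, splitters, onus, kappa=1)
```

With a capacity of 4 the splitter-less middle cluster is absorbed by its nearest neighbour and two valid clusters remain; with a capacity of 1 the data set as a whole lacks capacity, so the process collapses to a single cluster.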

Computational results
The model given in (2)-(12) is implemented using C++ and IBM ILOG CPLEX Concert extensions. It is then solved on a quad core Intel Core i7 at 2.67 GHz with 16 GiB main memory running Windows. Deterministic parallel processing is enabled to increase computational performance.

Path heuristic
Firstly, the path heuristic efficiency can be tested by varying the number of shortest paths between 1 (opportunistic fibre duct sharing) and 100. This is done using three small data sets containing fewer than 160 ONUs each, with parameters as in Table 1. These data sets were constructed using small subsets of real-world data sets generously provided by atesio GmbH [3]. Table 2 contains the results, with $P_k$ the number of shortest paths, $t_s$ the time to solve and $\mathrm{MEM}_p$ the peak memory consumption in MiB during the test. The $\mathrm{GAP}_b$ % is calculated according to the best known integer bound for each of the data sets, which were calculated using a standard arc flow formulation running for 3 hours each. The best bound optimality gaps are given in the GAP % column of Table 1, with $\mathrm{SR}_{avg}$ showing the average split ratio available. In this case, the relative differences are important to determine the typical influence fibre duct sharing can have on the objective function value, and how the addition of extra paths affects both the computational effort and the deployment cost. The results support the notion of diminishing returns as more shortest paths are introduced into the model, with deployment cost savings only improving by 2 % on average when moving from $k = 1$ to $k = 100$ and with most of the savings attained when moving from the shortest path to $k = 10$. While the typical memory requirements increase linearly with $k$, the computational effort required increases exponentially. Therefore, it might be deemed worthwhile to use a lower number of paths that will result in good bounds without requiring an infeasible time to solve, especially when solving large instances. It should be noted that almost all of the instances were solved in a fraction of the time with comparable or equivalent numerical performance to the best calculated bound. In the case of sub 112, optimality was attained in just a few seconds compared to the 1.5 hour computation time of the arc flow formulation.

Disintegration heuristic
The three clustering methods are compared using a real-world GIS-mapped dataset containing 6 698 nodes and 7 660 edges, called CityNet [3]. This dataset contains a total of 2 190 splitters and ONUs to be clustered. Cluster metrics are then calculated, including the average intra-cluster distance $d^{i}_{avg}$, average cluster size $|L|_{avg}$, cluster size standard deviation $|L|_{dev}$, maximum cluster size $|L|_{max}$ and number of valid clusters $|L_i|_{valid}$. The clustering results, as well as the clustering processing time $t_c$, are shown in Table 3.
Looking at the clustering results, the strengths of each method are apparent. Firstly, as expected, the $k$ means algorithm provides equi-sized clusters, with the lowest maximum cluster size and low cluster size deviation. DBSCAN delivers very low average intra-cluster distances, illustrating its efficacy in correctly clustering dense regions. Unfortunately, the maximum cluster sizes and cluster size deviations of DBSCAN are large, which will negatively impact computational effort distribution. Wong's hybrid clustering has slightly better average intra-cluster distances in the $K = 10$ scenario in comparison to $k$ means with the same number of clusters, showing some improvement with the introduction of density-based clustering. However, like DBSCAN, the hybrid method suffers from high cluster size deviations and large maximum cluster sizes. Finally, $k$ means provided no invalid clusters in any of the cases, whereas the hybrid method started deviating at $K = 50$.
Even with this relatively large dataset, all clustering methods performed admirably, completing the clustering in a few milliseconds. Overall, it seems as though the standard $k$ means algorithm will provide the best computational effort distribution and the lowest overall computation time due to its low maximum cluster size value.
Next, CityNet is solved through the use of a modified version of the Branch Contracting Algorithm (BCA) heuristic proposed in [22]. BCA is chosen since this heuristic explicitly includes elements of fibre duct sharing through a tree-based clustering method. The authors claim performance of 10-15 % deviation from optimal with very fast computational speed. Since the original article does not specify values for $Q$, a grouping factor, the algorithm is run with all practical values for $Q$, i.e. $0 < Q \leq \kappa$, taking the minimum objective value. Next, since BCA is randomly initialised, it is run 20 times for each $Q$ value, again noting the minimum objective value. This ensures a fair comparison with the proposed path and disintegration heuristics.
It should be noted that the last step of BCA requires a heuristic Steiner tree implementation to connect all splitters to the central office. Since the details of this heuristic are not clear from the original article, the algorithm is modified to connect splitters through shortest path routes to the central office, sharing fibres where possible. This produces an upper bound on the true BCA objective value. BCA without any connecting fibres between the central office and splitters (BCANoF) is also tested to provide a lower bound and ensure the modification does not produce biased results.
The model was solved incorporating the shortest path heuristic (SPATH) with no disintegration and a 1 hour time limit. Then, the clustering scenarios defined in Table 3 were solved using the shortest path heuristic. From the numerical results in Table 4 it is clear that the BCA heuristic is much faster than the clustering methods, but produces an increase of up to 44 % in objective value. It should, however, be noted that the time to solve for BCA is specified as the time to complete one iteration. To get the minimum values for BCA, the algorithm was run for approximately 40 minutes with the various parameters. Another observation is the variance in BCA due to its random initialisation. Given a fixed $Q$ value, the heuristic gave solutions with objective values varying by up to 12 %. Therefore it is critical that the algorithm is run a number of times to ensure a good solution is produced.
Computation times for the clustering methods are consistent with the maximum cluster size values obtained in Table 3, with maximum cluster sizes above approximately 700 resulting in a suboptimal run at the end of the 1 hour time limit. In this regard, the standard $k$ means algorithm produces the best results considering the number of valid clusters, showing its efficacy in computational effort distribution for PONPP.
Peak memory usage values are once again consistent with the maximum cluster size values, as this determines the largest sub-problem. The $k$ means algorithm produced excellent numbers across the board, with H50 just managing to produce the lowest memory usage of 298 MiB. Most of the instances compared favourably to the memory usage of BCA, which peaked at around 5 GiB due to the memory required to build and maintain the initial tree, although there are a number of optimisations that can be implemented to reduce this value.
As for objective function values, SPATH produced the best upper bound with R 53.43 million, with DB80, k10 and H10 all producing good values within approximately 3 %. Due to the low average intra-cluster distance of DBSCAN, it produces the best bound of the clustering methods in 1 hour. Comparing the standard $k$ means algorithm with the hybrid algorithm, the $k$ means algorithm gave slightly better bounds under the time constraint. Similar objective values for SPATH and the clustering methods indicate that the best clustering instances do not introduce errors of more than 14 %, although the actual error margin may be much lower.

Conclusion
In this paper two heuristic techniques were incorporated into a MILP model of PONPP, allowing for optimisation with the inclusion of fibre duct sharing. The numerical and computational results of the path heuristic in small scale tests showed promising performance, with drastically reduced computation times and less than a 3 % gap in comparison with the best calculated bounds across all data sets. This could indicate that in practice, paths that are much longer than the shortest path rarely result in lower deployment cost, indicating that fibre duct sharing opportunities may be limited in real-world deployments with civil restrictions.
Given the numerical results, both SPATH and the clustering methods outperform BCA by quite a margin, even when BCA is given the best possible chance, with BCA producing objective values up to 44 % higher. In terms of time complexity, the heuristics dramatically reduce computation time, although BCA is still faster by an order of magnitude. In practice, this discrepancy is reduced since BCA needs to be run a number of times to produce a good solution.
Overall, the heuristics proved to be very capable at solving PONPP with high accuracy and with fast computation times.The results suggest that the standard k means algorithm is best suited for clustering PONPP, providing very good bounds at a fraction of both the computational effort and memory required.Unfortunately, worse than claimed performance for BCA suggests that it may be unsuitable for practical and inherently clustered data sets such as CityNet.
Following this research, a more connectivity-aware clustering method can be investigated to take advantage of the nature of PONPP. Estimating the true distance from optimum for CityNet would also be of interest, in order to determine the effectiveness of the SPATH heuristic when applied to large instances. Finally, the data sets can be preprocessed to reduce their complexity through edge substitution, as is done in a large number of other studies.

Figure 1: A schematic layout of the PON topology.

Figure 2: Fibre duct sharing opportunity when allowing for longer paths.

Table 2: Numerical and computational results for the path heuristic test.

Table 3: A comparison of the three clustering algorithms.

Table 4: Numerical and computational results for the CityNet dataset. The clustering scenarios of Table 3 were solved using the shortest path heuristic. The results for $k$ means are denoted with k10 to k50, DBSCAN with DB20 to DB80 and the hybrid clustering results with H10 to H50. The optimality gap ($\mathrm{GAP}_b$) is specified in terms of the best upper bound found among all the instances. For the instance that produced the best bound, the normal branch and bound optimality gap is given. Peak memory usage in MiB during the tests is given by $\mathrm{MEM}_{peak}$. If the total time to solve, $t_{solve}$, exceeds the time limit, a ">" sign is placed next to the value. Finally, the number of splitters deployed is given in the SPs column.
*Optimality gap between best upper integer and best lower relaxation bound