On the efficient solvability of a simple class of nonlinear knapsack problems

In this paper the efficient solvability of a class of nonlinear knapsack problems is investigated by means of the problem's necessary and sufficient conditions. It is shown that, from the general theory, it is impossible to determine sufficient conditions for a solution to be globally optimal. Furthermore, it is shown that even for the smallest possible instance of this problem it is, in general, possible to have an arbitrarily large number of solutions for which the necessary conditions hold. These results are then generalised to larger instances of the problem. A possible solution approach, which utilises the necessary conditions together with the branch-and-bound technique in an attempt to limit the search space, is applied to sets of randomly generated problems. This approach solves mixed 0/1 knapsack problems in order to find all possible solutions satisfying the necessary conditions. Due to the large number of solutions satisfying the necessary conditions, the proposed approach takes substantially longer than existing branch-and-bound algorithms combined with linear enveloping when applied to the same set of problems. This result renders the proposed approach rather inefficient.


Introduction
Knapsack problems have a wide range of applications in, for example, marketing, chemistry, information technology, portfolio optimisation, optimal search strategies, production planning, logistics and statistical sampling [1,2,7,8]. The specific problem formulation considered in this paper arises in the optimal allocation of funding to different, independent projects within a fixed budget. Depending on the expected return on each project, this problem may be formulated as a separable nonlinear knapsack problem in which the objective function consists of the sum of increasing convex and increasing concave functions, subject to a single constraint, and with lower and upper bounds on the allocation to each project. Mathematically this problem may be formulated as

  maximise    f_1(x_1) + · · · + f_m(x_m) + g_1(y_1) + · · · + g_n(y_n)
  subject to  x_1 + · · · + x_m + y_1 + · · · + y_n ≤ B,                        (1)
              l_i^(x) ≤ x_i ≤ u_i^(x),  i = 1, 2, . . ., m,
              l_j^(y) ≤ y_j ≤ u_j^(y),  j = 1, 2, . . ., n,

where f_i(x_i) is an increasing convex function, g_j(y_j) is an increasing concave function and B is the size of the budget. Here l_i^(x) and u_i^(x) (i = 1, 2, . . ., m) are respectively lower and upper bounds on the decision variables x_1, x_2, . . ., x_m, and similarly l_j^(y) and u_j^(y) (j = 1, 2, . . ., n) are respectively lower and upper bounds on the decision variables y_1, y_2, . . ., y_n. Furthermore l_i^(x) = l_j^(y) = 0 for all i = 1, 2, . . ., m and j = 1, 2, . . ., n; no generality is lost by setting the lower bounds of all the decision variables to zero, because this may always be achieved by means of a simple linear transformation [2].
The remainder of this paper is structured as follows. In §2 the smallest possible instance of the problem with m = n = 1 is considered, and it is shown that only necessary conditions for local optimality can be determined by means of the general theory on nonlinear programming. A generalisation to n > 1 and m > 1 is considered in §3. Some thoughts and insights on possible solution approaches are presented in §4 and these are followed up with some computational results obtained via these approaches. Finally, some conclusions, together with ideas for future study, follow in §5.

One convex and one concave function
The special, and smallest possible, instance of (1), where the objective function consists of only one concave and one convex function, is considered in this section because it clearly demonstrates the computational complexities in solving this problem. Let the decision variables be x ∈ [0, u_x] and y ∈ [0, u_y]. The objective in this case is then to

  maximise    f(x) + g(y)
  subject to  x + y ≤ B,                                    (2)
              0 ≤ x ≤ u_x,  0 ≤ y ≤ u_y.

Because f and g are both increasing, the budget constraint is binding at any maximiser. For the formulation in (2) the second order necessary conditions [3] require that for a point (x*, y*) in solution space to be a local maximum there must exist a λ ∈ R and µ_1, µ_2, µ_3, µ_4 ≥ 0, such that

  x* + y* = B,                                              (3)
  f'(x*) + λ + µ_1 − µ_2 = 0,                               (4)
  g'(y*) + λ + µ_3 − µ_4 = 0,                               (5)
  µ_1 x* = 0,                                               (6)
  µ_2 (u_x − x*) = 0,                                       (7)
  µ_3 y* = 0,                                               (8)
  µ_4 (u_y − y*) = 0,                                       (9)

and, with the second order partial derivatives ∂²/∂x² and ∂²/∂y², the Lagrangian matrix

  L(x*, y*) = [ f''(x*)     0
                  0      g''(y*) ]                          (10)

must be negative semi-definite on the tangent plane determined by the binding (or active) constraints.
The second order sufficient conditions for a strong local maximum are that (3)-(9) must hold and that the Lagrangian in (10) must be negative definite on the tangent plane determined by the binding constraints.
To determine necessary and sufficient conditions it is therefore important to consider all possibilities in which the constraints may be binding. Exactly one of the following three cases may arise:

1. x + y = B is the only binding constraint;
2. x + y = B and 0 ≤ x ≤ u_x (i.e. either x = 0 or x = u_x) are the only binding constraints; or
3. x + y = B and 0 ≤ y ≤ u_y (i.e. either y = 0 or y = u_y) are the only binding constraints.
In all three cases additional conditions are required for a feasible solution to be an optimal solution candidate. The following theorem summarises all the cases with their respective additionally required conditions. In the theorem we require the notion of curvature. If C is a smooth curve of the graph y = f(x), then the curvature of C at the point (x, y) is given by [9]

  K_f(x) = |f''(x)| / (1 + (f'(x))²)^(3/2).
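As a small illustration, the curvature formula may be evaluated numerically. The following is a minimal sketch in which the caller supplies the first and second derivatives analytically; the names curvature, fp and fpp are illustrative, not from the paper.

```python
def curvature(fp, fpp, x):
    """Curvature |f''(x)| / (1 + f'(x)^2)^(3/2) of the graph y = f(x),
    given callables fp and fpp for the first and second derivatives."""
    return abs(fpp(x)) / (1.0 + fp(x) ** 2) ** 1.5

# Example: for f(x) = x^2 we have f'(x) = 2x and f''(x) = 2,
# so the curvature at the origin is 2.
k = curvature(lambda x: 2 * x, lambda x: 2.0, 0.0)
```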

Theorem 1
Let (x*, y*) be a feasible solution to (2) for which x* + y* = B is a binding constraint.

A If additionally x* = 0 (or alternatively y* = u_y) is binding, and if f'(x*) < g'(y*), then the second order necessary conditions for the point (x*, y*) to be a relative maximum point hold.

B If no other constraints are binding, then f'(x*) = g'(y*). Furthermore, if |K_f(x*)| < |K_g(y*)|, then the second order sufficient conditions for the point (x*, y*) to be a strong relative maximum point hold.

C If additionally x* = u_x (or alternatively y* = 0) is binding, and if f'(x*) > g'(y*), then the second order necessary conditions for the point (x*, y*) to be a relative maximum point hold.
The proofs of Parts A and B are supplied. The proof of Part C is omitted, because it is similar to the proof of Part A. The cases of Theorem 1 are summarised schematically in Figure 1.
Proof of A: From the conditions that x + y = B and x = 0 are binding constraints in (2), as well as from (7)-(9), it follows that µ_2 = µ_3 = µ_4 = 0. Because x = 0, µ_1 may assume any non-negative value. If this result is substituted back into (4)-(5), it follows that

  λ = −f'(x*) − µ_1  and  λ = −g'(y*),  so that  f'(x*) ≤ g'(y*).

In the alternative situation, i.e. when x + y = B and y = u_y are binding, then by the same argument it follows that µ_1 = µ_2 = µ_3 = 0, µ_4 ≥ 0 and

  λ = −f'(x*)  and  λ = −g'(y*) + µ_4,  so that again  f'(x*) ≤ g'(y*).

The functions f(x) and g(y) are both increasing and therefore f'(x) ≥ 0 and g'(y) ≥ 0.
The second order necessary conditions [3] for optimality of a point (x*, y*) are that the point should be a regular point and that the Lagrangian should be negative semi-definite on the tangent plane determined by the active constraints. For (x*, y*) to be a regular point, the gradients of the active constraints x + y = B and x = 0, namely (1, 1)^T and (1, 0)^T, should be linearly independent at the point (x*, y*), which they are.
At the point (x*, y*) a vector z = (z_1, z_2) on the tangent plane satisfies (1, 1)z = 0 and (1, 0)z = 0, which means that z_1 + z_2 = 0 and z_1 = 0, and hence z_1 = z_2 = 0. For the second order necessary conditions it should hold that z^T L(x*, y*) z ≤ 0, which is satisfied since z_1 = z_2 = 0.
In the situation described in Case A of Theorem 1, no sufficient conditions for optimality can be found from the general theory on nonlinear programming. We thus have to conclude that if the point (x*, y*) is a feasible solution to (2) such that x* + y* = B and x* = 0 (or y* = u_y) are the only binding constraints, then general second order sufficient conditions for the feasible solution (x*, y*) to be a strong relative maximum point of (2) cannot be found.
Proof of B: Sufficient conditions for local optimality at the point (x*, y*) require that, in addition to the requirement that the first order necessary conditions must hold, the Lagrangian L(x*, y*) must be negative definite on the tangent plane M defined by

  M = {(z_1, z_2) : z_1 + z_2 = 0}

at the point (x*, y*). Thus, for any point z = (z_1, z_2) on the tangent plane, z_1 = −z_2 holds. For L(x*, y*) to be negative definite on M it is required that

  z^T L(x*, y*) z = z_1² f''(x*) + z_2² g''(y*) = z_1² (f''(x*) + g''(y*)) < 0  for all z ≠ 0 in M.

From the properties of f(x) and g(y) it follows that f''(x*) ≥ 0 and g''(y*) ≤ 0. Hence negative definiteness holds if and only if f''(x*) < |g''(y*)|; because f'(x*) = g'(y*) when x + y = B is the only binding constraint, this requirement is equivalent to the curvature condition |K_f(x*)| < |K_g(y*)|. The same conclusion, namely that in general no sufficient conditions for optimality can be found from the general theory, follows from the proof of Part C.
In order for the necessary conditions to hold, it follows from Theorem 1 that one of the following must hold for any feasible solution (x*, y*) to (2) so that the feasible solution may be considered a candidate for a globally optimal point:

1. x* = 0, 0 < y* < u_y and f'(x*) ≤ g'(y*). In this situation no sufficient condition for global optimality can be determined from the general theory.
2. x* = u_x, 0 < y* < u_y and f'(x*) ≥ g'(y*). In this situation no sufficient condition for global optimality can be determined from the general theory.
3. y* = 0, 0 < x* < u_x and f'(x*) ≥ g'(y*). In this situation no sufficient condition for global optimality can be determined from the general theory.
4. y* = u_y, 0 < x* < u_x and f'(x*) ≤ g'(y*). In this situation no sufficient condition for global optimality can be determined from the general theory.
5. 0 < x* < u_x, 0 < y* < u_y and f'(x*) = g'(y*). In this situation the necessary conditions may be strengthened to sufficient conditions for a locally optimal point if |K_g(y*)| > |K_f(x*)|. Practically this means that the concave function should curve "more" (i.e. have a larger curvature) than the convex function for the sufficient conditions to hold at the point (x*, y*).
If all the situations above are considered, it would seem that the following fairly simple algorithm may be used to solve (2):

1. find, if they exist, the solutions for which x + y = B and 0 ≤ x ≤ u_x are binding, i.e. x_1 = (0, B) and x_2 = (u_x, B − u_x);
2. find, if they exist, the solutions for which x + y = B and 0 ≤ y ≤ u_y are binding, i.e. x_3 = (B, 0) and x_4 = (B − u_y, u_y);
3. find, if it exists, a best solution for which only x + y = B is binding, i.e. find x_5 = (x_5, y_5), by performing a search for point(s) satisfying 0 < x < u_x, 0 < y < u_y, x + y = B, f'(x) = g'(y) and f''(x) ≤ |g''(y)|; and, finally
4. select the best solution from the five obtained above.
Steps 1 and 2 above are computationally inexpensive. To select an optimal solution (in Step 4) does not bring about an additional computational burden either. Furthermore, it is computationally simple to find one point for which the conditions in Step 3 hold numerically. A problem would, however, arise if there are a number of these points, in which case each of the points would have to be enumerated. To determine the number of points for which the conditions in Step 3 hold is equivalent to answering the following question: how many times can the difference between two increasing, convex functions be zero? Nahay [6] showed, by means of an example, that it is possible that this difference may intersect any value (and hence zero) an arbitrarily large number of times on a finite interval! In general, there is thus no upper bound on the number of points that have to be enumerated in Step 3. An algorithm based on the four steps above would therefore not be suitable as an algorithm to find an optimal solution to (2). It could, however, work as an outline for a heuristic approach towards solving this problem.
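The four-step outline above may be sketched in code. The following is a heuristic sketch on an illustrative instance (the functions, bounds and grid resolution are assumptions, not taken from the paper), in which Step 3 locates interior points with f'(x) = g'(y) by scanning for sign changes of f'(x) − g'(B − x) along the budget line and refining each bracketed root by bisection.

```python
from math import sqrt

# Illustrative instance: f increasing and convex, g increasing and concave.
f = lambda x: x + 0.01 * x ** 2
fp = lambda x: 1 + 0.02 * x
g = lambda y: sqrt(y)
gp = lambda y: 0.5 / sqrt(y)
B, ux, uy = 2.0, 1.9, 2.0

def objective(x):                       # y = B - x on the budget line
    return f(x) + g(B - x)

# Steps 1 and 2: candidates where a bound on x or on y is also binding.
candidates = [x for x in (0.0, ux, B, B - uy)
              if 0.0 <= x <= ux and 0.0 <= B - x <= uy]

# Step 3: scan the open segment for sign changes of h(x) = f'(x) - g'(B - x)
# and refine each bracketed root by bisection; there may be several such
# points (or none), and the grid resolution is a heuristic choice.
lo, hi = max(0.0, B - uy), min(ux, B)
h = lambda x: fp(x) - gp(B - x)
grid = [lo + (hi - lo) * k / 1000 for k in range(1001)]
for a, b in zip(grid, grid[1:]):
    if h(a) * h(b) < 0:
        for _ in range(60):
            m = (a + b) / 2
            if h(a) * h(m) <= 0:
                b = m
            else:
                a = m
        candidates.append((a + b) / 2)

# Step 4: select the best of the candidates.
best_x = max(candidates, key=objective)
```

Since the number of interior sign changes is in general unbounded, the fixed scanning grid is precisely the heuristic compromise discussed above.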
We note that the problem with only two functions considered in this section may be transformed, by means of the substitution y = B − x, to a problem containing only one function in one variable. This renders an unconstrained optimisation problem in one variable. Although it is a simpler formulation, it does not shed any new light on solving for an optimal solution to (2).

Generalisation to more functions
In the generalisation to more functions the increase in the number of concave functions poses no real problem. If the concave part of the formulation is seen as a subproblem (when m = 0) it is a well researched problem for which efficient algorithms have been established [8]. The same does not apply for the convex functions. An increase in the number of convex functions increases the complexity of the problem considerably.
Consider problem (1). In general the second order necessary conditions state that for a feasible solution x* = (x*_1, . . ., x*_m, y*_1, . . ., y*_n) to be optimal, there must exist a λ ∈ R and µ_1, . . ., µ_m, ν_1, . . ., ν_m, η_1, . . ., η_n, ω_1, . . ., ω_n ≥ 0 for which

  x*_1 + · · · + x*_m + y*_1 + · · · + y*_n = B,            (11)
  f_i'(x*_i) + λ + µ_i − ν_i = 0,  i = 1, . . ., m,         (12)
  g_j'(y*_j) + λ + η_j − ω_j = 0,  j = 1, . . ., n,         (13)
  µ_i x*_i = 0,  i = 1, . . ., m,                           (14)
  ν_i (u_i^(x) − x*_i) = 0,  i = 1, . . ., m,               (15)
  η_j y*_j = 0,  j = 1, . . ., n,                           (16)
  ω_j (u_j^(y) − y*_j) = 0,  j = 1, . . ., n,               (17)

and for which

  L(x*) = diag(f''(x*_1), . . ., f''(x*_m), g''(y*_1), . . ., g''(y*_n))    (18)

is negative semi-definite on the tangent plane M determined by the active constraints. This implies that for (z_1, . . ., z_m, w_1, . . ., w_n) ∈ M,

  Σ_{i=1}^m z_i² f''(x*_i) + Σ_{j=1}^n w_j² g''(y*_j) ≤ 0.                  (19)

From (19) it follows that

  Σ_{i=1}^m z_i² f''(x*_i) ≤ Σ_{j=1}^n w_j² |g''(y*_j)|.

Let M_1 be the tangent plane on the equality constraint, M_2 the tangent plane on the active lower bound constraints and M_3 be the tangent plane on the active upper bound constraints. Then M = M_1 ∩ M_2 ∩ M_3, and thus for z_i, w_j ∈ M it follows that z_i = 0 and w_j = 0 for every x*_i and y*_j at one of its bounds, while Σ_{i=1}^m z_i + Σ_{j=1}^n w_j = 0. Theorem 1 may be generalised to more functions using this result. All the variables of a feasible solution (x*_1, . . ., x*_m, y*_1, . . ., y*_n) may be partitioned into three sets, namely

  L = {x*_i : x*_i = 0} ∪ {y*_j : y*_j = 0},
  K = {x*_i : 0 < x*_i < u_i^(x)} ∪ {y*_j : 0 < y*_j < u_j^(y)},
  U = {x*_i : x*_i = u_i^(x)} ∪ {y*_j : y*_j = u_j^(y)}.

Each entry of a feasible solution will be in exactly one of these three sets, because the set L contains all the variables that are at their lower bounds, the set U contains all the variables that are at their upper bounds and the set K contains all the variables neither at their upper nor at their lower bounds. If a feasible solution is represented in this fashion it follows from (11)-(17) that

  f_i'(x*_i) = g_j'(y*_j) = −λ  for all x*_i, y*_j ∈ K.

All the functions with their variables in the set K have exactly the same slope at a feasible solution for which the first order necessary conditions hold. We call this slope the consensus slope, S_c. The result above may be summarised as

  f_i'(x*_i) ≤ S_c and g_j'(y*_j) ≤ S_c  for all x*_i, y*_j ∈ L,
  f_i'(x*_i) = g_j'(y*_j) = S_c          for all x*_i, y*_j ∈ K, and
  f_i'(x*_i) ≥ S_c and g_j'(y*_j) ≥ S_c  for all x*_i, y*_j ∈ U

in terms of the consensus slope. There is another result that may be used to limit the search space in order to facilitate development of a possible algorithm to solve the problem formulation in (1). Two different proofs of the following result were presented by Venter and Wolvaardt [11], and by Visagie [12].
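The consensus-slope conditions can be checked mechanically for a given feasible solution. The following is a minimal sketch (function and variable names are illustrative, not from the paper): it partitions the variables into L, K and U by comparing each value with its bounds and then tests the slope conditions above; the case K = ∅ is skipped for simplicity.

```python
def check_consensus(xs, lbs, ubs, dfs, tol=1e-9):
    """Partition the variables into L (at lower bound), U (at upper bound)
    and K (strictly between), then test the first order conditions: all
    slopes in K equal a common consensus slope S_c, slopes in L are <= S_c
    and slopes in U are >= S_c.  dfs holds the first derivatives."""
    L, K, U = [], [], []
    for i, (x, lb, ub) in enumerate(zip(xs, lbs, ubs)):
        (L if abs(x - lb) <= tol else U if abs(ub - x) <= tol else K).append(i)
    if not K:          # no interior variable; any S_c between the extremes
        return True    # would do, so this simple sketch skips the check
    Sc = dfs[K[0]](xs[K[0]])
    return (all(abs(dfs[i](xs[i]) - Sc) <= tol for i in K)
            and all(dfs[i](xs[i]) <= Sc + tol for i in L)
            and all(dfs[i](xs[i]) >= Sc - tol for i in U))
```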

Theorem 2
If a feasible solution x* = (x*_1, . . ., x*_i, . . ., x*_m) is an optimal solution to (1) with n = 0, then at most one element of x* is in the set K.
The following generalisation of the above result holds for the problem formulation in (1).


Theorem 3
If a feasible solution x* = (x*_1, . . ., x*_m, y*_1, . . ., y*_n) is an optimal solution to (1), then at most one of the elements x*_1, . . ., x*_m associated with the convex functions is in the set K.
Proof: The formulation in (1) may be divided into a master problem and two subproblems.
The first subproblem (P_1) contains all the convex functions and the second subproblem (P_2) contains all the concave functions. The subproblem P_1 is given by

  maximise    Σ_{i=1}^m f_i(x_i)
  subject to  Σ_{i=1}^m x_i ≤ B_1,                          (23)
              0 ≤ x_i ≤ u_i^(x),  i = 1, 2, . . ., m,

and subproblem P_2 by

  maximise    Σ_{j=1}^n g_j(y_j)
  subject to  Σ_{j=1}^n y_j ≤ B_2,                          (24)
              0 ≤ y_j ≤ u_j^(y),  j = 1, 2, . . ., n.

The master problem is then given by

  maximise    P_1(B_1) + P_2(B_2)
  subject to  B_1 + B_2 ≤ B,  B_1, B_2 ≥ 0,                 (25)

where P_1(B_1) and P_2(B_2) denote the optimal objective function values of the subproblems for the budgets B_1 and B_2, respectively. If an optimal solution (x*_1, . . ., x*_m, y*_1, . . ., y*_n) to (25) exists, then (x*_1, . . ., x*_m) must be an optimal solution to (23) and (y*_1, . . ., y*_n) must be an optimal solution to (24). If this were not the case for one of the subproblems, then it would be possible to obtain a better objective function value for that subproblem with the same amount of resources. However, this improved solution could then have been used as is in the master problem, without any effect on the other subproblem. This would lead to a better solution for the master problem, which is not possible, because the original solution was already optimal. Hence the solutions (x*_1, . . ., x*_m) and (y*_1, . . ., y*_n) to both subproblems must be optimal. If the solution (x*_1, . . ., x*_m) is an optimal solution for (23), then Theorem 2 must hold for (23) and thus at most one of the convex function variables may be in K.
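The decomposition argument can be mirrored in code. In the sketch below (an illustrative instance with m = n = 1, so that each subproblem has a single variable and is solved in closed form because its function is increasing), the master problem simply searches over the split of the budget between the two subproblems; all names and data are assumptions for illustration.

```python
from math import sqrt

# Illustrative data: one convex and one concave increasing function.
f, ux = lambda x: x ** 2, 1.5        # convex part, upper bound on x
g, uy = lambda y: 2 * sqrt(y), 2.0   # concave part, upper bound on y
B = 2.0

def v1(b):   # optimal value of subproblem P1 given budget b:
    return f(min(b, ux))             # spend as much as allowed (f increasing)

def v2(b):   # optimal value of subproblem P2 given budget b
    return g(min(b, uy))

# Master problem: choose how the budget is split between the subproblems.
splits = [B * k / 1000 for k in range(1001)]
best_split = max(splits, key=lambda b: v1(b) + v2(B - b))
best_value = v1(best_split) + v2(B - best_split)
```

For larger m and n the trivial closed-form solvers would be replaced by proper subproblem algorithms, e.g. an efficient concave resource allocation method for P_2.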
Theorem 1 may now also be generalised.

Theorem 4
Let (x*_1, . . ., x*_m, y*_1, . . ., y*_n) be a feasible solution to (1) for which

A  f_i'(x*_i) ≤ S_c and g_j'(y*_j) ≤ S_c for all x*_i, y*_j ∈ L,
B  f_i'(x*_i) = g_j'(y*_j) = S_c for at most one x*_i ∈ K and for all y*_j ∈ K, and, in addition, if there is one x*_i ∈ K, then |K_{f_i}(x*_i)| < Σ_{j∈K} |K_{g_j}(y*_j)|, and
C  f_i'(x*_i) ≥ S_c and g_j'(y*_j) ≥ S_c for all x*_i, y*_j ∈ U

hold. Then the first and second order necessary conditions for (x*_1, . . ., x*_m, y*_1, . . ., y*_n) to be an optimal solution hold. Thus all solutions (x*_1, . . ., x*_m, y*_1, . . ., y*_n) having either one of the two properties described below must be determined in order to find a globally optimal solution.
1. If there is exactly one x*_i ∈ K, say x*_k (and thus f_k'(x*_k) = S_c), then S_c = g_j'(y*_j) and f_k''(x*_k) < Σ_{j∈K} |g_j''(y*_j)| for all y*_j ∈ K, f_i'(x*_i) ≤ S_c and g_j'(y*_j) ≤ S_c for all x*_i, y*_j ∈ L, and f_i'(x*_i) ≥ S_c and g_j'(y*_j) ≥ S_c for all x*_i, y*_j ∈ U; or else
2. if there is no x*_i ∈ K, then S_c = g_j'(y*_j) for all y*_j ∈ K, f_i'(x*_i) ≤ S_c and g_j'(y*_j) ≤ S_c for all x*_i, y*_j ∈ L, and f_i'(x*_i) ≥ S_c and g_j'(y*_j) ≥ S_c for all x*_i, y*_j ∈ U.
With an increase in the number of convex functions the situation becomes worse than the situation with only one convex function, even though there may be at most one x_i ∈ K, because the chances increase that there is more than one point for which f_i'(x_i) = g_j'(y_j) = S_c holds for all x_i, y_j ∈ K. For more than one such point to exist there must be at least two feasible solutions for which Σ_{i=1}^m x_i + Σ_{j=1}^n y_j = B, f_i'(x_i) = g_j'(y_j) for all x_i, y_j ∈ K, and Σ_{i∈K} z_i² f_i''(x_i) ≤ Σ_{j∈K} w_j² |g_j''(y_j)| for all z_i, w_j ∈ M hold. This implies that there should be at least one feasible solution for which

  Σ_{i∈K} z_i² f_i''(x_i) = Σ_{j∈K} w_j² |g_j''(y_j)|,      (26)

with at most one z_i ≠ 0. An increase in the number of convex functions translates to an increase in the number of choices on the left hand side of (26) to ensure that the equality holds.

Numerical results
For some types of problems (such as certain bin packing problems) standard sets of problem instances exist which may be used by researchers to measure the performance of new algorithms. Unfortunately, no such benchmark sets exist for the problem considered in this paper and therefore instances needed to be generated for the purpose of testing the algorithm described below. Eight families of increasing convex functions and increasing concave functions were used to create these instances as described in the following sections.

A possible solution approach
It would seem a logical approach to design an algorithm for solving (1) that repeats itself m times. During each of these m repetitions a different convex function is considered as the convex function whose decision variable may lie in K. As a first attempt one may approximate the remaining m − 1 convex functions by enveloping them by means of straight lines. This ensures that these envelopes are indeed concave as well. Such an approach therefore has the advantage of considering all but one function as concave functions, and then using the available, efficient algorithms to solve the problem in the case where there are only concave functions. The problem with this approach is that the solution will almost always include an enveloped convex function (i.e. a straight line) that assumes a value between its bounds at an optimal solution candidate, which is, of course, disallowed. To avoid this, the algorithm may be altered to incorporate a binary branch and bound algorithm.
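The linear enveloping of a convex piece can be illustrated directly: for a convex function, the chord (secant line) between the endpoint values lies on or above the function, so replacing the function by its chord yields a concave (indeed linear) over-estimator on the interval. A minimal sketch, with illustrative names:

```python
def linear_envelope(f, lo, hi):
    """Secant line through (lo, f(lo)) and (hi, f(hi)); for a convex f this
    chord lies on or above f on [lo, hi], giving a linear over-estimator."""
    slope = (f(hi) - f(lo)) / (hi - lo)
    return lambda x: f(lo) + slope * (x - lo)

env = linear_envelope(lambda x: x ** 2, 0.0, 2.0)  # chord of x^2 on [0, 2]
```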
In other words, as soon as the algorithm solves the problem and an enveloped function assumes a value between its bounds, the problem is branched into two subproblems. In the first subproblem this variable is set at its lower bound and in the second subproblem the variable is set at its upper bound. This implies solving a mixed 0/1 knapsack problem by a branch and bound technique. An algorithm based on the logic of the Horowitz-Sahni algorithm, see for example [5], may be used to solve this mixed 0/1 knapsack problem.
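The pure 0/1 core of such a branch and bound scheme can be sketched as follows; this is a depth-first method in the spirit of the Horowitz-Sahni scheme (the mixed variant discussed above additionally carries continuous parts, which are omitted here), assuming positive values and weights.

```python
def knapsack_bb(values, weights, capacity):
    """Depth-first branch and bound for the 0/1 knapsack problem: items are
    sorted by value/weight and the fractional (LP) bound prunes the search.
    Assumes positive values and weights."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    v = [values[i] for i in order]
    w = [weights[i] for i in order]
    n, best = len(v), 0

    def bound(k, cap, val):
        # Upper bound: fill greedily with remaining items, last one fractionally.
        for i in range(k, n):
            if w[i] <= cap:
                cap, val = cap - w[i], val + v[i]
            else:
                return val + v[i] * cap / w[i]
        return val

    def dfs(k, cap, val):
        nonlocal best
        best = max(best, val)
        if k == n or bound(k, cap, val) <= best:
            return
        if w[k] <= cap:                   # branch: take item k ...
            dfs(k + 1, cap - w[k], val + v[k])
        dfs(k + 1, cap, val)              # ... or skip it

    dfs(0, capacity, 0)
    return best
```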
The solution times are listed in Table 1. The table compares the solution times of the proposed algorithm with those of piecewise linear enveloping combined with the branch and bound technique, which also yields optimal solutions, see for example [1]. These solution times were recorded on a Pentium(R) 4 computer with a 3.2 GHz central processing unit and 1 GB random access memory. A hundred random instances of the problem were generated and solved for each reported value of n. It is clear from the data in Table 1 that piecewise linear enveloping, with the branch and bound technique, outperforms the algorithm described above when both are used to solve the same problem sets. Because of the poor solution times achieved by the proposed algorithm it is not presented in detail here.
Dynamic programming was used to validate the algorithmic results presented here.
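The validation idea can be sketched as a dynamic program over a discretised budget; the instance and discretisation step below are assumptions for illustration, so the result is accurate only up to the grid resolution.

```python
def dp_allocate(funcs, ubs, B, step=0.1):
    """Dynamic program over a discretised budget: process the functions one
    at a time and record, for each number of budget units, the best
    achievable objective value.  Assumes lower bounds of zero, as in (1)."""
    units = int(round(B / step))
    dp = [0.0] * (units + 1)                 # best value over functions so far
    for f, ub in zip(funcs, ubs):
        cap = min(units, int(round(ub / step)))   # units this variable may use
        dp = [max(dp[b - a] + f(a * step) for a in range(min(b, cap) + 1))
              for b in range(units + 1)]
    return max(dp)                           # budget need not be fully spent

# Illustrative instance: allocate B = 2 between x^2 (x <= 1.5) and 2y (y <= 2);
# the best allocation puts the whole budget into the second function.
val = dp_allocate([lambda x: x ** 2, lambda y: 2 * y], [1.5, 2.0], 2.0)
```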

Conclusions and ideas for further research
The main reason why the analytic, continuous approach to the problem formulation in (1) is not successful is that there are potentially many feasible solutions for which the necessary conditions hold. In fact, it may be shown that there may be arbitrarily many feasible solutions to (1) for which the conditions in Theorem 1 hold. This renders a brute force search for all these feasible solutions impractical. This conclusion remains the same if the results of Theorem 3 are used to supply additional necessary conditions in a bid to limit the search space for all feasible solutions for which the necessary conditions hold.
The possibility of constructing a heuristic that searches for only the first N points for which the necessary condition holds may be investigated further, for some specified value of N .
An analytical, continuous approach does not compete favourably with linear enveloping in conjunction with branch and bound algorithms. Research into improving the manner of performing piecewise linear enveloping, as well as into how the branching should be performed, may nevertheless prove worthwhile.


Figure 1: Schematic representation of the cases in Theorem 1.

Table 1: Total solution times in seconds for 100 randomly generated instances of the problem for each reported value of n. The following abbreviations are used: NBB = normal branch and bound technique with piecewise linear enveloping; MBB = mixed 0/1 branch and bound algorithm.