ON SOME OPTIMAL SWITCHING STRATEGIES

Systems of components with increasing failure rates are considered. Switching from an operating component to a standby one can be performed at any instant of time. Optimal switching strategies, maximizing the time to the first failure of a component and to the total failure of a system, are investigated. A new type of strategy, the limit strategy, is introduced. It is proved that this strategy is optimal when there is no additional information on the state of the system. Some simple examples are considered.

The distribution function (DF) of a component's time to failure, $F(t)$, is absolutely continuous and the corresponding failure rate $\lambda(t)$ is increasing. Thus, $F(t) \in$ IFR (increasing failure rate) [1]. The system is nonrepairable. The first component starts operating at $t = 0$, while the second one is switched off ("cold" standby). After a failure of the first component, the second starts operating immediately. The foregoing defines a usual strategy of switching upon failure.
Assume now that switching from the operating component to the standby can be performed instantaneously at any moment of time. The former operating component then becomes the standby and vice versa, thus defining a preventive switching strategy. As will be shown, this strategy can be very effective in various situations when we want to decrease the probability of a "forced" switching upon a component's failure. A practical example is a system with an "unreliable" switching device. If, in this case, a component fails while the switching device is in the failed state, then the whole system fails. Preventive switching can decrease the probability of this event.
Therefore it is worth finding a switching strategy that, under the given assumptions, maximizes the probability of the system performing without any failure of components in a given time interval. Of course, a strategy of this kind cannot change the probability of the system's failure (when both components have failed), but extending the operation period without failures of components can in itself be very important in various applications.

For a fixed interval $(0, t]$, a single switch at the midpoint is optimal. This simply follows from the fact that, assuming only one switch at time $a \in (0, t)$, the probability of no component failures in $(0, t]$ is

$$\exp\left\{-\int_0^a \lambda(u)\,du - \int_0^{t-a} \lambda(u)\,du\right\}.$$

In order to maximise this probability, we need

$$\int_0^a \lambda(u)\,du + \int_0^{t-a} \lambda(u)\,du \to \min_a. \qquad (2)$$

After differentiating the sum of integrals in (2) and equating the result to zero, we arrive at

$$\lambda(a) = \lambda(t - a). \qquad (3)$$

Equation (3) has a trivial unique solution for increasing functions, namely $a = t/2$. Due to the additivity of the integral, an additional switching cannot improve this result: the total functioning time of each component should be scheduled as $t/2$.
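As a numerical illustration of the single-switch argument, the following minimal Python sketch assumes a Weibull cumulative hazard $(x/\text{scale})^{\text{shape}}$ with shape greater than 1 as one example of an increasing failure rate. The Weibull choice, the grid search and all function names are illustrative assumptions and not part of the original analysis. The sketch evaluates the probability of no component failure in $(0, t]$ for a single switch at $a$ and locates the best $a$ on a grid.

```python
import math


def cumulative_hazard(x, shape=2.0, scale=1.0):
    """Weibull cumulative hazard (x/scale)**shape; IFR when shape > 1 (assumed example)."""
    return (x / scale) ** shape


def no_failure_probability(a, t, shape=2.0, scale=1.0):
    """P(no component failure in (0, t]) with a single switch at time a.

    The first component works on (0, a] and the second on (a, t],
    so their operating ages at time t are a and t - a.
    """
    total = cumulative_hazard(a, shape, scale) + cumulative_hazard(t - a, shape, scale)
    return math.exp(-total)


def best_switch_time(t, steps=1000, shape=2.0, scale=1.0):
    """Grid search over a in [0, t] for the switch time with the largest probability."""
    grid = [t * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda a: no_failure_probability(a, t, shape, scale))


if __name__ == "__main__":
    t = 1.0
    a_star = best_switch_time(t)
    # Compare the best single switch with no preventive switching at all (a = 0).
    print(a_star, no_failure_probability(a_star, t), no_failure_probability(0.0, t))
```

Under these assumptions the grid maximum falls at the midpoint $a = t/2$, in line with the solution of (3).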
Denote by $S_l$, where "l" stands for "limit", the following strategy.
and the same inequalities are valid for the limit case when $\Delta t \to 0$. Moreover, it can easily be seen from definitions (4) and (5) that as which leads to the anticipated weak convergence result: Hence, the time to the first failure is stochastically larger [4,5] under the optimal strategy than under any other.
Another situation in which only $S_l$ is optimal in the above sense is when the operation of our system of two components can be terminated by some external random event, for instance by the failure of some device in series with the system. It is clear that $S_l$ maximizes the probability of the system performing without failures of components (and the time to the first failure as well) in the presence of this random termination.
The strategy $S_l$ is, of course, a mathematical idealization. It is obvious that in practice $\Delta t$ cannot be very small because each preventive switching requires some effort. The switching device can also be unreliable, but, unlike the switching upon failure, the state of this device can be checked prior to the preventive switching. Thus, a practical realization of $S_l$ can be formulated in the following way: perform switchings as often as reasonable. Given the corresponding costs and rewards, the problem of obtaining some optimal $\Delta t_{op}$ can be considered, but this is a topic for future study.
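The effect of a finite switching period can be sketched numerically. The Python fragment below assumes that the two components alternate every $\Delta t$ time units, with the first one starting at time 0, and again uses a Weibull-type cumulative hazard $x^{\text{shape}}$ as an illustrative IFR example. The function names and the reading of the limit as "each component has operated for $t/2$" are assumptions made for this sketch. It compares the accumulated hazard at time $t$ under periodic switching with the value obtained when both components have age exactly $t/2$.

```python
def cumulative_hazard(x, shape=2.0):
    """Weibull-type cumulative hazard x**shape; IFR when shape > 1 (assumed example)."""
    return x ** shape


def operating_ages(t, dt):
    """Operating ages of the two components at time t under alternation every dt.

    Component 1 works in periods 0, 2, 4, ... and component 2 in periods 1, 3, ...
    """
    full_periods = int(t // dt)
    remainder = t - full_periods * dt
    age1 = ((full_periods + 1) // 2) * dt
    age2 = (full_periods // 2) * dt
    if full_periods % 2 == 0:
        age1 += remainder  # component 1 is operating in the current period
    else:
        age2 += remainder  # component 2 is operating in the current period
    return age1, age2


def hazard_with_period(t, dt, shape=2.0):
    """Sum of both components' cumulative hazards at time t for switching period dt."""
    age1, age2 = operating_ages(t, dt)
    return cumulative_hazard(age1, shape) + cumulative_hazard(age2, shape)


def limit_hazard(t, shape=2.0):
    """Accumulated hazard when both components have operated for exactly t/2."""
    return 2.0 * cumulative_hazard(t / 2.0, shape)


if __name__ == "__main__":
    t = 1.0
    for dt in (0.3, 0.03, 0.003):
        print(dt, hazard_with_period(t, dt), limit_hazard(t))
```

In this example the accumulated hazard for a finite period never falls below the limiting value and approaches it as the period shrinks, which is the sense in which frequent switching approximates $S_l$ at every instant rather than only at a prescribed $t$.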

SOME GENERALISATIONS
In the same manner a system of $n$ identical components (one operating and $n - 1$ on standby) can be analyzed. The strategy that maximizes the time to the first failure is the one in which $n - 1$ equidistant switchings are scheduled. Then, using the same approach, $P_2(S, t_1, t)$, the probability of a system of $n - 1$ components performing without failures in $(t_1, t]$, where $t_1$ is the time of the first failure, can be maximized, etc. This results in an optimal strategy $S$ at each step that can easily be derived. The corresponding $l$-strategy is obviously characterized by the following sequence of DFs of times between consecutive failures of the system's components: is "the recalculated starting age" of the integrated system immediately after the second failure of a component, etc. (we are excluding the operating times of the failed components). Eventually a concept close to that of minimal repair is reached [2].
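The claim about equidistant switchings for $n$ components can be checked with a small experiment. The sketch below assumes, as an illustration only, a Weibull-type cumulative hazard $x^{\text{shape}}$ with shape greater than 1, and compares the equal allocation of operating time $t/n$ per component with randomly generated allocations that also sum to $t$. All names are hypothetical helpers introduced for this example.

```python
import random


def total_hazard(ages, shape=2.0):
    """Sum of cumulative hazards x**shape over all components (assumed IFR example)."""
    return sum(a ** shape for a in ages)


def equal_split(t, n):
    """Operating times produced by n - 1 equidistant switchings on (0, t]."""
    return [t / n] * n


def random_split(t, n, rng):
    """A random allocation of the interval (0, t] among n components."""
    cuts = sorted(rng.uniform(0.0, t) for _ in range(n - 1))
    points = [0.0] + cuts + [t]
    return [b - a for a, b in zip(points, points[1:])]


if __name__ == "__main__":
    rng = random.Random(0)
    t, n = 1.0, 4
    best = total_hazard(equal_split(t, n))
    worst_gap = min(total_hazard(random_split(t, n, rng)) - best for _ in range(1000))
    # A non-negative gap means no random allocation beat the equal one.
    print(best, worst_gap)
```

Since a smaller total hazard means a larger probability of no failure in $(0, t]$, a non-negative gap in every trial is consistent with the optimality of the equal allocation for this hazard.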
It should be noted that this procedure is optimal in the above sense only up to and including

2.1
Consider now the following system of three identical non-repairable components with increasing failure rates. Two components in series start functioning at $t = 0$, while the third one is in the "cold" standby state. After the failure of an operating component, the standby starts operating immediately. The switching from any operating component to the standby can be performed instantaneously at any moment in time. The problem is to find the switching strategy that, under the given assumptions, maximizes the probability of the system performing without failure in a given time interval. It is important to note that, in contrast to the setting of Section 1, we are now looking at the total failure of the system: while the last two components are operating (there is no remaining standby component), one of them fails.
and finally to the simultaneous equations This system has an obvious unique solution (since The corresponding $S_l$ for this case is defined in the following way. Let the first and the second and finally: This result is intuitively obvious, because, given $a_3$, we can only define an optimal strategy for the first and second components in $(0, x]$, which leads to (14). In reality, $x$ (or $a_3$) is random and as a result only the $l$-strategy is optimal for the two components over all realizations. What is more, we have arbitrarily assumed that the third component failed first, but it could have been any of them. Hence, for the three components starting at $t = 0$ and operating up to the first component failure, only $S_l$ can "service" all situations, being the unique optimal strategy for the case under consideration. It is worthwhile mentioning that, as a result of (12), $S_l$ is trivially optimal for realizations which have no component failures in $(0, t]$. Actually, this strategy performs the following operation with our system of increasing failure rate components: at every instant in time it asymptotically ($\Delta t \to 0$) tends to "place" the best components in the operational state, while after the system's failure the remaining nonfailed component is more "worn-out" under this strategy than under any other. The best component is defined as the one having the minimum current failure rate. Thus the principle "the best component should function first" is realized. We shall come back to discussing this principle while considering some examples.
The above approach can easily be generalized to a system of $k$ identical components (with increasing failure rates) in series, which starts operating with $n$ standby components. The corresponding $l$-strategy before the first failure is defined by , the failure rate of the "integrated component", and then relations similar to (10) can be used. This strategy is optimal, maximizing the probability of failure-free performance of the system in a given interval of time as well as the mean time to failure.

3.1
Consider first, as in Section 1, a system of two identical components (main and standby) with exponential DF:

$$F(t) = 1 - \exp\{-\lambda t\}. \qquad (15)$$

Assume that each component of this system can be instantly repaired upon failure, but the number of repairs is bounded by $m \ge 0$. A situation of this kind often occurs in practice.
The "total failure" of each component occurs when $m$ repairs have already been performed and the component fails again. Thus, the notions of "failure" and "total failure" differ. The DF of time to total failure follows the Erlangian pattern:

$$F_m(t) = 1 - \exp\{-\lambda t\} \sum_{i=0}^{m} \frac{(\lambda t)^i}{i!}, \qquad (16)$$

with increasing failure rate [3]:

$$\lambda_m(t) = \frac{\lambda (\lambda t)^m / m!}{\sum_{i=0}^{m} (\lambda t)^i / i!}.$$

What strategy should be used to maximize the probability of the system functioning without total failure of any component in the given interval of time $(0, t]$? In Section 1 this probability was denoted by $P_1(S, t)$.
A formal answer, based on the previous results, is that switching only at $a = t/2$ (and using the $S_l$ strategy for maximizing $T_1$) should be used for this purpose. Previously the distribution $F(t)$ was a so-called "black box" DF [2], while now (16) has a concrete form. But the main fact is that now we are able to perform a dynamic strategy of switching, based on the information at hand. This information is just the number of repairs left for each component at any instant of time. From simple probabilistic reasoning it follows that an optimal strategy would be any strategy $S_{opt}$ that leaves the remaining component in the state with no repairs left after the total failure of the other component. Indeed, it was stated earlier that for this case the switching could not change the probability of the system performing without failure in $(0, t]$ (the failure of the system is defined as the total failure of both components). In other words: the total number of cycles till the system's failure is $2m + 2$, where each cycle has DF (15). Thus, for any optimal strategy of the described type: (and $S_l$) is worse than the dynamic optimal strategy based on information. The simplest $S_{opt}$ for the case under consideration is the following strategy: the first component starts operating and is replaced by the standby only after the $m$th failure (it is instantly repaired and there are no more repairs left). The former standby operates till its total failure. It is clear that this strategy maximizes $T_1$ as well.
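The cycle counting behind the simplest $S_{opt}$ can be made explicit with a short sketch. The Python fragment below is an illustration under stated assumptions: each component starts with $m + 1$ exponential cycles ("lives"), the strategy label `"upon_failure"` means the standby is used only after a total failure, and `"opt"` means the switch described above. The function names and strategy labels are hypothetical and introduced only for this example.

```python
def cycles_until_first_total_failure(m, strategy):
    """Number of exponential cycles elapsed when the first total failure occurs.

    strategy = "upon_failure": switch only when the operating component fails totally.
    strategy = "opt": switch from the first component right after its m-th failure,
                      when it has been repaired and has no repairs left.
    """
    lives = [m + 1, m + 1]  # cycles each component can still complete
    operating = 0
    cycles = 0
    while True:
        cycles += 1
        lives[operating] -= 1  # the operating component fails at the end of the cycle
        if lives[operating] == 0:
            return cycles  # no repairs were left: this is a total failure
        if strategy == "opt" and operating == 0 and lives[0] == 1:
            operating = 1  # preventive switch to the standby


def mean_time_to_first_total_failure(m, lam, strategy):
    """Each cycle is exponential with mean 1/lam, so the mean is cycles / lam."""
    return cycles_until_first_total_failure(m, strategy) / lam


if __name__ == "__main__":
    for m in (0, 1, 3):
        print(m,
              cycles_until_first_total_failure(m, "upon_failure"),
              cycles_until_first_total_failure(m, "opt"))
```

In this count the first total failure arrives after $m + 1$ cycles when switching only upon total failure and after $2m + 1$ cycles under the switch described above, leaving exactly one of the $2m + 2$ cycles for the remaining component.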

3.2
Consider now the system of three identical components described in Section 2, each component satisfying assumptions (15) and (16), two of which are in series and one on standby. We are looking at the total failure of the system: this occurs when two components are operating (there is no remaining standby component) and one of them fails and has no more repairs left. It was proved that the $S_l$ strategy of Section 2 is an optimal black box strategy for this case as well (since $\lambda(t)$ is increasing). We shall define an optimal information-based strategy of switching for the described system. It is clear that the goal of this strategy is to decrease $r$, the random number of repairs left ($0 \le r \le m$) for the remaining component after the system has failed. Because the total number of possible repairs is fixed ($3m$), the random number of cycles $n$ (a cycle is defined as the period between successive failures governed by an exponential DF with failure rate $2\lambda$) will be maximal in some sense and this will lead to the desired result. Indeed, it is clear that
Denote by $r_o$ and $r_a$ the number of repairs left for the remaining component under the optimal (to be defined) strategy and under some arbitrary strategy, respectively. Define an optimal strategy $S_{opt}$ as the strategy in accordance with which the best component should function first. The best component is defined as the one having the most repairs left. Thus, under the optimal strategy, each time an operating component fails, it is replaced by the standby if the standby's remaining number of repairs exceeds the number that remains for the operating component after its failure (and instantaneous repair). It is clear that this strategy achieves the following inequality: $r_o \le r_a$ (19), which should be understood [4,5] in terms of stochastic ordering (stochastically smaller), i.e.: As previously, inequality (20) means that the expected time to failure of the system under $S_{opt}$ is larger (not smaller) than under the other strategy. Hence, an optimal information-based strategy is better than the formal optimal black box strategy $S_l$, defined in Section 2.
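The "best component should function first" rule can be explored with a small Monte Carlo sketch. The Python code below is an illustration under explicit assumptions: each component starts with $m + 1$ lives (repairs left plus one), each failure hits one of the two operating components with probability 1/2 (which follows from identical exponential cycles), and the comparison strategy uses the standby only after a total failure. The function names, the seed and the choice of comparison strategy are hypothetical and serve only this example.

```python
import random


def cycles_to_system_failure(m, best_first, rng):
    """Simulate one realization and return n, the number of cycles until system failure."""
    operating = [m + 1, m + 1]  # remaining lives of the two components in series
    standby = m + 1             # remaining lives of the component on standby
    has_standby = True
    cycles = 0
    while True:
        cycles += 1
        i = rng.randrange(2)    # either operating component fails with probability 1/2
        operating[i] -= 1
        if operating[i] == 0:
            if not has_standby:
                return cycles   # total failure with no standby left: the system fails
            operating[i] = standby  # the standby replaces the totally failed component
            has_standby = False
        elif best_first and has_standby and standby > operating[i]:
            # Preventive swap: the standby has more repairs left than the failed one.
            operating[i], standby = standby, operating[i]


def mean_cycles(m, best_first, trials=20000, seed=0):
    """Monte Carlo estimate of E[n]; the mean time to failure is E[n] / (2 * lam)."""
    rng = random.Random(seed)
    return sum(cycles_to_system_failure(m, best_first, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, mean_cycles(m, best_first=False), mean_cycles(m, best_first=True))
```

Because each cycle has mean duration $1/(2\lambda)$, a larger average number of cycles under the swap rule corresponds to a larger expected time to system failure, which is the direction of the comparison stated above.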

3.3
What is the possible interpretation of the $S_l$ black box principle (the components with lower failure rates should function first) in the situation with information at hand? We shall illustrate this using the simple example given in Section 3.1. There should be no switching at all before the first failure of the operating component, because, due to the exponential DF (15), the components are statistically identical at any instant of time prior to the first failure. Hence, after the first failure we replace the failed component by the standby one, which, due to (22), will have a smaller conditional failure rate than the failed component; after its first failure both components will have equal conditional failure rates, etc. It is clear that this strategy is optimal in the sense of equation (17). Actually, it can be defined as the rule "the best should function first", which was described in Section 3.2. The point is that $S_l$ implements this principle statistically, without additional information (and that is why it is only suboptimal), while the optimal dynamic strategy is information based. The heuristic considerations presented in Sections 3.2 and 3.3 are intuitively evident (including inequality (19)), but need further mathematical justification in the future.

the $(n-1)$th failure. In other words: the optimal strategy increases the times to the first, second, …, $(n-1)$th component failures by decreasing the time between the $(n-1)$th and the last component failure.
follows from the integral representation of the failure rate for the Gamma (Erlangian) DF [3]: