DISTRIBUTION-FREE APPROACH TO THE EVALUATION OF RELIABILITY OF COMPLEX SYSTEMS

In the industrial community it is well known that the failure rates of manufactured units vary with time due to a variety of causes, namely engineering design, manufacturing process, maintenance and quality inspection procedures, and various assignable and non-assignable factors. Such failure rates invariably exhibit changes in both level and slope and at times exhibit periodic patterns as well. It would therefore be inappropriate and erroneous to analyze such stochastic series of observations using the usual failure distribution approach. Since such data can be construed as time series, we suggest in this paper the use of time series techniques, including the Kalman filter, for their analysis. A further advantage of these techniques is that periodicities, if any, can be taken into account and short-term forecasts can be made, which would not otherwise have been possible.


INTRODUCTION
In the present age of fast-moving technological development, all industries, institutions and organizations use highly complex systems, and it is natural for them to expect efficiency, quality and reliability from these systems. In practice, however, systems do break down earlier than expected, and sometimes complete failure occurs. The usual measures of breakdown used to assess the quality and reliability of systems are the time-to-failure, the failure rate and the time-between-failures. The data are collected and analyzed using the techniques of mathematical reliability theory developed over the last few decades. The majority of these techniques have been developed on the assumption that the failure data arise from certain probability distributions, such as the exponential, Weibull, normal, lognormal and gamma failure laws.
The most popular of these laws is the exponential failure law, mainly for the following reasons:
1) It does represent some failure data.
2) It has only one parameter and the failure rate is constant.
3) Under the assumption of this failure law, the results can be derived easily and in closed form.
4) The other important factor is that the maximum likelihood estimate of the parameter is associated with the chi-square distribution and this facilitates the testing of hypotheses.
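Reason 4 can be made concrete with a standard result, stated here as an illustration and under assumptions not spelled out in the text: a complete sample of $n$ independent exponential failure times $t_1, \dots, t_n$ with constant failure rate $\lambda$. The symbols $n$, $T$ and $\alpha$ are introduced for this sketch only, and $\chi^2_{2n,\,p}$ denotes the $p$-quantile of the chi-square distribution with $2n$ degrees of freedom.

```latex
\hat{\lambda} = \frac{n}{T}, \qquad T = \sum_{i=1}^{n} t_i, \qquad 2\lambda T \sim \chi^2_{2n},
\qquad
\frac{\chi^2_{2n,\,\alpha/2}}{2T} \;\le\; \lambda \;\le\; \frac{\chi^2_{2n,\,1-\alpha/2}}{2T}
\quad \text{with probability } 1-\alpha .
```

The pivotal quantity $2\lambda T$ is what makes tests and confidence intervals for $\lambda$ available in closed form under the exponential law.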
In the case of other failure laws, the results are not available in closed form and are not easily tractable. Their hazard functions or failure rates can be increasing or decreasing, depending upon the values of their parameters, and that is one of the reasons they fit most real-life data. Sometimes two or more of these distributions fit the same data very well, making it difficult to choose among them.
Researchers in reliability theory often consider complex configurations and derive difficult analytical expressions but, because of the above difficulties, end up using the constant failure rate (CFR) of the exponential law, which considerably simplifies the final result, without considering whether other distributions might fit their data better (Yadavalli and Hines [17]).
As mentioned in the abstract, due to the complex nature of operations, maintenance and inspection procedures and various other assignable and non-assignable causes, the failure rates are not only time-dependent but often subject to random fluctuation. In such cases the traditional failure distribution approach is not only inappropriate but also misleading and erroneous. For this reason the authors suggest in this paper a non-traditional approach based on time series techniques and the Kalman filter.
The plan of the paper is as follows: Some acronyms and notation are given in Section 2. More complex reliability systems are discussed in Section 3. Section 4 briefly reviews the ARIMA models with a few theorems for later use. The series given in Table 4.1 are analyzed and time series models are fitted, with a few forecasts in each case, in Section 4. The Kalman filter approach and a numerical example are discussed in Section 5. Thereafter a list of references is provided.

RELIABILITY FUNCTIONS FOR COMPLEX SYSTEMS
On scanning the literature on reliability (Barlow [1] and Gnedenko et al. [5]), one finds that the main characteristics, such as reliability, availability and MTTF, have been obtained for complex systems. The main algebraic vehicle for arriving at the desired results has been the development of difference-differential equations under certain assumptions. Solutions of these equations have been obtained using Laplace transforms. In some cases Boolean functions (or structure functions), software reliability models and convolutions have been used. (See Figure 3.1.) The corresponding reliabilities are defined in the same box.

Figure 3.1
Using the Boolean function technique, the authors have obtained the reliability $R_S$ of the whole system for the following three cases:
(i) For the simplest case, i.e. ..., the reliability of the whole system is ... at time point t (t is omitted for brevity and typographical convenience).

Example 3.2 (N-version programming in software)
Hishitani et al. [7] have discussed the problem of reliability assessment for a software system.
Multiversion programming in software reliability is equivalent to hardware redundancy in system reliability. In other words, N-version programming is a realization of the parallel configuration in software. Considering a 3-version program system, the authors defined:
(i) 2 out of 3 versions are required for the software system to function properly.

(ii) $F_i(t)$ = the probability that version i will cause failure of the system in the interval (0,t], i = 1, 2, 3.
$1 - F_i(t)$ = the probability that version i functions properly in the interval (0,t] without causing a failure of the system.
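As an illustration of the 2-out-of-3 requirement defined above, the following minimal sketch computes a system reliability under two assumptions that are labeled here because they are not confirmed by the text: the three versions fail independently, and each version has an exponential failure time. The function names, the failure rates and the mission time are hypothetical, and the formula is the standard 2-out-of-3 expression for independent components, not necessarily the one derived by Hishitani et al. [7].

```python
import math


def version_reliability(rate, t):
    """1 - F_i(t) for an exponential failure time with the given rate."""
    return math.exp(-rate * t)


def two_out_of_three(r1, r2, r3):
    """Probability that at least 2 of 3 independent versions function.

    Sum of the three pairwise products minus twice the triple product,
    which removes the double counting of the 'all three work' event.
    """
    return r1 * r2 + r1 * r3 + r2 * r3 - 2.0 * r1 * r2 * r3


# Hypothetical failure rates (per hour) and mission time; not from the source.
rates = (0.001, 0.002, 0.003)
t = 100.0

r = [version_reliability(lam, t) for lam in rates]
system_reliability = two_out_of_three(*r)
print(f"version reliabilities: {r}")
print(f"2-out-of-3 system reliability: {system_reliability:.6f}")
```

If the versions are not independent, for example because they share a specification error, this expression no longer applies and a dependence model would be needed.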
The reliability of the system is then given by ... Assuming that the failure times for the i-th version are negative exponential with parameters ...
and (b) if the hypothesis about CFR is rejected, one will have to look for another failure law such as the Weibull, lognormal, normal or gamma. But the choice of any of these is again fraught with difficulties since
(i) they involve more than one parameter (see Lawless [10], Nelson [11]).
(ii) the estimation of parameters and the testing of hypotheses may be tedious and involve iteration procedures.
(iii) they have increasing or decreasing failure rates depending on the values of the parameters.
(iv) it may be difficult at times to distinguish between any two or more of them since they all might appear to fit the data equally well.
(v) the final results may be complicated due to interactions.
In view of these difficulties we suggest in this paper resorting to time series techniques, including the Kalman filter (KF). In the following section we give a review of ARMA models as well as examples involving failure rates and the sums and products of two failure rates.

A SHORT REVIEW
From the ACF and PACF in Figure 4.1, it turns out that an AR(2) model defined by $Z_t = c + \phi_1 Z_{t-1} + \phi_2 Z_{t-2} + a_t$ can be fitted to the series $\{Z_t\}$, where $c = 3.257307$, $\phi_1 = 0.12659$ and $\phi_2 = 0.50824$.
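To show how a fitted AR(2) model of this kind produces short-term forecasts, the sketch below applies the usual recursion with the coefficient values quoted above. It assumes the conventional AR(2) form with a constant term and zero-mean noise; the two "last observed" failure rates are hypothetical placeholders, since the data of Table 4.1 are not reproduced here, and the function names are illustrative only.

```python
# Fitted values quoted in the text.
C, PHI1, PHI2 = 3.257307, 0.12659, 0.50824


def is_stationary(phi1, phi2):
    """Standard stationarity conditions for an AR(2) process."""
    return abs(phi2) < 1 and phi1 + phi2 < 1 and phi2 - phi1 < 1


def process_mean(c, phi1, phi2):
    """Long-run mean of a stationary AR(2) with constant term c."""
    return c / (1.0 - phi1 - phi2)


def forecast(last_two, steps, c=C, phi1=PHI1, phi2=PHI2):
    """Iterate Z_hat(t+1) = c + phi1 * Z(t) + phi2 * Z(t-1).

    last_two is (Z(t-1), Z(t)); future noise terms are replaced by
    their expected value, zero.
    """
    z_prev, z_curr = last_two
    out = []
    for _ in range(steps):
        z_next = c + phi1 * z_curr + phi2 * z_prev
        out.append(z_next)
        z_prev, z_curr = z_curr, z_next
    return out


# Hypothetical last two observed failure rates (in %); not from Table 4.1.
history_tail = (8.0, 10.0)
forecasts = forecast(history_tail, 5)

print("stationary:", is_stationary(PHI1, PHI2))
print("long-run mean:", round(process_mean(C, PHI1, PHI2), 4))
print("forecasts:", [round(z, 4) for z in forecasts])
```

Because the fitted coefficients satisfy the stationarity conditions, forecasts further ahead settle toward the long-run mean, which is why such models are mainly useful for short-term forecasting.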

Sums and products of ARMA models
In practice an ARMA model is more difficult to fit to a data set than either of its components, namely AR or MA, although in certain respects an ARMA model may be more efficient.
Furthermore, if one fits both an AR(p) (or MA(q)) and an ARMA($p_0$, $q_0$) to the same data, one may find that the latter fits better with fewer parameters, i.e. $p_0 + q_0 < p$ (or $q$). On the other hand, while the mixed ARMA model may involve the estimation of fewer parameters, the fitted model may be more difficult to comprehend and interpret, and it may be harder to explain the hidden characteristics of the data. The following three theorems are important in the context of reliability:
Theorem 4.2 (Engel [4], Singh [13]) Let ... be k independent and stationary time series such that ...
We examined the failure rates of two types of V805 vacuum tubes used in transmitters (see Davis [3]), which are tabulated below along with their sums and products. The above procedure can be extended to fit time series models to the sums and products of component reliabilities, and finally to obtain time series models for assessing the reliabilities of the systems considered by, say, Yadavalli et al. [17] and Hishitani et al. [7], by making proper use of the theorems discussed in Section 4.1.
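The behaviour of sums and products of independent series can be illustrated with second-order properties alone. The sketch below is offered as an illustration rather than as a restatement of the theorems cited above. It assumes two independent, zero-mean, stationary AR(1) series; under that assumption the autocovariance of the sum is the sum of the autocovariances, and the autocovariance of the product is the product of the autocovariances. All function names and parameter values are hypothetical.

```python
def ar1_autocov(phi, sigma2, lag):
    """Autocovariance at the given lag of a zero-mean stationary AR(1)
    with coefficient phi and innovation variance sigma2."""
    return sigma2 * phi ** lag / (1.0 - phi * phi)


def acf_of_sum(phi1, s1, phi2, s2, lag):
    """Autocorrelation of X + Y for independent AR(1) series X and Y."""
    def gamma(k):
        return ar1_autocov(phi1, s1, k) + ar1_autocov(phi2, s2, k)
    return gamma(lag) / gamma(0)


def acf_of_product(phi1, s1, phi2, s2, lag):
    """Autocorrelation of X * Y for independent zero-mean AR(1) series."""
    def gamma(k):
        return ar1_autocov(phi1, s1, k) * ar1_autocov(phi2, s2, k)
    return gamma(lag) / gamma(0)


# Hypothetical parameters for the two component series.
PHI_X, VAR_X = 0.5, 1.0
PHI_Y, VAR_Y = 0.8, 2.0

for k in range(5):
    rho_sum = acf_of_sum(PHI_X, VAR_X, PHI_Y, VAR_Y, k)
    rho_prod = acf_of_product(PHI_X, VAR_X, PHI_Y, VAR_Y, k)
    print(f"lag {k}: sum {rho_sum:.4f}, product {rho_prod:.4f}")
```

Under these assumptions the product has the geometrically decaying autocorrelation of an AR(1) series with coefficient equal to the product of the two coefficients, while the autocorrelation of the sum is a mixture of two geometric decays and is generally not that of a pure AR(1). Observed failure rates have non-zero means, so the series would need to be centred before this simplification applies.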

APPLICATION OF THE KALMAN FILTER
The observation and state equations of recursive estimation are given in Singh [14].
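As a generic illustration of recursive estimation, the following minimal sketch implements a scalar Kalman filter for a local-level model. The model is an assumption made for this example and is not taken from Singh [14]: the observation is the state plus noise with variance r, and the state follows a random walk with innovation variance q. The function name, the variances and the failure-rate series are hypothetical.

```python
def kalman_local_level(observations, q, r, x0, p0):
    """Scalar Kalman filter for the assumed local-level model
        y_t = x_t + v_t,      Var(v_t) = r   (observation equation)
        x_t = x_{t-1} + w_t,  Var(w_t) = q   (state equation)
    Returns a list of (filtered state, filtered variance) pairs.
    """
    x, p = x0, p0
    estimates = []
    for y in observations:
        # Prediction step: the random-walk state keeps its value and
        # its uncertainty grows by q.
        p_pred = p + q
        # Update step: blend prediction and observation via the gain.
        gain = p_pred / (p_pred + r)
        x = x + gain * (y - x)
        p = (1.0 - gain) * p_pred
        estimates.append((x, p))
    return estimates


# Hypothetical failure rates (in %); not taken from the paper's tables.
failure_rates = [9.1, 8.4, 10.2, 9.7, 8.9, 9.5, 10.8, 9.9]

filtered = kalman_local_level(failure_rates, q=0.01, r=1.0, x0=0.0, p0=10.0)
for t, (level, variance) in enumerate(filtered, start=1):
    print(f"t={t}: level {level:.3f}, variance {variance:.3f}")

# One-step-ahead forecast under the local-level model: the last filtered level.
print("forecast for next period:", round(filtered[-1][0], 3))
```

A large initial variance p0 lets the first few observations dominate the estimate, which is a common choice when little is known about the starting level.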

Similar to Examples 3.1 and 3.2, there exists a vast amount of material in the literature on reliability theory. The purpose of discussing these examples is to understand that (a) if the assumption of CFR is correct and confirmed by the estimation of parameters and the testing of hypotheses, such as the goodness-of-fit test, one can proceed to analyze the data at hand.

Sums and Products of Failure Rates

Figure 4.3 Time plot of product of failure rates

Figure 4.4 ACF and PACF of the sum of the failure rates

Figure 4.5 is a time plot of the sum of the reliabilities and the values predicted by the AR(3) model.

Figure 4.5 Time plot of the sum of the reliabilities

Table 4.1 Failure rates (F.R.) of the air-conditioning systems of a fleet of jet airplanes
An example of real-life failure rates subject to random fluctuations is given below. Example 4.1 Proschan [12] considered the pooled data on times of successive failures of the air-conditioning system of a fleet of jet airplanes. The failure rates (F.R.) (in %) are displayed in Table 4.1.

Table 4.2 Sums and Products of Two Failure Rates in %
The ACF and PACF of the product of the two failure rates are plotted in Figure 4.2.
Figure 4.2 ACF and PACF of the product of failure rates
From Figure 4.2, we notice that the ACF and PACF of the product of the failure rates $F_1$ ... We may use this model for forecasting purposes. The graph in Figure 4.3 displays the actual product of reliabilities and the corresponding predicted values.