A note on the statistical analysis of point judgment matrices

The Analytic Hierarchy Process is a multicriteria decision making technique developed by Saaty in the 1970s. The core of the approach is the pairwise comparison of objects according to a single criterion using a 9-point ratio scale and the estimation of weights associated with these objects based on the resultant judgment matrix. In the present paper some statistical approaches to extracting the weights of objects from a judgment matrix are reviewed and new ideas which are rooted in the traditional method of paired comparisons are introduced.


Introduction
The Analytic Hierarchy Process is a multicriteria decision making technique developed by Saaty in the 1970s which has received considerable attention in the mathematical and statistical literature [11,18]. The core of the approach is the pairwise comparison of objects according to a single criterion using a 9-point ratio scale and the estimation of weights associated with these objects based on the resultant judgment matrix. Saaty suggested that these estimates be taken to be proportional to the right eigenvector corresponding to the largest eigenvalue of the judgment matrix [17]. However, his idea has been the subject of much criticism, in particular because it has a deterministic rather than a statistical basis [10].
The aim of this paper is to review the key statistical approaches to extracting the weights of objects from a judgment matrix and, in so doing, to relate the embedded models to the traditional linear models used in the method of paired comparisons [5]. The remainder of the paper is organized as follows. §2 comprises a brief account of the method of paired comparisons and a formal specification of a judgment matrix. Statistical approaches to the analysis of judgment matrices are then introduced within the context of paired comparisons, with those which are distribution-based discussed in §3 and those based more directly on the method of paired comparisons in §4. An illustrative example is presented in §5 and conclusions and pointers for further research are given in §6.

Preliminaries
A brief introduction to paired comparisons and judgment matrices within the AHP context is presented in the following subsections.

The linear model for paired comparisons
In the traditional setting for paired comparisons, $d$ decision makers are invited to compare $n$ objects pairwise with respect to a single criterion and to state, quite simply, their preference for each pair with no ties permitted. The results can be assembled in a matrix $D = \{d_{ij}\}$ where $d_{ij}$ is the number of decision makers who prefer object $A_i$ to object $A_j$ for $i \neq j$, $i, j = 1, \ldots, n$, and thus $d_{ij} + d_{ji} = d$. A full account of the analysis of such data is given in the seminal book by David [5] and a brief summary is presented here.
The perceived underlying merit of object $A_i$ is taken to be a continuous random variable $Y_i$ with mean $\mu_i$ and the probability that $A_i$ is preferred to $A_j$ can be introduced as
$$\pi_{ij} = P(Y_i > Y_j), \quad i, j = 1, 2, \ldots, n, \; i \neq j.$$
Then, by writing $Z_i = Y_i - \mu_i$ and invoking subtle distributional arguments, it can be shown that
$$\pi_{ij} = H(\mu_i - \mu_j) \tag{1}$$
where $H(\cdot)$ is the distribution function of $Z_i - Z_j$. Note that the merits of the objects $A_i$, with $i = 1, 2, \ldots, n$, are chosen on a linear scale and, since the origin of the linear scale is arbitrary, it is usual to impose an additional constraint on the $\mu_i$ such as $\sum_{i=1}^{n} \mu_i = 0$ or $\mu_n = 0$. It also follows immediately from result (1) that $\pi_{ij} = \frac{1}{2}$ if and only if $\mu_i = \mu_j$, that $\pi_{ij} > \frac{1}{2}$ if and only if $\mu_i > \mu_j$ and that $\pi_{ij} < \frac{1}{2}$ if and only if $\mu_i < \mu_j$ for $i, j = 1, 2, \ldots, n$ and $i \neq j$.
The preference probabilities $\pi_{ij}$, for $i, j = 1, 2, \ldots, n$ and $i \neq j$, given in equation (1) are specified by the choice of distribution function $H$ and two such choices are of particular interest. Specifically, in the Thurstone-Mosteller model [16,22], the variables $Y_i$ are taken to be normally distributed and it follows straightforwardly that
$$\pi_{ij} = \Phi\!\left(\frac{\mu_i - \mu_j}{\sigma}\right) \tag{2}$$
where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution and $\sigma^2$ is the variance of the difference $Y_i - Y_j$. In the more widely used Bradley-Terry model [3], the variables $Y_i$ are taken to follow independent extreme value distributions and the differences $Y_i - Y_j$ therefore follow logistic distributions [6]. It can then be shown that
$$\pi_{ij} = \frac{\pi_i}{\pi_i + \pi_j} \tag{3}$$
where the parameters $\pi_i$ have an immediate and natural interpretation as weights or probabilities associated with the objects $A_i$, $i = 1, 2, \ldots, n$, respectively.
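As a minimal sketch of the two choices of $H$ discussed above, the following Python fragment evaluates the preference probability in its standard Thurstone-Mosteller form, $\Phi((\mu_i - \mu_j)/\sigma)$, and in its Bradley-Terry form, $\pi_i/(\pi_i + \pi_j)$ with $\pi_i = e^{\mu_i}$. The function names and the merit values are hypothetical and serve only as an illustration.

```python
import math

def thurstone_mosteller(mu_i, mu_j, sigma=1.0):
    """P(A_i preferred to A_j) when the merit difference is normal."""
    z = (mu_i - mu_j) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal cdf

def bradley_terry(pi_i, pi_j):
    """P(A_i preferred to A_j) for positive weights pi_i and pi_j."""
    return pi_i / (pi_i + pi_j)

# Hypothetical merits on the linear scale, constrained to sum to zero.
mu = [0.5, 0.0, -0.5]
pi = [math.exp(m) for m in mu]  # Bradley-Terry weights, pi_i = exp(mu_i)

tm_12 = thurstone_mosteller(mu[0], mu[1])
bt_12 = bradley_terry(pi[0], pi[1])
```

Both probabilities exceed 1/2 here because $\mu_1 > \mu_2$, in line with the ordering property that follows from result (1).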

Judgment matrices
Suppose now that $n$ objects are again to be compared according to a particular criterion but that a single decision maker expresses his or her relative preferences on a ratio scale. Saaty [18] suggested choosing relative preferences on an ordinal scale of integers from 1 to 9, together with their reciprocals. However, other scales can be used. For example, Becker and co-authors [2] use a 5-point Likert scale. More formally, let $a_{ij}$ denote the relative preference of object $A_i$ when compared with object $A_j$ for $i, j = 1, 2, \ldots, n$. The relative preferences can then be assembled in a positive reciprocal matrix, termed a point judgment matrix, of the form $A = \{a_{ij}\}$ where $a_{ij} > 0$, $a_{ji} = 1/a_{ij}$ and $a_{ii} = 1$ for $i, j = 1, 2, \ldots, n$.
The entries of $A$ can be regarded as the ratios of weights associated with the objects in the pairwise comparisons. Once the pairwise judgments have been elicited, the crucial question is how to determine the weights associated with the objects.
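The structure of a point judgment matrix can be illustrated with a short Python sketch that assembles a positive reciprocal matrix from the $n(n-1)/2$ elicited judgments above the diagonal. The helper names are hypothetical, and the consistency check uses the usual AHP condition $a_{ik} = a_{ij} a_{jk}$, which is assumed here rather than taken from the text.

```python
from fractions import Fraction

def build_judgment_matrix(n, upper):
    """Assemble A = {a_ij} from judgments a_ij given for i < j (0-based)."""
    A = [[Fraction(1)] * n for _ in range(n)]  # a_ii = 1
    for (i, j), a in upper.items():
        A[i][j] = Fraction(a)
        A[j][i] = 1 / Fraction(a)              # reciprocity, a_ji = 1 / a_ij
    return A

def is_consistent(A):
    """Check a_ik = a_ij * a_jk for all i, j, k (assumed definition)."""
    n = len(A)
    return all(A[i][k] == A[i][j] * A[j][k]
               for i in range(n) for j in range(n) for k in range(n))

# Hypothetical judgments on the integer scale.
A_consistent = build_judgment_matrix(3, {(0, 1): 2, (0, 2): 4, (1, 2): 2})
A_inconsistent = build_judgment_matrix(3, {(0, 1): 5, (0, 2): 6, (1, 2): 4})
```

Exact rational arithmetic is used so that the reciprocity and consistency checks are not affected by rounding.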

Distributional approaches
In this section three distributional approaches are now discussed.

Logarithmic least squares and the normal distribution
Crawford and Williams [4] introduced what is arguably the first statistical approach to the analysis of point judgment matrices, termed the logarithmic least squares method (LLSM), in 1985. More detailed work on the LLSM was developed later by Kabera [13], Kabera and Haines [14] and Laininen and Hämäläinen [15]. However, the LLSM can be derived using the same arguments as those invoked in deriving the Thurstone-Mosteller model for paired comparisons and this insight is now discussed.
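The relative preferences $a_{ij}$ can be expressed as
$$a_{ij} = \frac{w_i}{w_j}\, e^{*}_{ij}, \quad 1 \le i < j \le n, \tag{4}$$
where $w_i$ and $w_j$ are the unknown weights associated with the objects $A_i$ and $A_j$ respectively. Moreover, $w_i > 0$ for $i = 1, 2, \ldots, n$ and $\sum_{i=1}^{n} w_i = 1$, and $e^{*}_{ij}$ is a positive error term which captures the inconsistency in the judgments. By invoking a logarithmic transformation, model (4) can be re-expressed as the linear model with no intercept
$$y_{ij} = \beta_i - \beta_j + e_{ij}$$
where $y_{ij} = \ln a_{ij}$, $\beta_i = \ln w_i$, $\beta_j = \ln w_j$ and $e_{ij} = \ln e^{*}_{ij}$. Assuming that the differences $y_{ij}$ are observed values of random variables $Y_{ij} = Y_i - Y_j$ taken to be normally distributed as in the Thurstone-Mosteller model, that is $Y_{ij} \sim N(\beta_i - \beta_j, \sigma^2)$, simple arguments similar to those developed in Kabera and Haines [14] yield the following results. Pairwise differences of the parameters, $\beta_i - \beta_j$, are estimable and it is straightforward to show that the least squares or, equivalently, the maximum likelihood estimates (MLEs) are given by
$$\hat{\beta}_i - \hat{\beta}_j = \bar{y}_{i\cdot} - \bar{y}_{j\cdot}, \quad \text{where} \quad \bar{y}_{i\cdot} = \frac{1}{n} \sum_{k=1}^{n} y_{ik},$$
with $y_{ii} = 0$ and $y_{ki} = -y_{ik}$. Estimates of the weights associated with the objects can be formulated as
$$\hat{w}_i = \left[ \sum_{k=1}^{n} \exp\{-(\hat{\beta}_i - \hat{\beta}_k)\} \right]^{-1}, \quad i = 1, 2, \ldots, n.$$
Since these weights depend only on the unique estimates of $\beta_i - \beta_j$, they are unique. Approximate variances and covariances of the estimates $\hat{w}_i$, for $i = 1, \ldots, n$, can be obtained by observing that $\mathrm{Var}(\hat{\beta}_i - \hat{\beta}_j) = 2\sigma^2/n$ and by invoking the delta method to give
$$\mathrm{Var}(\hat{w}_i) \approx \frac{\sigma^2}{n}\, w_i^2 \left( 1 - 2 w_i + \sum_{k=1}^{n} w_k^2 \right) \tag{5}$$
and
$$\mathrm{Cov}(\hat{w}_i, \hat{w}_j) \approx \frac{\sigma^2}{n}\, w_i w_j \left( \sum_{k=1}^{n} w_k^2 - w_i - w_j \right), \quad i \neq j, \tag{6}$$
respectively. Estimates of these variances and covariances can immediately be obtained by the "plug-in" principle, that is by replacing the weights with their estimates and $\sigma^2$ with the residual variance $s^2$.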
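The LLSM calculations can be sketched in a few lines of Python. The sketch assumes a complete judgment matrix, for which the least squares estimates reduce to row means of $\ln a_{ij}$ (normalized geometric means of the rows). The plug-in standard errors use a delta-method form derived here from $\mathrm{Var}(\hat{\beta}_i - \hat{\beta}_j) = 2\sigma^2/n$, namely $(\sigma^2/n)\, w_i^2 (1 - 2w_i + \sum_k w_k^2)$, and the residual degrees of freedom $n(n-1)/2 - (n-1)$ are likewise an assumption of the sketch. All function names and the example matrix are hypothetical.

```python
import math

def llsm_weights(A):
    """Weights from row means of ln a_ij, normalized to sum to one."""
    n = len(A)
    beta = [sum(math.log(a) for a in row) / n for row in A]
    expb = [math.exp(b) for b in beta]
    total = sum(expb)
    return [e / total for e in expb]

def residual_variance(A, w):
    """s^2 from the n(n-1)/2 upper-triangle residuals (needs n >= 3)."""
    n = len(A)
    sse = sum((math.log(A[i][j]) - math.log(w[i] / w[j])) ** 2
              for i in range(n) for j in range(i + 1, n))
    df = n * (n - 1) // 2 - (n - 1)  # assumed degrees of freedom
    return sse / df

def delta_method_se(w, sigma2):
    """Plug-in standard errors from the assumed delta-method variance."""
    n = len(w)
    s = sum(x * x for x in w)
    return [math.sqrt(sigma2 / n * wi ** 2 * (1.0 - 2.0 * wi + s)) for wi in w]

# Hypothetical 3 x 3 judgment matrix (not taken from the paper).
A = [[1.0, 3.0, 5.0],
     [1.0 / 3.0, 1.0, 3.0],
     [1.0 / 5.0, 1.0 / 3.0, 1.0]]
w_hat = llsm_weights(A)
s2 = residual_variance(A, w_hat)
se = delta_method_se(w_hat, s2)
```

For a perfectly consistent matrix the residuals vanish, so that $s^2$ and the plug-in standard errors are zero.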

The logistic distribution
The approach based on the logistic distribution for analyzing judgment matrices in the AHP was introduced by Haines and Litvine [9] in the context of interval judgments, and was fully developed for point judgment matrices by Kabera [13] and Kabera and Haines [14]. The derivation of weights associated with objects compared pairwise and the corresponding covariance structure can be deduced directly from the Bradley-Terry model reviewed in §2 as follows.
Suppose that, in a pairwise comparison of $n$ objects, the merit of object $A_i$ is assumed to be represented by an outcome of a random variable $Y_i$ which follows an extreme value distribution with probability density function (pdf)
$$f(y_i) = e^{-(y_i - \theta_i)} \exp\left[-e^{-(y_i - \theta_i)}\right],$$
with $y_i$ taking a value on the real line and the unknown parameter $\theta_i > 0$ for $i = 1, 2, \ldots, n$.
Then, if $Y_i$ and $Y_j$ are independent, the random variable $Y_{ij} = Y_i - Y_j$, for $i, j = 1, 2, \ldots, n$ and $i \neq j$, follows a logistic distribution with location parameter $\theta_i - \theta_j$ and scale parameter 1 [6]. For the Bradley-Terry model, the probability that object $A_i$ is preferred to object $A_j$ for $1 \le i < j \le n$ is simply
$$\pi_{ij} = P(Y_{ij} > 0) = \frac{e^{\theta_i}}{e^{\theta_i} + e^{\theta_j}}$$
and the logistic distribution itself is subsumed in this development [3,5]. In contrast, suppose that the entry $a_{ij}$ in the judgment matrix $A$ is a realization of a random variable $A_{ij}$ which represents the strength of object $A_i$ relative to object $A_j$ and is related to the variable $Y_{ij}$ as $Y_{ij} = \ln(A_{ij})$. In other words, $A_{ij}$ is assumed to be log-logistic. Then the likelihood for the parameters $\theta = (\theta_1, \ldots, \theta_n)$ associated with the judgment matrix $A$ is given by
$$L(\theta) = \prod_{i<j} \frac{\exp\{-(y_{ij} - \theta_i + \theta_j)\}}{\left[1 + \exp\{-(y_{ij} - \theta_i + \theta_j)\}\right]^2}$$
where $y_{ij} = \ln(a_{ij})$ and the MLEs of $\theta$ can be readily obtained subject to a single constraint, for example $\theta_n = 0$, in order to ensure identifiability. Estimates of the weights follow as
$$\hat{w}_i = \frac{e^{\hat{\theta}_i}}{\sum_{k=1}^{n} e^{\hat{\theta}_k}}$$
where $\hat{\theta}_i$ is the MLE of $\theta_i$, for $i = 1, \ldots, n$. The asymptotic variances and covariances of the estimates of the weights $\hat{w}_i$ can be obtained using the delta method and are the same as those for the LLSM approach but with $\sigma^2$ replaced by 3 in expressions (5) and (6) respectively [14].
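The maximization of the logistic likelihood can be sketched with plain gradient ascent, as below. The sketch assumes that $y_{ij} = \ln a_{ij}$ follows a logistic distribution with location $\theta_i - \theta_j$ and unit scale, so that the derivative of the log-likelihood with respect to $\theta_i$ is $\sum_j \{2F(z_{ij}) - 1\}$ with $z_{ij} = y_{ij} - \theta_i + \theta_j$ and $F$ the standard logistic distribution function. The step size, the iteration count and the function names are hypothetical choices, and in practice a general-purpose optimizer would serve equally well.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_mle_weights(A, iters=2000):
    """Gradient ascent on the logistic log-likelihood with theta_n = 0."""
    n = len(A)
    theta = [0.0] * n
    step = 1.0 / n  # conservative step; the log-likelihood is concave
    for _ in range(iters):
        grad = [0.0] * n
        for i in range(n):
            for j in range(i + 1, n):
                z = math.log(A[i][j]) - (theta[i] - theta[j])
                g = 2.0 * logistic_cdf(z) - 1.0  # d loglik / d theta_i
                grad[i] += g
                grad[j] -= g
        theta = [t + step * g for t, g in zip(theta, grad)]
        theta = [t - theta[-1] for t in theta]  # identifiability: theta_n = 0
    expt = [math.exp(t) for t in theta]
    total = sum(expt)
    return [e / total for e in expt]

# Hypothetical consistent matrix with weights proportional to (4, 2, 1).
C = [[1.0, 2.0, 4.0], [0.5, 1.0, 2.0], [0.25, 0.5, 1.0]]
w_logistic = logistic_mle_weights(C)
```

For a consistent matrix every $z_{ij}$ is zero at the maximum, so the estimated weights coincide with the ratios implied by the matrix.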

Discretization
An approach which recognizes that judgments represented on a ratio scale are in fact discrete was introduced by Kabera [13] and Kabera and Haines [14]. The method is termed "discretization" and is a natural extension to the approaches based on the normal and logistic distributions described above. Specifically, the entries in a judgment matrix from the AHP are taken on an integer and reciprocal integer scale. Thus, strictly, these entries are realizations of a discrete random variable $A^{*}_{ij}$ and $Y^{*}_{ij} = \ln A^{*}_{ij}$ is necessarily discretely distributed for $1 \le i < j \le n$. In discussions so far, the random variable $Y_{ij} = \ln(A_{ij})$ has been assumed to be continuous. It is thus more correct to regard the event $Y^{*}_{ij} = \ln a_{ij}$ as equivalent to the event
$$\ln \sqrt{(a_{ij} - 1)\, a_{ij}} \le Y_{ij} < \ln \sqrt{a_{ij}\, (a_{ij} + 1)}.$$
Let $\alpha = \ln \sqrt{(a_{ij} - 1)\, a_{ij}}$ and $\beta = \ln \sqrt{a_{ij}\, (a_{ij} + 1)}$. Then it follows that
$$P(Y^{*}_{ij} = \ln a_{ij}) = \int_{\alpha}^{\beta} h(y_{ij})\, dy_{ij} \tag{7}$$
where $h(y_{ij})$ is the pdf of $Y_{ij}$ for $1 \le i < j \le n$. Estimates for the unknown parameters can then be obtained by maximizing the likelihood $\prod_{i<j} P(Y^{*}_{ij} = \ln a_{ij})$ or, equivalently, the log-likelihood
$$\sum_{i<j} \ln P(Y^{*}_{ij} = \ln a_{ij})$$
subject to appropriate constraints and the variance-covariance matrix of the resultant estimates can be approximated by the inverse of the observed Fisher information matrix. Note that integrals of the form (7) can be evaluated explicitly in the case of the logistic distribution since
$$\int_{\alpha}^{\beta} h(y_{ij})\, dy_{ij} = \frac{1}{1 + e^{-(\beta - \theta_i + \theta_j)}} - \frac{1}{1 + e^{-(\alpha - \theta_i + \theta_j)}}$$
but must be calculated numerically for the normal distribution.
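The discretized probability for the logistic case can be sketched as follows. The sketch assumes that the interval endpoints are the midpoints, on the logarithmic scale, between adjacent integer scale values, that is $\alpha = \ln\sqrt{(a_{ij}-1)a_{ij}}$ and $\beta = \ln\sqrt{a_{ij}(a_{ij}+1)}$, and it is restricted to integer judgments $a_{ij} \ge 2$; the treatment of $a_{ij} = 1$, of reciprocal values and of the ends of the scale is not covered. The function names are hypothetical.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def interval_endpoints(a):
    """Assumed log-scale midpoints around an integer judgment a >= 2."""
    alpha = 0.5 * math.log((a - 1) * a)
    beta = 0.5 * math.log(a * (a + 1))
    return alpha, beta

def discretized_prob_logistic(a, delta):
    """P(alpha <= Y < beta) for Y logistic with location delta, scale 1.

    Here delta stands for theta_i - theta_j.
    """
    alpha, beta = interval_endpoints(a)
    return logistic_cdf(beta - delta) - logistic_cdf(alpha - delta)

# Probability of observing a_ij = 4 when theta_i - theta_j = ln 4,
# and when the location is shifted well away from ln 4.
p_near = discretized_prob_logistic(4, math.log(4.0))
p_far = discretized_prob_logistic(4, math.log(4.0) + 3.0)
```

The log-likelihood for the discretized model is then the sum of the logarithms of such probabilities over the pairs $i < j$.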

Direct approaches
The question as to whether the traditional paired comparison models can be used more directly in modelling judgment matrices of the form A = {a ij } than the models discussed in §3 now arises and is addressed in the following subsections.

Genest-M'Lan approach
Genest and M'Lan [7] suggested that $a_{ij}$, the relative preference for object $A_i$ relative to object $A_j$ recorded on a ratio scale such as the Saaty scale, can be interpreted as reflecting the fact that the objects have been compared $n_{ij}$ times where
$$n_{ij} = \max\{a_{ij}, a_{ji}\} + 1 \tag{8}$$
and thus that $A_i$ is preferred to $A_j$ $x_{ij}$ times where $x_{ij}$ is determined by
$$a_{ij} = \frac{x_{ij}}{n_{ij} - x_{ij}}.$$
With this interpretation, the pairwise comparison of objects $A_i$ and $A_j$ can be regarded as following a binomial distribution with the number of trials given by $n_{ij}$, the trials being taken to be independent, the number of successes given by $x_{ij}$ and the probability of success $\pi_{ij}$ taken to follow the Bradley-Terry model for paired comparisons.
This approach is indeed an interesting one but the following concerns should be noted. First, the interpretation of the term $n_{ij}$ defined in expression (8) as the number of comparisons of objects $A_i$ and $A_j$ for $1 \le i < j \le n$ implies that pairs of objects are not examined the same number of times. Second, and as an example, suppose that $a_{12} = 4$. This can be interpreted as indicating that object $A_1$ is preferred to object $A_2$ in 4 out of 5, or 40 out of 50, or 400 out of 500 comparisons or any number of preferences and trials preserving the proportion 4 : 5 in favour of object $A_1$. These concerns indicate an arbitrariness in the selection of the number of pairwise comparisons of the objects. Genest and M'Lan [7] recognized these problems and suggested that the number of comparisons between all pairs of objects be taken to be equal and, specifically, to be the least common multiple (LCM) of $\max\{a_{ij}, a_{ji}\} + 1$. Thus for the Saaty scale the LCM of 2, 3, ..., 10 is 2520. However, the number of comparisons still remains essentially arbitrary. An example illustrating the above concerns can be found in Kabera [13]. In conclusion, therefore, the interpretation of the preferences in a point judgment matrix as emanating from a binomial model requires great caution.
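The arithmetic behind this interpretation can be sketched in Python. The sketch takes the number of comparisons to be $\max\{a_{ij}, a_{ji}\} + 1$, as implied by the LCM remark above, and chooses the number of preferences $x_{ij}$ so that $x_{ij}/(n_{ij} - x_{ij}) = a_{ij}$, which is an assumption consistent with the "4 out of 5" example. The function names are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def genest_mlan_counts(a):
    """Return (n_ij, x_ij) for a judgment a_ij on the Saaty scale."""
    a = Fraction(a)
    n_ij = max(a, 1 / a) + 1   # number of comparisons
    x_ij = n_ij * a / (1 + a)  # preferences for A_i, so that x/(n - x) = a
    return n_ij, x_ij

def lcm(values):
    """Least common multiple of a sequence of positive integers."""
    return reduce(lambda u, v: u * v // gcd(u, v), values)

n_12, x_12 = genest_mlan_counts(4)               # 4 out of 5
n_21, x_21 = genest_mlan_counts(Fraction(1, 4))  # 1 out of 5
common_trials = lcm(range(2, 11))                # common number of trials
```

Rescaling each pair to the common number of trials keeps the proportion $x_{ij}/n_{ij}$ fixed, which is why the choice of that number remains arbitrary.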

A distance approach
An alternative approach to that of Genest and M'Lan [7] based on the method of paired comparisons was developed in the thesis of Kabera [13] and is now introduced here. Specifically, the entry $a_{ij}$ in a judgment matrix $A$ is interpreted as the odds for the preference probability and thus as
$$a_{ij} = \frac{p_{ij}}{1 - p_{ij}} \tag{9}$$
where $p_{ij}$ is the observed probability that object $A_i$ is preferred to object $A_j$ for $1 \le i < j \le n$. Note that the relative preference $a_{ij}$ is taken to be on a ratio scale, for example (but not necessarily) the Saaty scale. Estimates of the parameters of models describing the preference probabilities $\pi_{ij}$ can be obtained as those values for which the observed probabilities $p_{ij}$ are as close as possible, in some sense, to the true values $\pi_{ij}$, $1 \le i < j \le n$. Two measures of "closeness," one based on least squares and the other on the Kullback-Leibler distance, are now considered and the ideas reinforced by invoking the Bradley-Terry model (3).

Least squares
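Consider the least squares approach of David [5] which involves minimizing the sum of squares
$$\sum_{i<j} \left[ H^{-1}(p_{ij}) - (\mu_i - \mu_j) \right]^2$$
where the observed probabilities $p_{ij}$ are given by equation (9) and $H(\cdot)$ is the distribution function in equation (1).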
For the Bradley-Terry model (3), $\mu_i = \ln(\pi_i)$ and
$$H^{-1}(p_{ij}) = \ln\left(\frac{p_{ij}}{1 - p_{ij}}\right) = \ln a_{ij}$$
so that estimates of the parameters are obtained by minimizing the expression
$$\sum_{i<j} \left( \ln a_{ij} - \ln \pi_i + \ln \pi_j \right)^2.$$
Note immediately that this approach gives the same results as the logarithmic least squares method discussed in §3.1 (see also [13] and [15]), although the underlying "philosophy" is very different. There is therefore no need to pursue this method further.
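The equivalence with the LLSM can be checked numerically. The sketch below assumes the odds interpretation $p_{ij} = a_{ij}/(1 + a_{ij})$, so that the logit of $p_{ij}$ equals $\ln a_{ij}$, and compares the least squares criterion on the logit scale with the LLSM criterion for an arbitrary set of trial weights. The function names, the matrix and the trial weights are hypothetical.

```python
import math

def odds_to_prob(a):
    """Observed preference probability, p = a / (1 + a)."""
    return a / (1.0 + a)

def logit(p):
    return math.log(p / (1.0 - p))

def logit_sum_of_squares(A, pi):
    """Least squares criterion on the logit scale for the logistic H."""
    n = len(A)
    return sum((logit(odds_to_prob(A[i][j])) - (math.log(pi[i]) - math.log(pi[j]))) ** 2
               for i in range(n) for j in range(i + 1, n))

def llsm_sum_of_squares(A, w):
    """LLSM criterion: squared residuals of ln a_ij about ln(w_i / w_j)."""
    n = len(A)
    return sum((math.log(A[i][j]) - math.log(w[i] / w[j])) ** 2
               for i in range(n) for j in range(i + 1, n))

A = [[1.0, 3.0, 5.0], [1 / 3, 1.0, 3.0], [0.2, 1 / 3, 1.0]]
trial = [0.6, 0.3, 0.1]
s_logit = logit_sum_of_squares(A, trial)
s_llsm = llsm_sum_of_squares(A, trial)
```

Since the two criteria agree for every choice of weights, they share the same minimizer.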

The Kullback-Leibler distance
The Kullback-Leibler distance between the probabilities $p_{ij}$ and $\pi_{ij}$ for $1 \le i < j \le n$, with respect to the distribution specified by the $p_{ij}$, is given by
$$D = \sum_{i<j} \left[ p_{ij} \ln\left(\frac{p_{ij}}{\pi_{ij}}\right) + (1 - p_{ij}) \ln\left(\frac{1 - p_{ij}}{1 - \pi_{ij}}\right) \right]. \tag{10}$$
Consider now minimizing $D$ and hence, since the probabilities $p_{ij}$ are observed, maximizing the expression
$$D^{*} = \sum_{i<j} \left[ p_{ij} \ln \pi_{ij} + (1 - p_{ij}) \ln(1 - \pi_{ij}) \right] \tag{11}$$
with respect to the parameters on which the true probabilities $\pi_{ij}$ depend. Maximizing expression (11) is equivalent to maximizing the log-likelihood for a binomial distribution with numbers of trials all equal and with probabilities of success $\pi_{ij}$. In particular, note that the model proposed by Genest and M'Lan [7] gives the same parameter estimates as those obtained by maximizing (11) provided that all pairs of objects are assumed to be compared the same number of times. The present approach therefore gives some support to their methodology.
Consider now the case where the probability $\pi_{ij}$ that object $A_i$ is preferred to object $A_j$ follows the Bradley-Terry model (3), that is $\pi_{ij} = \pi_i/(\pi_i + \pi_j)$ for $i, j = 1, 2, \ldots, n$ and $i \neq j$. Then equation (11) can be written as
$$D^{*} = \sum_{i<j} \left[ p_{ij} \ln \pi_i + (1 - p_{ij}) \ln \pi_j - \ln(\pi_i + \pi_j) \right]. \tag{12}$$
Estimates of the weights which minimize the Kullback-Leibler distance (10) can be readily obtained by developing an iterative procedure which preserves the constraints that $\pi_i > 0$ and $\sum_{i=1}^{n} \pi_i = 1$ and which is similar to that developed for the method of paired comparisons [5]. Specifically, using (12) and solving for $\partial D^{*}/\partial \pi_i = 0$ gives
$$\frac{1}{\hat{\pi}_i} \sum_{j \neq i} p_{ij} = \sum_{j \neq i} \frac{1}{\hat{\pi}_i + \hat{\pi}_j}, \quad i = 1, 2, \ldots, n,$$
where $p_{ji} = 1 - p_{ij}$ and $\hat{\pi}_i$ is an estimate of $\pi_i$. This system of equations then forms the basis for an iterative scheme for finding the estimates $\hat{\pi}_i$, $i = 1, 2, \ldots, n$. The calculations require the following steps.
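(1) Choose starting values $\pi_i^{(0)}$ for the $\pi_i$, $i = 1, 2, \ldots, n$.
(2) At iteration $k$, compute updated values from the system of equations as $\pi_i^{(k+1)} = \sum_{j \neq i} p_{ij} \big/ \sum_{j \neq i} \big(\pi_i^{(k)} + \pi_j^{(k)}\big)^{-1}$ and rescale them to sum to one.
(3) Continue the process until $\pi_i^{(k+1)}$ is deemed to be sufficiently close to $\pi_i^{(k)}$, say at iteration $m$, and then the final estimates are taken to be $\hat{\pi}_i = \pi_i^{(m)}$, $i = 1, 2, \ldots, n$.
Alternatively, the weights can be estimated by invoking a constrained optimization routine.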
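The iterative scheme can be sketched in Python as follows. The sketch assumes the odds interpretation $p_{ij} = a_{ij}/(1 + a_{ij})$, equal starting values $1/n$, a simultaneous update of all weights at each iteration followed by rescaling to sum to one, and a stopping rule based on the largest absolute change. These choices, and the function name, are assumptions made for illustration and are not necessarily those of the original procedure.

```python
def kl_bradley_terry_weights(A, tol=1e-12, max_iter=10000):
    """Fixed-point iteration for the weights minimizing the KL distance."""
    n = len(A)
    # p_ij = a_ij / (1 + a_ij); reciprocity gives p_ji = 1 - p_ij.
    p = [[A[i][j] / (1.0 + A[i][j]) for j in range(n)] for i in range(n)]
    pi = [1.0 / n] * n  # assumed starting values
    for _ in range(max_iter):
        new = []
        for i in range(n):
            num = sum(p[i][j] for j in range(n) if j != i)
            den = sum(1.0 / (pi[i] + pi[j]) for j in range(n) if j != i)
            new.append(num / den)
        total = sum(new)
        new = [v / total for v in new]  # preserve the sum-to-one constraint
        if max(abs(u - v) for u, v in zip(new, pi)) < tol:
            return new
        pi = new
    return pi

# Hypothetical judgment matrix (not taken from the paper).
A = [[1.0, 3.0, 5.0], [1 / 3, 1.0, 3.0], [0.2, 1 / 3, 1.0]]
pi_hat = kl_bradley_terry_weights(A)
```

At convergence the estimates satisfy the stationarity equations obtained from $\partial D^{*}/\partial \pi_i = 0$, which provides a simple check on the output.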
The Kullback-Leibler distance is not a likelihood function. Thus standard errors for the parameter estimates cannot be obtained from the asymptotic results of likelihood theory. However, the jackknife technique can be used to estimate the weights and to provide attendant standard errors. Specifically, consider estimating the weights by omitting the comparison of objects $A_i$ and $A_j$ as the vector $\hat{\pi}_{(i,j)}$ where the subscripts indicate the removed pair for $1 \le i < j \le n$. There are $\binom{n}{2}$ such estimates and these form the basis for finding an overall estimate $\hat{\pi}$ and the associated standard errors.
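A delete-one-pair jackknife along these lines can be sketched as below. The sketch refits the weights with each of the $n(n-1)/2$ comparisons omitted in turn, takes the overall estimate to be the average of the leave-one-out estimates, and uses the standard delete-one jackknife formula $\{(m-1)/m\} \sum_k (\hat{\pi}_{(k)} - \bar{\pi})^2$ for the variance, with $m$ the number of pairs. Both of these choices, together with the fitting routine, the starting values and the function names, are assumptions, since the details are not spelled out above.

```python
import math

def fit_weights(A, skip=None, iters=2000):
    """Fixed-point fit of the weights, optionally omitting one pair."""
    n = len(A)
    pi = [1.0 / n] * n  # assumed starting values
    for _ in range(iters):
        new = []
        for i in range(n):
            others = [j for j in range(n)
                      if j != i and (i, j) != skip and (j, i) != skip]
            num = sum(A[i][j] / (1.0 + A[i][j]) for j in others)  # sum of p_ij
            den = sum(1.0 / (pi[i] + pi[j]) for j in others)
            new.append(num / den)
        total = sum(new)
        pi = [v / total for v in new]
    return pi

def jackknife(A):
    """Leave-one-pair-out mean and standard errors (assumed formulas)."""
    n = len(A)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    m = len(pairs)
    fits = [fit_weights(A, skip=pair) for pair in pairs]
    mean = [sum(f[i] for f in fits) / m for i in range(n)]
    se = [math.sqrt((m - 1) / m * sum((f[i] - mean[i]) ** 2 for f in fits))
          for i in range(n)]
    return mean, se

# Hypothetical consistent 4 x 4 matrix, weights proportional to (8, 4, 2, 1).
w = [8.0, 4.0, 2.0, 1.0]
C = [[wi / wj for wj in w] for wi in w]
jack_mean, jack_se = jackknife(C)
```

For a consistent matrix every leave-one-out fit returns the same weights, so the jackknife standard errors are zero up to numerical error.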

An example
The aim of this example is to fix ideas by comparing judgment weights and associated standard errors obtained using the Kullback-Leibler approach with those obtained using methods discussed earlier in this paper and in [14]. Saaty [18, p. 38] considered the following example in which four identical chairs, labelled 1, 2, 3 and 4, were placed in a line at distances of 9, 15, 21 and 28 yards from a light source respectively. Chairs were compared according to brightness and the following judgment matrix assembled:
The weights and their associated standard errors using the LLSM and the approaches based on the logistic distribution, discretization using the logistic distribution and the Kullback-Leibler distance are summarized in Table 1. For the LLSM, the coefficient of determination obtained from the analysis of variance (ANOVA) table was $R^2 = 96.8\%$, indicating an excellent fit of the model to the data and more particularly indicating that the matrix $A$ is close to being consistent [8]. In addition, in terms of brightness, chair 1 was found to be significantly different from the others, chair 2 was significantly different from chair 4 but not chair 3, and chairs 3 and 4 were not significantly different from each other at the 5% level of significance. It is clear that the distribution-based methods give practically the same weights, while the Kullback-Leibler distance approach gives slightly different weights, but that the ranking of chairs with respect to brightness is the same for all four methods. It is also clear that the standard errors associated with the estimates of the weights for the LLSM and the Kullback-Leibler approaches are smaller than those obtained from the logistic and discretization approaches. This result is not surprising since it was demonstrated in §3 that the standard errors associated with the estimates of the weights from the logistic approach are larger, by a factor of $3/\sigma^2$, than those obtained from the LLSM. In the present case this multiplier is approximately $3/s^2 = 4.7$, where $s^2$ is the estimate of error obtained from the ANOVA table for the LLSM, in accord with the results in Table 1. Also, it is clear that if the judgment matrix is consistent, then the jackknife estimates will have zero error. Since the matrix $A$ is close to being consistent, this observation is reflected in the small standard errors for the Kullback-Leibler approach given in Table 1.

Conclusions
In this paper the connection between the traditional method of paired comparisons and a range of statistically based approaches to extracting weights for judgment matrices embedded in the Analytic Hierarchy Process is introduced and explored. The results from the example presented here, and indeed from other examples [13], suggest that the various statistical methods for deriving such weights produce estimates which are similar but attendant standard errors which are somewhat different. The question then arises as to which method is to be preferred. The approach based on the logistic distribution does not incorporate a scale parameter, while the standard errors for the estimates of the weights obtained from the Kullback-Leibler method tend to be understandably small. In contrast, the logarithmic least squares method includes a scale parameter in its model formulation and provides sound statistically based standard errors for the estimates of the weights. On balance, therefore, the logarithmic least squares method, or the discretized version thereof, is recommended for deriving weights from judgment matrices and, furthermore, is readily implemented in practice.
There is scope for further research into statistical approaches for analyzing judgment matrices. In particular, statistically based methods address rank reversal since standard errors are associated with estimates of the weights and thus the rankings are not stated with certainty. However, the weights are constrained to lie in a simplex and it would be useful to develop confidence regions for the weights rather than simply reporting the standard errors associated with their estimates. More generally, Stern [21] introduced a generalization of the traditional paired comparison model by introducing the notion that the random variables which describe the perceived values of objects follow a Gamma distribution with the same shape parameter but with different scale parameters. This approach is particularly interesting in that it relates to a Poisson process which can be construed as describing how a decision maker quantifies his or her preferences and could well be pursued within the context of extracting weights for judgment matrices.
Finally, it should be emphasized that approaches to the analysis of judgment matrices within the context of the AHP, not all statistically based, are the subject of on-going research. Some key developments and new methodologies are presented in the papers by Bajwa [1], Jones and Mardle [12], Srdjevic [19], Srdjevic and Srdjevic [20] and Wang, Parkan and Luo [23,24].

Table 1: Estimates of the weights and associated standard errors for chair brightness.