Page 169


The model

We start by examining the structure that it is reasonable to assume for the measurement of effectiveness. In other words, we examine the conditions that the factors determining effectiveness can be expected to satisfy. We limit the discussion here to two factors, namely precision and recall. This is no real restriction: different factors could be analysed and, as will be indicated later, more than two factors can simplify the analysis.

If R is the set of possible recall values and P is the set of possible precision values, then we are interested in the set R x P with a relation on it. We shall refer to this as a relational structure and denote it <R x P, >= >, where >= is the binary relation on R x P. (We shall use the same symbol for the ordinary 'greater than or equal to' on numbers; the context will make clear what the domain is.) All we are saying here is that for any given point (R, P) we wish to be able to say whether it indicates more, less or equal effectiveness than that indicated by some other point. The kind of order relation required is a weak order. To be more precise:

Definition 1. The relational structure <R x P, >= > is a weak order if and only if for e1, e2, e3 ∈ R x P the following axioms are satisfied:

(1) Connectedness: either e1 >= e2 or e2 >= e1

(2) Transitivity: if e1 >= e2 and e2 >= e3 then e1 >= e3

We insist that if two pairs can be ordered both ways, that is, (R1, P1) >= (R2, P2) and (R2, P2) >= (R1, P1), then (R1, P1) ~ (R2, P2), i.e. they are equivalent, though not necessarily equal. The transitivity condition is obviously desirable.
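The two axioms of Definition 1 can be checked mechanically on a finite set of points. The following is a minimal sketch only: the grid of recall and precision levels, the function names, and the two example relations (comparison by the sum R + P, and componentwise dominance) are assumptions made for illustration and are not taken from the text.

```python
from itertools import product


def is_weak_order(points, geq):
    """Check connectedness and transitivity of `geq` on a finite set of points."""
    # Axiom (1): for every pair, at least one direction of the relation holds.
    connected = all(geq(a, b) or geq(b, a) for a, b in product(points, repeat=2))
    # Axiom (2): whenever a >= b and b >= c, we must also have a >= c.
    transitive = all(
        geq(a, c)
        for a, b, c in product(points, repeat=3)
        if geq(a, b) and geq(b, c)
    )
    return connected and transitive


# Hypothetical finite grid of (recall, precision) points.
levels = [0.0, 0.25, 0.5, 0.75, 1.0]
grid = list(product(levels, levels))


def geq_sum(e1, e2):
    # Illustrative relation: compare points by R + P (an assumption, not the book's measure).
    return e1[0] + e1[1] >= e2[0] + e2[1]


def geq_dominance(e1, e2):
    # Componentwise dominance: some pairs are incomparable, so connectedness fails.
    return e1[0] >= e2[0] and e1[1] >= e2[1]


weak_sum = is_weak_order(grid, geq_sum)
weak_dom = is_weak_order(grid, geq_dominance)

# Two distinct points that can be ordered both ways are equivalent, not equal.
a, b = (0.25, 0.75), (0.75, 0.25)
equivalent = geq_sum(a, b) and geq_sum(b, a)
```

Under these assumptions the sum relation satisfies both axioms, whereas dominance does not, since neither (0, 1) >= (1, 0) nor (1, 0) >= (0, 1) holds for it.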

We now turn to a second condition which is commonly called independence. This notion captures the idea that the two components contribute their effects independently to the effectiveness.

Definition 2. A relation >= on R x P is independent if and only if, for R1, R2 ∈ R, (R1, P) >= (R2, P) for some P ∈ P implies (R1, P') >= (R2, P') for every P' ∈ P; and for P1, P2 ∈ P, (R, P1) >= (R, P2) for some R ∈ R implies (R', P1) >= (R', P2) for every R' ∈ R.

All we are saying here is that if, at a constant recall (precision), we find a difference in effectiveness for two values of precision (recall), then this difference cannot be removed or reversed by changing the constant value.
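Independence in the sense of Definition 2 can likewise be tested on a finite grid. The sketch below is illustrative only: the levels, the function names, and both example relations are assumptions. The first relation compares points by R + P; the second uses a deliberately contrived score, (R - 0.5)(P - 0.5), in which the effect of recall reverses as precision changes.

```python
from itertools import product


def is_independent(r_values, p_values, geq):
    """Check Definition 2 on finite sets of recall and precision values."""
    # If (R1, P) >= (R2, P) holds for some P, it must hold for every P.
    for r1, r2 in product(r_values, repeat=2):
        holds = [geq((r1, p), (r2, p)) for p in p_values]
        if any(holds) and not all(holds):
            return False
    # Symmetrically, if (R, P1) >= (R, P2) holds for some R, it must hold for every R.
    for p1, p2 in product(p_values, repeat=2):
        holds = [geq((r, p1), (r, p2)) for r in r_values]
        if any(holds) and not all(holds):
            return False
    return True


# Hypothetical recall and precision levels.
levels = [0.0, 0.5, 1.0]


def geq_additive(e1, e2):
    # Illustrative relation: compare by R + P (an assumption for this sketch).
    return e1[0] + e1[1] >= e2[0] + e2[1]


def geq_crossed(e1, e2):
    # Contrived score in which higher recall helps at high precision
    # but hurts at low precision, so the ordering can be reversed.
    def score(e):
        return (e[0] - 0.5) * (e[1] - 0.5)

    return score(e1) >= score(e2)


additive_ok = is_independent(levels, levels, geq_additive)
crossed_ok = is_independent(levels, levels, geq_crossed)
```

For the contrived relation, (1, 1) >= (0, 1) holds but (1, 0) >= (0, 0) does not, so a difference found at one constant precision is reversed at another, which is exactly what independence rules out.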

We now come to a condition which is not quite as obvious as the preceding ones. To make it more meaningful I shall need to use a diagram, Figure 7.12, which represents the ordering we have got so far with definitions 1 and 2. The lines l1 and l2 are lines of equal effectiveness, that is, any two points (R, P), (R', P') ∈ li are such that (R, P) ~ (R', P') (where ~ indicates equal effectiveness). Now let us
