Concepts and similar pages to Page 146

Concepts

Similarity

Concept

Relevance

Precision

Retrieval effectiveness

Information measure

Theory of measurement

Information retrieval definition

Cluster based retrieval

Operational information retrieval

Effectiveness

E measure

effectiveness can be calculated to infinite precision we may be insisting on a difference when in fact it only occurs in the tenth decimal place ... Finally, although I have just explained the use of the sign test in terms of single number measures, it is also used to detect a significant difference between precision-recall graphs ... Bibliographic remarks: Quite a number of references to the work on evaluation have already been given in the main body of the chapter ... Buried in the report by Keen and Digger [32] (Chapter 16) is an excellent discussion of the desirable properties of any measure of effectiveness ... A parameter which I have mentioned in passing but which deserves closer study is generality ... The trade-off between precision and recall has for a long time been the subject of debate ... Guazzo [39] describe an approach to the measurement of retrieval effectiveness based on information theory ... The notion of relevance has at all times attracted much discussion ...

In the past there has been much debate about the validity of evaluations based on relevance judgments provided by erring human beings ... Effectiveness and efficiency: Much of the research and development in information retrieval is aimed at improving the effectiveness and efficiency of retrieval ...

150

value can be calculated ... For a derivation of this relation from Bayes' Theorem, the reader should consult the author's recent paper on retrieval effectiveness [10] ... Averaging techniques: The method of pooling or averaging of the individual P-R curves seems to have depended largely on the retrieval strategy employed ... where A_s is the set of documents relevant to request s ... where B_λs is the set of documents retrieved at or above the co-ordination level λ ... Figure 7 ... An alternative approach to averaging is macro-evaluation, which can be independent of any parameter such as co-ordination level ...

145

automatic and interactive retrieval system? Studies to gauge this are going on but results are hard to interpret ... It should be apparent now that in evaluating an information retrieval system we are mainly concerned with providing data so that users can make a decision as to (1) whether they want such a system (social question) and (2) whether it will be worth it ... The second question (what to evaluate?) boils down to what can we measure that will reflect the ability of the system to satisfy the user ... (1) The coverage of the collection, that is, the extent to which the system includes relevant matter; (2) the time lag, that is, the average interval between the time the search request is made and the time an answer is given; (3) the form of presentation of the output; (4) the effort involved on the part of the user in obtaining answers to his search requests; (5) the recall of the system, that is, the proportion of relevant material actually retrieved in answer to a search request; (6) the precision of the system, that is, the proportion of retrieved material that is actually relevant ... It is claimed that (1)-(4) are readily assessed ...
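
The recall and precision definitions quoted in the snippet above can be illustrated with a small sketch. The function name `precision_recall`, the document identifiers, and the convention of returning 0.0 for an empty denominator are assumptions made for illustration; none of them come from the source.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single search request.

    precision = |retrieved ∩ relevant| / |retrieved|
    recall    = |retrieved ∩ relevant| / |relevant|
    An empty denominator yields 0.0 (a convention assumed here).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents actually retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical request: 4 documents retrieved, 5 relevant in the collection.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d5", "d6", "d7"]
precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(precision, recall)  # 0.5 0.4
```

Here two of the four retrieved documents are relevant (precision 0.5), and two of the five relevant documents were retrieved (recall 0.4).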

148

relevant to an information need if and only if it contains at least one sentence which is relevant to that need ... Earlier on I stated that this notion of relevance was only of limited use at the moment ... Saracevic [8] has summarised some of the more recent work on probabilistic interpretations of relevance ... Precision and recall, and others: We now leave the speculations about relevance and return to the promised detailed discussion of the measurement of effectiveness ... It is helpful at this point to introduce the famous contingency table which is not really a contingency table at all ...

166

differ considerably from those which the user feels are pertinent (Senko [21]) ... Fourthly, whereas Cooper has gone to some trouble to take account of the random element introduced by ties in the matching function, it is largely ignored in the derivation of P_norm and R_norm ... One further comment of interest is that Robertson [15] has shown that normalised recall has an interpretation as the area under the Recall-Fallout curve used by Swets ... Finally mention should be made of two similar but simpler measures used by the SMART system ... and do not take into account the collection size N; n is here the number of relevant documents for the particular test query ... A normalised symmetric difference: Let us now return to basics and consider how it is that users could simply measure retrieval effectiveness ...

163

normalising the ESL by a factor proportional to the expected number of non-relevant documents collected for each relevant one ... which has been called the expected search length reduction factor by Cooper ... where (1) R is the total number of documents in the collection relevant to q; (2) I is the total number of documents in the collection non-relevant to q; (3) S is the total desired number of documents relevant to q ... The explicit form for ESL was given before ... which is known as the mean expected search length reduction factor ... Within the framework as stated at the head of this section this final measure meets the bill admirably ... For a further defence of its subjective nature see Cooper [1] ...

114

the system to its user will be the best that is obtainable on the basis of those data ... Of course this principle raises many questions as to the acceptability of the assumptions ... The probability ranking principle assumes that we can calculate P(relevance | document); not only that, it assumes that we can do it accurately ... So returning now to the immediate problem which is to calculate, or estimate, P(relevance | document) ...

108

If the summations instead of being over A and Ā are now made over A ∩ B_i and Ā ∩ B_i, where B_i is the set of retrieved documents on the i-th iteration, then we have a query formulation which is optimal for B_i, a subset of the document collection ... where w1 and w2 are weighting coefficients ... Experiments have shown that relevance feedback can be very effective ... Finally, a few comments about the technique of relevance feedback in general ... Bibliographic remarks: The book by Lancaster and Fayen [16] ... has written an interesting survey article about on-line searching ...
