Page 158 Concepts and similar pages

Concepts

Similarity Concept
Logistic transformation
Retrieval effectiveness
Data retrieval systems
Effectiveness
Relevance
Data model
Document clustering
Probabilistic retrieval
Probability of relevance
Document representative

Similar pages

Similarity Page Snapshot
150 value can be calculated ... For a derivation of this relation from Bayes' Theorem, the reader should consult the author's recent paper on retrieval effectiveness [10] ... Averaging techniques: The method of pooling or averaging of the individual P-R curves seems to have depended largely on the retrieval strategy employed ... where A_s is the set of documents relevant to request s ... where B_λs is the set of documents retrieved at or above the co-ordination level λ ... Figure 7 ... An alternative approach to averaging is macro-evaluation, which can be independent of any parameter such as co-ordination level ...
156 He accepts the validity of measuring the effectiveness of retrieval by a curve, either precision-recall or recall-fallout, generated by the variation of some control variable λ e ... In the simplest case we assume that the variable λ is distributed normally on the set of relevant and non-relevant documents ... The usual set-up in IR is now to define a decision rule in terms of λ, to determine which documents are retrieved (the acceptance criterion) ...
10 In the past there has been much debate about the validity of evaluations based on relevance judgments provided by erring human beings ... Effectiveness and efficiency: Much of the research and development in information retrieval is aimed at improving the effectiveness and efficiency of retrieval ...
166 differ considerably from those which the user feels are pertinent (Senko [21]) ... Fourthly, whereas Cooper has gone to some trouble to take account of the random element introduced by ties in the matching function, it is largely ignored in the derivation of P_norm and R_norm ... One further comment of interest is that Robertson [15] has shown that normalised recall has an interpretation as the area under the Recall-Fallout curve used by Swets ... Finally, mention should be made of two similar but simpler measures used by the SMART system ... and do not take into account the collection size N; n is here the number of relevant documents for the particular test query ... A normalised symmetric difference: Let us now return to basics and consider how it is that users could simply measure retrieval effectiveness ...
155 and links IR measurements to a ready-made and well-developed statistical theory, it has not found general acceptance amongst workers in the field ... Before proceeding to an explanation of the Swets model, it is as well to quote in full the conditions that the desired measure of effectiveness is designed to meet ... A desirable measure of retrieval performance would have the following properties: First, it would express solely the ability of a retrieval system to distinguish between wanted and unwanted items; that is, it would be a measure of effectiveness only, leaving for separate consideration factors related to cost or efficiency ... He then goes on to claim that "The measure I proposed [in 1963], one drawn from statistical decision theory, has the potential [my italics] to satisfy all four desiderata" ... To arrive at the measure, we must first discuss the underlying model ... Recall: an estimate of the conditional probability that an item will be retrieved given that it is relevant [we denote this P(B|A)] ... Precision: an estimate of the conditional probability that an item will be relevant given that it is retrieved [i ... Fallout: an estimate of the conditional probability that an item will be retrieved given that it is non-relevant [i ...
114 the system to its user will be the best that is obtainable on the basis of those data ... Of course this principle raises many questions as to the acceptability of the assumptions ... The probability ranking principle assumes that we can calculate P(relevance|document); not only that, it assumes that we can do it accurately ... So returning now to the immediate problem, which is to calculate, or estimate, P(relevance|document) ...
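
The page 155 snapshot describes recall, precision and fallout as estimates of conditional probabilities. A minimal sketch of those three estimates for a single query follows, assuming documents are identified by hashable IDs and that the collection size is known. The function name `retrieval_measures` and its parameters are hypothetical and do not come from the source text.

```python
def retrieval_measures(relevant, retrieved, collection_size):
    """Estimate recall, precision and fallout for one query.

    relevant        -- iterable of IDs judged relevant (set A)
    retrieved       -- iterable of IDs returned by the system (set B)
    collection_size -- total number of documents in the collection
    A measure whose denominator is zero is returned as None.
    """
    relevant = set(relevant)
    retrieved = set(retrieved)

    # Documents that are both relevant and retrieved.
    hits = len(relevant & retrieved)
    # Documents retrieved although they are non-relevant.
    false_drops = len(retrieved - relevant)
    # Size of the non-relevant part of the collection.
    non_relevant = collection_size - len(relevant)

    # Recall: estimate of P(retrieved | relevant).
    recall = hits / len(relevant) if relevant else None
    # Precision: estimate of P(relevant | retrieved).
    precision = hits / len(retrieved) if retrieved else None
    # Fallout: estimate of P(retrieved | non-relevant).
    fallout = false_drops / non_relevant if non_relevant > 0 else None

    return {"recall": recall, "precision": precision, "fallout": fallout}


# Illustrative data only: 4 relevant documents, 5 retrieved, 20 in total.
example = retrieval_measures({1, 2, 3, 4}, {3, 4, 5, 6, 7}, 20)
```

With the illustrative sets above, two of the four relevant documents are retrieved, so recall is 2/4, precision is 2/5, and fallout is 3/16.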