Concepts and similar pages to Page 178

Concepts

Similarity

Concept

Statistical significance

Significance tests

Non-parametric tests

Wilcoxon matched-pairs test

Retrieval effectiveness

Generality

Data retrieval systems

Effectiveness

E measure

Relevance

that Di is continuous and that it is derived from a symmetric distribution, neither of which is normally met in IR data ... It seems therefore that some of the more sophisticated statistical tests are inappropriate ... The way it works is as follows: Let Za(Q1), Za(Q2), ... P(Za > Zb) = P(Za < Zb) = 1/2. Under this hypothesis we expect the number of pairs which have Za > Zb to equal the number of pairs which have Za < Zb ... In IR this test is usually used as a one-tailed test, that is, the alternative hypothesis prescribes the superiority of retrieval under condition a over condition b, or vice versa ... The use of the sign test raises a number of interesting points ...
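The sign test described in this extract can be sketched in a few lines of Python. This is a minimal illustration, not the book's own procedure: the function name `sign_test_one_tailed`, the decision to drop tied pairs, and the per-query scores are all assumptions made for the example.

```python
from math import comb


def sign_test_one_tailed(za, zb):
    """One-tailed sign test for H1: condition a is superior to condition b.

    za, zb: paired effectiveness scores, one pair per query.
    Returns (n_plus, n_minus, p_value). Tied pairs are dropped
    (an assumption; the extract does not say how ties are handled).
    """
    n_plus = sum(1 for a, b in zip(za, zb) if a > b)
    n_minus = sum(1 for a, b in zip(za, zb) if a < b)
    n = n_plus + n_minus
    if n == 0:
        return n_plus, n_minus, 1.0
    # Under H0, P(Za > Zb) = P(Za < Zb) = 1/2, so the number of
    # "+" pairs follows Binomial(n, 1/2). The one-tailed p-value is
    # the probability of seeing at least n_plus "+" pairs.
    p_value = sum(comb(n, k) for k in range(n_plus, n + 1)) / 2 ** n
    return n_plus, n_minus, p_value


# Hypothetical scores for eight queries under two retrieval conditions.
za = [0.62, 0.48, 0.71, 0.55, 0.40, 0.66, 0.59, 0.50]
zb = [0.58, 0.45, 0.65, 0.57, 0.40, 0.60, 0.51, 0.47]

n_plus, n_minus, p = sign_test_one_tailed(za, zb)
print(n_plus, n_minus, p)  # 6 1 0.0625
```

With these made-up scores, six pairs favour condition a, one favours condition b and one is tied, so the test uses seven pairs. The test only looks at the direction of each difference, which is why it needs neither the continuity nor the symmetry assumption mentioned above.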

Page 145

automatic and interactive retrieval system? Studies to gauge this are going on but results are hard to interpret ... It should be apparent now that in evaluating an information retrieval system we are mainly concerned with providing data so that users can make a decision as to (1) whether they want such a system (social question) and (2) whether it will be worth it ... The second question (what to evaluate?) boils down to what can we measure that will reflect the ability of the system to satisfy the user ... (1) The coverage of the collection, that is, the extent to which the system includes relevant matter; (2) the time lag, that is, the average interval between the time the search request is made and the time an answer is given; (3) the form of presentation of the output; (4) the effort involved on the part of the user in obtaining answers to his search requests; (5) the recall of the system, that is, the proportion of relevant material actually retrieved in answer to a search request; (6) the precision of the system, that is, the proportion of retrieved material that is actually relevant ... It is claimed that (1)-(4) are readily assessed ...
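Recall and precision, as defined in items (5) and (6) of this extract, can be computed directly from the set of retrieved documents and the set of relevant documents. The sketch below is illustrative only: the function name `recall_precision` and the document identifiers are hypothetical, and returning 0.0 for an empty set is a convention chosen for the example.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for a single search request.

    recall    = proportion of relevant material actually retrieved
    precision = proportion of retrieved material that is actually relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    # Returning 0.0 when a set is empty is an assumption of this sketch.
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical document identifiers for one search request.
retrieved = ["d1", "d2", "d3", "d4", "d5"]
relevant = ["d2", "d4", "d7", "d9"]

print(recall_precision(retrieved, relevant))  # (0.5, 0.4)
```

Here two of the four relevant documents are retrieved (recall 0.5) and two of the five retrieved documents are relevant (precision 0.4).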

Page 163

normalising the ESL by a factor proportional to the expected number of non-relevant documents collected for each relevant one ... which has been called the expected search length reduction factor by Cooper ... where (1) R is the total number of documents in the collection relevant to q; (2) I is the total number of documents in the collection non-relevant to q; (3) S is the total desired number of documents relevant to q ... The explicit form for ESL was given before ... which is known as the mean expected search length reduction factor ... Within the framework as stated at the head of this section this final measure meets the bill admirably ... For a further defence of its subjective nature see Cooper [1] ...
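The formulas themselves are elided in this extract, so the following sketch rests on an assumption: that the random baseline is the expected number of non-relevant documents examined before S relevant ones are found in a random ordering, S * I / (R + 1), and that the reduction factor is the relative improvement of ESL over that baseline. The function names and the numbers are hypothetical.

```python
def expected_random_search_length(R, I, S):
    """Expected number of non-relevant documents examined before finding
    S relevant ones when R relevant and I non-relevant documents are
    randomly ordered (assumed form of the baseline: S * I / (R + 1))."""
    return S * I / (R + 1)


def esl_reduction_factor(esl, R, I, S):
    """Relative reduction of the expected search length over random search.

    Assumed form: (ERSL - ESL) / ERSL. A value of 1 means no non-relevant
    documents had to be examined; 0 means no better than random ordering.
    """
    ersl = expected_random_search_length(R, I, S)
    if ersl == 0:
        raise ValueError("baseline is zero: no non-relevant documents to skip")
    return (ersl - esl) / ersl


# Hypothetical query: 4 relevant and 95 non-relevant documents,
# the user wants 2 relevant documents, and the system's ESL is 9.5.
print(expected_random_search_length(R=4, I=95, S=2))   # 38.0
print(esl_reduction_factor(esl=9.5, R=4, I=95, S=2))   # 0.75
```

Under these assumptions, a system that makes the user examine 9.5 non-relevant documents on average, where random ordering would cost 38, achieves a reduction factor of 0.75.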

In the past there has been much debate about the validity of evaluations based on relevance judgments provided by erring human beings ... Effectiveness and efficiency: Much of the research and development in information retrieval is aimed at improving the effectiveness and efficiency of retrieval ...
