Concepts and similar pages to Page 176

Concepts

Similarity

Concept

Presentation of experimental results

Thomsen condition

Retrieval effectiveness

Information structure

Experimental information retrieval

Information measure

Information retrieval definition

Operational information retrieval

Effectiveness

E measure

normalising the ESL by a factor proportional to the expected number of non-relevant documents collected for each relevant one ... which has been called the expected search length reduction factor by Cooper ... where (1) R is the total number of documents in the collection relevant to q; (2) I is the total number of documents in the collection non-relevant to q; (3) S is the total desired number of documents relevant to q ... The explicit form for ESL was given before ... which is known as the mean expected search length reduction factor ... Within the framework as stated at the head of this section this final measure meets the bill admirably ... For a further defence of its subjective nature see Cooper [1] ...

180

effectiveness can be calculated to infinite precision we may be insisting on a difference when in fact it only occurs in the tenth decimal place ... Finally, although I have just explained the use of the sign test in terms of single-number measures, it is also used to detect a significant difference between precision-recall graphs ... Bibliographic remarks. Quite a number of references to the work on evaluation have already been given in the main body of the chapter ... Buried in the report by Keen and Digger [32], Chapter 16, is an excellent discussion of the desirable properties of any measure of effectiveness ... A parameter which I have mentioned in passing but which deserves closer study in generality ... The trade-off between precision and recall has for a long time been the subject of debate ... Guazzo [39] describe an approach to the measurement of retrieval effectiveness based on information theory ... The notion of relevance has at all times attracted much discussion ...

In the past there has been much debate about the validity of evaluations based on relevance judgments provided by erring human beings ... Effectiveness and efficiency. Much of the research and development in information retrieval is aimed at improving the effectiveness and efficiency of retrieval ...

188

In basing a theory of evaluation on the theory of measurement, is it possible to devise a measure of effectiveness not starting with precision and recall but simply with the set of relevant documents and the set of retrieved documents? If so, can we generalise such a measure to take account of degree of relevance? An alternative derivation of an E-type measure could be done in terms of recall and fallout ... Up to now the measurement of effectiveness has proved fairly intractable to statistical analysis ... I think the Robertson model described in Chapter 7 goes some way to being considered as a reasonable statistical model ... There may be laws of retrieval, such as the well-known trade-off between precision and recall, that are worth establishing either empirically or by theoretical argument ... 6 ... There is a need for more intensive research into the problems of what to use to represent the content of documents in a computer ... Information retrieval systems, both operational and experimental, have been keyword based ... The major reason for this rather simple-minded approach to document retrieval is a very good one ...

172

structures are decomposable ... A further simplification of the measurement function may be achieved by requiring a special kind of non-interaction of the components which has become known as additive independence ... $(R, P) \geq (R', P') \Leftrightarrow \Phi_1(R) + \Phi_2(P) \geq \Phi_1(R') + \Phi_2(P')$ where F is simply the addition function ... $(R, P) \geq (R', P') \Leftrightarrow \Phi_1(R) + \Phi_2(P) + \Phi_1(R)\Phi_2(P) \geq \Phi_1(R') + \Phi_2(P') + \Phi_1(R')\Phi_2(P')$ It can be shown that, starting at the other end, given an additively independent representation, the properties defined in 1 and 3, and the Archimedean property are necessary ... Here the term $\Phi_1 \Phi_2$ is referred to as the interaction term; its absence accounts for the non-interaction in the previous condition ... We are now in a position to state the main representation theorem ... Theorem. Suppose $\langle R \times P, \geq \rangle$ is an additive conjoint structure, then there exist functions $\Phi_1$ from R, and $\Phi_2$ from P into the real numbers such that, for all $R, R' \in R$ and $P, P' \in P$: $(R, P) \geq (R', P') \Leftrightarrow \Phi_1(R) + \Phi_2(P) \geq \Phi_1(R') + \Phi_2(P')$. If $\Phi_i'$ are two other functions with the same property, then there exist constants $\Theta > 0$, $\gamma_1$, and $\gamma_2$ such that $\Phi_1' = \Theta \Phi_1 + \gamma_1$ and $\Phi_2' = \Theta \Phi_2 + \gamma_2$. The proof of this theorem may be found in Krantz et al ... Let us stop and take stock of this situation ...

179

that $D_i$ is continuous and that it is derived from a symmetric distribution, neither of which is normally met in IR data ... It seems therefore that some of the more sophisticated statistical tests are inappropriate ... The way it works is as follows: Let $Z_a(Q_1), Z_a(Q_2)$, ... $P(Z_a > Z_b) = P(Z_a < Z_b) = 1/2$. Under this hypothesis we expect the number of pairs which have $Z_a > Z_b$ to equal the number of pairs which have $Z_a < Z_b$ ... In IR this test is usually used as a one-tailed test, that is, the alternative hypothesis prescribes the superiority of retrieval under condition a over condition b, or vice versa ... The use of the sign test raises a number of interesting points ...
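
The sign test excerpted above can be illustrated with a minimal sketch. It assumes paired effectiveness scores for the same queries under conditions a and b, drops tied pairs (a common convention, not something stated in the excerpt), and computes the one-tailed probability of seeing at least the observed number of pairs favouring a when each direction has probability 1/2. The function name `sign_test_one_tailed` and the sample scores are hypothetical.

```python
from math import comb


def sign_test_one_tailed(za, zb):
    """One-tailed sign test p-value for 'condition a is better than b'.

    za, zb: per-query effectiveness scores, paired by query.
    Tied pairs are discarded (an assumption of this sketch).
    """
    if len(za) != len(zb):
        raise ValueError("za and zb must contain one score per query")

    wins = sum(1 for a, b in zip(za, zb) if a > b)    # pairs with Za > Zb
    losses = sum(1 for a, b in zip(za, zb) if a < b)  # pairs with Za < Zb
    n = wins + losses
    if n == 0:
        return 1.0  # every pair tied: no evidence either way

    # Under the null hypothesis, wins ~ Binomial(n, 1/2).
    # p-value = P(X >= wins)
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n


if __name__ == "__main__":
    scores_a = [0.5, 0.6, 0.7, 0.8]  # hypothetical per-query scores
    scores_b = [0.4, 0.5, 0.6, 0.7]
    print(sign_test_one_tailed(scores_a, scores_b))
```

Because only the direction of each difference is used, the test makes no assumption about the form of the underlying distribution, which is the property the excerpt relies on.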

178

document collections with different sets of queries then we can still use these measures to indicate which system satisfies the user more ... Significance tests. Once we have our retrieval effectiveness figures we may wish to establish that the difference in effectiveness under two conditions is statistically significant ... Parametric tests are inappropriate because we do not know the form of the underlying distribution ... On the face of it non-parametric tests might provide the answer ...

162

search with the relevant documents spaced evenly throughout that level ... (a) q is the query of given type; (b) j is the total number of documents non-relevant to q in all levels preceding the final; (c) r is the number of relevant documents in the final level; (d) i is the number of non-relevant documents in the final level; (e) s is the number of relevant documents required from the final level to satisfy the need according to its type ... Now, to distribute the r relevant documents evenly among the non-relevant documents, we partition the non-relevant documents into $r + 1$ subsets each containing $i / (r + 1)$ documents ... As a measure of effectiveness ESL is sufficient if the document collection and test queries are fixed ... where Q is the set of queries ... To extend the applicability of the measure to deal with varying test queries and document collections, we need to normalise the ESL in some way to counter the bias introduced because: (1) queries are satisfied by different numbers of documents according to the type of the query and therefore can be expected to have widely differing search lengths; (2) the density of relevant documents for a query in one document collection may be significantly different from the density in another ... The first item suggests that the ESL per desired relevant document is really what is wanted as an index of merit ...
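
The partition argument excerpted above can be turned into a small worked sketch. The excerpt elides the explicit formula, so the form used here, ESL(q) = j + s * i / (r + 1), is an assumption derived from that argument: each of the s relevant documents needed from the final level is expected to cost one subset of i / (r + 1) non-relevant documents, on top of the j non-relevant documents in earlier levels. The function names and sample values are hypothetical, and the mean over queries is taken as a plain average over the query set Q.

```python
from fractions import Fraction


def expected_search_length(j, i, r, s):
    """Expected number of non-relevant documents examined for one query.

    j: non-relevant documents in all levels preceding the final level
    i: non-relevant documents in the final level
    r: relevant documents in the final level
    s: relevant documents still required from the final level
    Assumed form (see lead-in): j + s * i / (r + 1).
    """
    if not 1 <= s <= r:
        raise ValueError("s must satisfy 1 <= s <= r")
    # i / (r + 1) non-relevant documents per subset, s subsets traversed
    return j + Fraction(s * i, r + 1)


def mean_expected_search_length(per_query_params):
    """Average ESL over a set of queries Q, given (j, i, r, s) per query."""
    values = [expected_search_length(*params) for params in per_query_params]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical query: 3 non-relevant documents before the final level,
    # final level holds 4 non-relevant and 1 relevant, 1 relevant needed.
    print(expected_search_length(3, 4, 1, 1))
    print(mean_expected_search_length([(3, 4, 1, 1), (0, 6, 2, 1)]))
```

Exact fractions are used so that the even-spacing arithmetic is not blurred by floating-point rounding.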

Five SEARCH STRATEGIES. Introduction. So far very little has been said about the actual process by which the required information is located ... All search strategies are based on comparison between the query and the stored documents ... The distinctions made between different kinds of search strategies can sometimes be understood by looking at the query language, that is, the language in which the information need is expressed ... Boolean search. A Boolean search strategy retrieves those documents which are true
