Concepts and similar pages to Page 160

Concepts

Similarity

Concept

Simple ordering

Expected search length

Weak ordering

Information retrieval system

Data retrieval systems

Generality

Information measure

Retrieval effectiveness

Information retrieval definition

Operational information retrieval

Unfortunately the ranking generated by a matching function is rarely a simple ordering, but more commonly a weak ordering ... For example, consider the weak ordering in Figure 7 ... depending on how many non-relevant documents precede the sixth relevant document ... 4 10 ... The above procedure leads immediately to a convenient intuitive derivation of a formula for the expected search length ...

162

search with the relevant documents spaced evenly throughout that level ... (a) q is the query of given type; (b) j is the total number of documents non-relevant to q in all levels preceding the final; (c) r is the number of relevant documents in the final level; (d) i is the number of non-relevant documents in the final level; (e) s is the number of relevant documents required from the final level to satisfy the need according to its type ... Now, to distribute the r relevant documents evenly among the non-relevant documents, we partition the non-relevant documents into r + 1 subsets each containing i / (r + 1) documents ... As a measure of effectiveness ESL is sufficient if the document collection and test queries are fixed ... where Q is the set of queries ... To extend the applicability of the measure to deal with varying test queries and document collections, we need to normalise the ESL in some way to counter the bias introduced because: (1) queries are satisfied by different numbers of documents according to the type of the query and therefore can be expected to have widely differing search lengths; (2) the density of relevant documents for a query in one document collection may be significantly different from the density in another ... The first item suggests that the ESL per desired relevant document is really what is wanted as an index of merit ...
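The partition argument in the excerpt above can be sketched in code. The excerpt elides the formula itself, so the closed form used here, ESL(q) = j + i * s / (r + 1), is an assumption derived from the stated partition into r + 1 subsets of i / (r + 1) non-relevant documents each. The function names are hypothetical. The second function checks the final-level term by brute force, averaging over every equally likely placement of the relevant documents within the final level.

```python
from fractions import Fraction
from itertools import combinations


def expected_search_length(j, i, r, s):
    """Assumed closed form: ESL = j + i * s / (r + 1).

    j: non-relevant documents in all levels preceding the final level
    i: non-relevant documents in the final level
    r: relevant documents in the final level
    s: relevant documents still required from the final level
    """
    if not 1 <= s <= r:
        raise ValueError("s must satisfy 1 <= s <= r")
    return j + Fraction(i * s, r + 1)


def brute_force_final_level(i, r, s):
    """Average number of non-relevant documents preceding the s-th
    relevant one, over all placements of r relevant documents among
    i + r positions in the final level."""
    total = Fraction(0)
    count = 0
    for positions in combinations(range(i + r), r):
        # positions is sorted, so positions[s - 1] is the s-th relevant
        # document; s - 1 relevant documents precede it.
        total += positions[s - 1] - (s - 1)
        count += 1
    return total / count


if __name__ == "__main__":
    print(expected_search_length(j=2, i=4, r=3, s=2))
    print(brute_force_final_level(i=4, r=3, s=2))
```

Exact fractions are used so that the closed form and the enumeration can be compared without floating-point tolerance.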

163

normalising the ESL by a factor proportional to the expected number of non-relevant documents collected for each relevant one ... which has been called the expected search length reduction factor by Cooper ... where (1) R is the total number of documents in the collection relevant to q; (2) I is the total number of documents in the collection non-relevant to q; (3) S is the total desired number of documents relevant to q ... The explicit form for ESL was given before ... which is known as the mean expected search length reduction factor ... Within the framework as stated at the head of this section this final measure meets the bill admirably ... For a further defence of its subjective nature see Cooper [1] ...
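The excerpt names the reduction factor but elides its formula. The sketch below assumes the form usually attributed to Cooper: an expected random search length ERSL(q) = S * I / (R + 1), a per-query factor (ERSL - ESL) / ERSL, and a mean factor computed from the averages of ERSL and ESL over the query set Q. All three forms, and all function names, are assumptions and are not taken from the excerpt.

```python
from fractions import Fraction


def expected_random_search_length(R, I, S):
    """Assumed form: ERSL = S * I / (R + 1), where R and I are the
    collection-wide counts of relevant and non-relevant documents
    and S is the desired number of relevant documents."""
    return Fraction(S * I, R + 1)


def esl_reduction_factor(esl, R, I, S):
    """Assumed per-query factor: (ERSL - ESL) / ERSL."""
    ersl = expected_random_search_length(R, I, S)
    return (ersl - Fraction(esl)) / ersl


def mean_esl_reduction_factor(queries):
    """Assumed mean factor over a query set.

    queries: iterable of (esl, R, I, S) tuples, one per query.
    Averages ERSL and ESL separately, then forms the ratio.
    """
    queries = list(queries)
    n = len(queries)
    mean_ersl = sum(expected_random_search_length(R, I, S)
                    for _, R, I, S in queries) / n
    mean_esl = sum(Fraction(esl) for esl, _, _, _ in queries) / n
    return (mean_ersl - mean_esl) / mean_ersl


if __name__ == "__main__":
    print(esl_reduction_factor(4, R=3, I=12, S=2))
    print(mean_esl_reduction_factor([(4, 3, 12, 2), (1, 1, 8, 1)]))
```

Under these assumptions a factor of 0 means the ranking does no better than a random ordering, and a factor of 1 means no non-relevant documents had to be examined.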

106

account of past performance ... Consider now a retrieval strategy that has been implemented by means of a matching function M ... It is the aim of every retrieval strategy to retrieve the relevant documents A and withhold the non-relevant documents Ā ... the decision procedure M(Q, D) - T > 0 corresponds to a linear discriminant function used to linearly separate two sets A and Ā in R^t ... M(Q0, D) > T whenever D ∈ A and M(Q0, D) < T whenever D ∈ Ā. The interesting thing is that starting with any Q we can adjust it iteratively using feedback information so that it will converge to Q0 ...

145

automatic and interactive retrieval system? Studies to gauge this are going on but results are hard to interpret ... It should be apparent now that in evaluating an information retrieval system we are mainly concerned with providing data so that users can make a decision as to (1) whether they want such a system (social question) and (2) whether it will be worth it ... The second question (what to evaluate?) boils down to what can we measure that will reflect the ability of the system to satisfy the user ... (1) The coverage of the collection, that is, the extent to which the system includes relevant matter; (2) the time lag, that is, the average interval between the time the search request is made and the time an answer is given; (3) the form of presentation of the output; (4) the effort involved on the part of the user in obtaining answers to his search requests; (5) the recall of the system, that is, the proportion of relevant material actually retrieved in answer to a search request; (6) the precision of the system, that is, the proportion of retrieved material that is actually relevant ... It is claimed that (1) to (4) are readily assessed ...
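The definitions of recall and precision in the excerpt above translate directly into set arithmetic. The following is a minimal sketch; the function name and the document identifiers are hypothetical, and returning 0 when a set is empty is a convention chosen here, not something the excerpt specifies.

```python
from fractions import Fraction


def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one search request.

    recall:    proportion of relevant material actually retrieved
    precision: proportion of retrieved material that is relevant
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)
    # Assumption: report 0 instead of dividing by zero on an empty set.
    recall = Fraction(hits, len(relevant)) if relevant else Fraction(0)
    precision = Fraction(hits, len(retrieved)) if retrieved else Fraction(0)
    return recall, precision


if __name__ == "__main__":
    retrieved = {"d1", "d2", "d3", "d4"}  # hypothetical system output
    relevant = {"d2", "d4", "d7"}         # hypothetical relevance judgements
    print(recall_precision(retrieved, relevant))
```

In this example two of the three relevant documents are retrieved and two of the four retrieved documents are relevant.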
