Page 119 Concepts and similar pages

Concepts

Similarity Concept
Relevance weight
Weighting function
Index term
Generality
Information retrieval definition
Operational information retrieval
Index term weighting
Experimental information retrieval
Term
Relevance

Similar pages

Similarity Page Snapshot
133 ... It must be emphasised that in the non-linear case the estimation of the parameters for g(x) will ideally involve a different MST for each of P(x | w1) and P(x | w2) ... There is a choice of how one would implement the model for g(x) depending on whether one is interested in setting the cut-off a priori or a posteriori ... If one assumes that the cut-off is set a posteriori then we can rank the documents according to P(w1 | x) and leave the user to decide when he has seen enough ... to calculate (estimate) the probability of relevance for each document x ...
108 If the summations, instead of being over A and Ā, are now made over A ∩ Bi and Ā ∩ Bi, where Bi is the set of retrieved documents on the i-th iteration, then we have a query formulation which is optimal for Bi, a subset of the document collection ... where w1 and w2 are weighting coefficients ... Experiments have shown that relevance feedback can be very effective ... Finally, a few comments about the technique of relevance feedback in general ... Bibliographic remarks: The book by Lancaster and Fayen [16] ... has written an interesting survey article about on-line searching ...
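The query-reformulation step this snippet alludes to can be sketched in a few lines of Python: the new query adds, with weighting coefficients w1 and w2, the centroid of the retrieved relevant documents and subtracts the centroid of the retrieved non-relevant ones. The function name, the vectors, and the default coefficient values are illustrative assumptions, not values from the text.

```python
# Sketch of one iteration of relevance feedback on term-weight vectors.
# The coefficients w1 and w2 below are illustrative defaults.

def feedback_query(query, relevant, nonrelevant, w1=0.75, w2=0.25):
    """One iteration of query reformulation from relevance judgements."""
    dims = len(query)

    def centroid(docs):
        if not docs:
            return [0.0] * dims
        return [sum(d[k] for d in docs) / len(docs) for k in range(dims)]

    rel, nonrel = centroid(relevant), centroid(nonrelevant)
    # Clip negative components to zero, a common convention.
    return [max(0.0, query[k] + w1 * rel[k] - w2 * nonrel[k])
            for k in range(dims)]

q_new = feedback_query([1.0, 0.0], [[1.0, 1.0]], [[0.0, 1.0]])
```

Iterating this over successive retrieved sets Bi corresponds to the per-iteration optimal formulation described above.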
137 the different contributions made to the measure by the different cells ... Discrimination gain hypothesis: In the derivation above I have made the assumption of independence or dependence in a straightforward way ... P(xi, xj) = P(xi, xj | w1)P(w1) + P(xi, xj | w2)P(w2) and P(xi)P(xj) = [P(xi | w1)P(w1) + P(xi | w2)P(w2)][P(xj | w1)P(w1) + P(xj | w2)P(w2)]. If we assume conditional independence on both w1 and w2 then P(xi, xj) = P(xi | w1)P(xj | w1)P(w1) + P(xi | w2)P(xj | w2)P(w2). For unconditional independence as well, we must have P(xi, xj) = P(xi)P(xj). This will only happen when P(w1) = 0 or P(w2) = 0, or P(xi | w1) = P(xi | w2), or P(xj | w1) = P(xj | w2); in words, when at least one of the index terms is useless at discriminating relevant from non-relevant documents ... Kendall and Stuart [26] define a partial correlation coefficient for any two distributions by
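The claim in this snippet — that conditional independence on both w1 and w2 gives unconditional independence only when a term fails to discriminate — can be checked numerically. All probabilities below are made up for illustration.

```python
# Numeric check: under conditional independence given w1 and w2,
# P(xi, xj) = P(xi)P(xj) only when a term has P(x | w1) = P(x | w2)
# (or one of the priors is zero).  Probabilities are for xi = 1, xj = 1.

def joint_and_product(pi_w1, pi_w2, pj_w1, pj_w2, prior_w1):
    """Return (P(xi, xj), P(xi) * P(xj)) under conditional independence."""
    prior_w2 = 1.0 - prior_w1
    joint = pi_w1 * pj_w1 * prior_w1 + pi_w2 * pj_w2 * prior_w2
    p_i = pi_w1 * prior_w1 + pi_w2 * prior_w2
    p_j = pj_w1 * prior_w1 + pj_w2 * prior_w2
    return joint, p_i * p_j

# Both terms discriminate: the joint differs from the product.
dep = joint_and_product(0.8, 0.2, 0.7, 0.3, 0.5)
# Term i is useless (P(xi | w1) == P(xi | w2)): joint equals product.
ind = joint_and_product(0.5, 0.5, 0.7, 0.3, 0.5)
```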
25 collection ... I am arguing that in using distributional information about index terms to provide, say, index term weighting we are really attacking the old problem of controlling exhaustivity and specificity ... These terms are defined in the introduction on page 10 ... If we go back to Luhn's original ideas, we remember that he postulated a varying discrimination power for index terms as a function of the rank order of their frequency of occurrence, the highest discrimination power being associated with the middle frequencies ... Attempts have been made to apply weighting based on the way the index terms are distributed in the entire collection ... The difference between the last mode of weighting and the previous one may be summarised by saying that document-frequency weighting places emphasis on content description, whereas weighting by specificity attempts to emphasise the ability of terms to discriminate one document from another ... Salton and Yang [24] have recently attempted to combine both methods of weighting by looking at both inter-document frequencies
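Weighting by how a term is distributed across the entire collection, as this snippet describes, can be illustrated with a short sketch: the fewer documents a term occurs in, the more specific it is taken to be, in the manner of inverse document frequency. The toy collection is an assumption for illustration.

```python
# Sketch of specificity weighting: weight = log(N / document frequency),
# so a term occurring in every document gets weight zero.
import math

def specificity_weights(collection):
    """Weight each term by log(N / df) over a list of term lists."""
    n = len(collection)
    df = {}
    for doc in collection:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    return {term: math.log(n / d) for term, d in df.items()}

docs = [["index", "term", "weighting"],
        ["term", "relevance"],
        ["term", "retrieval"]]
weights = specificity_weights(docs)
# "term" occurs in every document, so it is given weight log(1) = 0.
```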
128 objected to on the same grounds that one might object to the probability of Newton's Second Law of Motion being the case ... To approach the problem in this way would be useless unless one believed that for many index terms the distribution over the relevant documents is different from that over the non-relevant documents ... The elaboration in terms of ranking rather than just discrimination is trivial: the cut-off set by the constant in g(x) is gradually relaxed, thereby increasing the number of documents retrieved or assigned to the relevant category ... If one is prepared to let the user set the cut-off after retrieval has taken place, then the need for a theory about cut-off disappears ...
115 Basic probabilistic model: Since we are assuming that each document is described by the presence/absence of index terms, any document can be represented by a binary vector, x = (x1, x2, ...) where xi = 0 or 1 indicates absence or presence of the i-th index term ... w1 = document is relevant, w2 = document is non-relevant ... The theory that follows is at first rather abstract; the reader is asked to bear with it, since we soon return to the nuts and bolts of retrieval ... So, in terms of these symbols, what we wish to calculate for each document is P(w1 | x) and perhaps P(w2 | x) so that we may decide which is relevant and which is non-relevant ... Here P(wi) is the prior probability of relevance (i = 1) or non-relevance (i = 2); P(x | wi) is proportional to what is commonly known as the likelihood of relevance or non-relevance given x; in the continuous case this would be a density function and we would write p(x | wi) ... which is the probability of observing x on a random basis given that it may be either relevant or non-relevant ...
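The calculation this snippet sets up can be sketched directly: P(w1 | x) follows from Bayes' theorem once P(x | w1), P(x | w2) and the priors are available. The term-independence assumption used to factor the likelihoods, and all of the probability values below, are illustrative, not taken from the text.

```python
# Minimal sketch of the basic probabilistic model: a document is a
# binary vector x, and P(w1 | x) is computed by Bayes' theorem under
# an assumed independence of index terms given w1 and given w2.

def posterior_relevance(x, p_given_w1, p_given_w2, prior_w1):
    """Return P(w1 | x) for a binary document vector x."""
    prior_w2 = 1.0 - prior_w1
    like_w1 = like_w2 = 1.0  # build up P(x | w1) and P(x | w2)
    for xi, p1, p2 in zip(x, p_given_w1, p_given_w2):
        like_w1 *= p1 if xi else 1.0 - p1
        like_w2 *= p2 if xi else 1.0 - p2
    evidence = like_w1 * prior_w1 + like_w2 * prior_w2  # P(x)
    return like_w1 * prior_w1 / evidence

# Three hypothetical index terms; the first and third discriminate well.
p = posterior_relevance([1, 0, 1], [0.8, 0.3, 0.6], [0.2, 0.3, 0.1], 0.1)
```

Ranking documents by this posterior is exactly the a-posteriori cut-off scheme mentioned in the page 133 snippet above.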
113 any given document whether it is relevant or non-relevant ... PQ(relevance | document), where the Q is meant to emphasise that it is for a specific query ... P(relevance | document) ... Let us now assume, following Robertson [7], that: (1) the relevance of a document to a request is independent of other documents in the collection ... With this assumption we can now state a principle, in terms of probability of relevance, which shows that probabilistic information can be used in an optimal manner in retrieval ... The probability ranking principle ...
114 the system to its user will be the best that is obtainable on the basis of those data ... Of course this principle raises many questions as to the acceptability of the assumptions ... The probability ranking principle assumes that we can calculate P(relevance | document); not only that, it assumes that we can do it accurately ... So, returning now to the immediate problem, which is to calculate, or estimate, P(relevance | document) ...
125 document x for different settings of a pair of variables (xi, xj) ... and similarly for the other three settings of (xi, xj) ... This shows how simple the non-linear weighting function really is ... Estimation of parameters: The use of a weighting function of the kind derived above in actual retrieval requires the estimation of pertinent parameters ... Here I have adopted a labelling scheme for the cells in which [x] means the number of occurrences in the cell labelled x ...
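The cell-count estimation hinted at in this last snippet can be sketched as follows: for a pair of binary index terms (xi, xj), tally how often each of the four settings occurs in a sample of documents and use relative frequencies as the estimates. The sample documents below are made up for illustration.

```python
# Sketch of estimating the four (xi, xj) cell probabilities by counting
# occurrences of each setting over a document sample.
from collections import Counter

def cell_frequencies(docs, i, j):
    """Relative frequency of each (xi, xj) setting over the documents."""
    counts = Counter((doc[i], doc[j]) for doc in docs)
    n = len(docs)
    return {cell: counts.get(cell, 0) / n
            for cell in [(0, 0), (0, 1), (1, 0), (1, 1)]}

docs = [(1, 1, 0), (1, 0, 1), (0, 1, 0), (1, 1, 1)]
est = cell_frequencies(docs, 0, 1)  # cells for the first term pair
```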