Similarity |
Page |
Snapshot |
| 128 |
objected to on the same grounds that one might object to the probability of Newton's Second Law of Motion being the case
...To approach the problem in this way would be useless unless one believed that, for many index terms, the distribution over the relevant documents is different from that over the non-relevant documents
...The elaboration in terms of ranking rather than just discrimination is trivial: the cut-off set by the constant in $g(x)$ is gradually relaxed, thereby increasing the number of documents retrieved (or assigned to the relevant category)
...If one is prepared to let the user set the cut-off after retrieval has taken place, then the need for a theory about cut-off disappears
... |
| 129 |
we work with the ratio. In the latter case we do not see the retrieval problem as one of discriminating between relevant and non-relevant documents; instead we merely wish to compute $P(\text{relevance} \mid x)$ for each document $x$ and present the user with documents in decreasing order of this probability
...The decision rules derived above are couched in terms of $P(x \mid w_i)$
...I will now proceed to discuss ways of using this probabilistic model of retrieval and at the same time discuss some of the practical problems that arise
...The curse of dimensionality. In deriving the decision rules I assumed that a document is represented by an $n$-dimensional vector, where $n$ is the size of the index term vocabulary
... |
| 114 |
the system to its user will be the best that is obtainable on the basis of those data
...Of course this principle raises many questions as to the acceptability of the assumptions
...The probability ranking principle assumes that we can calculate $P(\text{relevance} \mid \text{document})$; not only that, it assumes that we can do it accurately
...So, returning now to the immediate problem, which is to calculate, or estimate, $P(\text{relevance} \mid \text{document})$
... |
| 117 |
D1 and D2 can be shown to be equivalent under certain conditions
...$[P(x \mid w_1)P(w_1) > P(x \mid w_2)P(w_2) \rightarrow x \text{ is relevant}, x \text{ is non-relevant}]$ (D3). Notice that $P(x)$ has disappeared from the equation since it does not affect the outcome of the decision
...$[R(w_1 \mid x) < R(w_2 \mid x)] \equiv [(l_{21} - l_{11})P(x \mid w_1)P(w_1) > (l_{12} - l_{22})P(x \mid w_2)P(w_2)]$. When a special loss function is chosen, namely $l_{ij} = 0$ for $i = j$ and $l_{ij} = 1$ for $i \neq j$, which implies that no loss is assigned to a correct decision (quite reasonable) and unit loss to any error (not so reasonable), then we have $[R(w_1 \mid x) < R(w_2 \mid x)] \equiv [P(x \mid w_1)P(w_1) > P(x \mid w_2)P(w_2)]$, which shows the equivalence of D2 and D3, and hence of D1 and D2 under a binary loss function
...This completes the derivation of the decision rule to be used to decide relevance or non-relevance, or, to put it differently, to retrieve or not to retrieve
...Form of retrieval function. The previous section was rather abstract and left the connection of the various probabilities with IR rather open
... |
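The decision rules in the snippet above can be sketched in Python. This is only an illustration under stated assumptions: the function names, the argument names, and the convention that `loss[i][j]` is the loss for deciding class i+1 when class j+1 is true are all hypothetical, and the probabilities are taken as given numbers rather than estimated from data.

```python
def decide_min_risk(px_w1, px_w2, p_w1, p_w2, loss=((0.0, 1.0), (1.0, 0.0))):
    """Minimum-risk rule (D2): choose the class with the smaller conditional risk.

    px_w1, px_w2: P(x|w1), P(x|w2); p_w1, p_w2: priors P(w1), P(w2).
    loss: hypothetical 2x2 table (l11, l12), (l21, l22); default is binary loss.
    """
    (l11, l12), (l21, l22) = loss
    # Joint terms P(x|wi)P(wi); the common factor 1/P(x) is dropped because
    # it does not affect which risk is smaller.
    joint1 = px_w1 * p_w1
    joint2 = px_w2 * p_w2
    risk1 = l11 * joint1 + l12 * joint2  # risk of deciding "relevant"
    risk2 = l21 * joint1 + l22 * joint2  # risk of deciding "non-relevant"
    return "relevant" if risk1 < risk2 else "non-relevant"


def decide_d3(px_w1, px_w2, p_w1, p_w2):
    """Rule D3: compare P(x|w1)P(w1) with P(x|w2)P(w2) directly."""
    return "relevant" if px_w1 * p_w1 > px_w2 * p_w2 else "non-relevant"
```

With the default binary loss the two functions agree on every input, which mirrors the equivalence of D2 and D3. Passing a loss table that penalises a missed relevant document more heavily can change the decision for the same probabilities.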
| 133 |
3
...It must be emphasised that in the non-linear case the estimation of the parameters for $g(x)$ will ideally involve a different MST for each of $P(x \mid w_1)$ and $P(x \mid w_2)$
...There is a choice of how one would implement the model for $g(x)$ depending on whether one is interested in setting the cut-off a priori or a posteriori
...If one assumes that the cut-off is set a posteriori, then we can rank the documents according to $P(w_1 \mid x)$ and leave the user to decide when he has seen enough
...to calculate (estimate) the probability of relevance for each document $x$
... |
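The two ways of setting the cut-off mentioned in the snippet above can be illustrated with a short Python sketch. It assumes that an estimate of $P(w_1 \mid x)$ is already available for each document; the function names and the dictionary of scores are hypothetical.

```python
def rank(scores):
    """Return document ids in decreasing order of estimated P(w1|x).

    scores: dict mapping a document id to its estimated probability of relevance.
    """
    return sorted(scores, key=scores.get, reverse=True)


def retrieve_a_priori(scores, cutoff):
    """A priori cut-off: keep only documents scoring above a fixed threshold."""
    return [doc for doc in rank(scores) if scores[doc] > cutoff]


# A posteriori cut-off: present the full ranking and let the user stop reading.
example_scores = {"d1": 0.2, "d2": 0.9, "d3": 0.5}
full_ranking = rank(example_scores)
```

In the a posteriori case no threshold is needed at all: the system returns `full_ranking` and the user decides how far down the list to go.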
| 112 |
of presenting the basic theory; I have chosen to present it in such a way that connections with other fields such as pattern recognition are easily made
...The fundamental mathematical tool for this chapter is Bayes' Theorem: most of the equations derive directly from it
...This was recognised by Maron in his 'The Logic Behind a Probabilistic Interpretation' as early as 1964 [4]
...Remember that the basic instrument we have for trying to separate the relevant from the non-relevant documents is a matching function, whether it be that we are in a clustered environment or an unstructured one
...It will be assumed in the sequel that the documents are described by binary state attributes, that is, absence or presence of index terms
...Estimation or calculation of relevance. When we search a document collection, we attempt to retrieve relevant documents without retrieving non-relevant ones
... |
| 6 |
language input and storage more feasible
...The reader will have noticed that already, the idea of relevance has slipped into the discussion
...Intellectually it is possible for a human to establish the relevance of a document to a query
...An information retrieval system. Let me illustrate by means of a black box what a typical IR system would look like
...Starting with the input side of things
... |
| 119 |
and The importance of writing it this way, apart from its simplicity, is that for each document $x$, to calculate $g(x)$ we simply add the coefficients $c_i$ for those index terms that are present, i.e.
...The constant $C$, which has been assumed the same for all documents $x$, will of course vary from query to query, but it can be interpreted as the cut-off applied to the retrieval function
...Let us now turn to the other part of $g(x)$, namely $c_i$, and let us try to interpret it in terms of the conventional contingency table
...There will be one such table for each index term; I have shown it for the index term $i$ although the subscript $i$ has not been used in the cells
...This is in fact the weighting formula F4 used by Robertson and Sparck Jones [1] in their so-called retrospective experiments
... |
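The linear form of $g(x)$ described in the snippet above can be sketched in Python. The formula itself is missing from the snippet, so this sketch assumes the usual contingency-table form of the Robertson and Sparck Jones weight, $c_i = \log \frac{r(N - n - R + r)}{(n - r)(R - r)}$, with hypothetical cell names: `N` documents in the collection, `R` of them relevant, `n` containing the term, and `r` both relevant and containing the term. All cells are assumed to be non-zero.

```python
import math


def term_weight(r, n, R, N):
    """Assumed form of the coefficient c_i, computed from one contingency table.

    r: relevant documents containing the term
    n: all documents containing the term
    R: all relevant documents
    N: all documents in the collection
    """
    return math.log((r * (N - n - R + r)) / ((n - r) * (R - r)))


def g(doc_terms, weights, C=0.0):
    """Add the coefficients of the index terms present in the document, plus C.

    doc_terms: set of index terms present in the document
    weights: dict mapping each query term to its coefficient c_i
    C: per-query constant, playing the role of the cut-off
    """
    return sum(weights[t] for t in doc_terms if t in weights) + C
```

Because the document is a binary vector, absent terms contribute nothing, so computing $g(x)$ costs one addition per matching term.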
| 111 |
Six. PROBABILISTIC RETRIEVAL. Introduction. So far in this book we have made very little use of probability theory in modelling any sub-system in IR
...Perhaps it is as well to warn the reader that some of the material in this chapter is rather mathematical
... |