Concept: Dependence stochastic

Similar concepts

Similarity

Concept

Generality

Term

Relevance

Probability of relevance

Document clustering

Term dependence

Index term

Document representative

Typical document

Independence stochastic

Pages with this concept

Similarity

Page

Snapshot

120

convenience let us set ... There are a number of ways of looking at K_i ... Typically the weight K_i(N, r, n, R) is estimated from a contingency table in which N is not the total number of documents in the system but instead is some subset specifically chosen to enable K_i to be estimated ... The index terms are not independent. Although it may be mathematically convenient to assume that the index terms are independent, it by no means follows that it is realistic to do so ...

129

we work with the ratio ... In the latter case we do not see the retrieval problem as one of discriminating between relevant and non-relevant documents; instead we merely wish to compute P(relevance | x) for each document x and present the user with documents in decreasing order of this probability ... The decision rules derived above are couched in terms of P(x | w_i) ... I will now proceed to discuss ways of using this probabilistic model of retrieval and at the same time discuss some of the practical problems that arise ... The curse of dimensionality. In deriving the decision rules I assumed that a document is represented by an n-dimensional vector, where n is the size of the index term vocabulary ...

121

In general the dependence can be arbitrarily complex, as the following identity illustrates: P(x) = P(x_1) P(x_2 | x_1) P(x_3 | x_1, x_2) ... Therefore, to capture all dependence data we would need to condition each variable in turn on a steadily increasing set of other variables ... where m_1, m_2, ... P_t(x) = P(x_1) P(x_2 | x_1) P(x_3 | x_2) P(x_4 | x_2) P(x_5 | x_2) P(x_6 | x_5). Notice how similar the A2 assumption is to the independence assumption A1, the only difference being that in A2 each factor has a conditioning variable associated with it ... The permutation and the function j ... write the function P_t(x) the way I did with x_i as the unconditioned variable, and hence the root of the tree, and all others consistently conditioned each on its parent node; in fact any one of the nodes of the tree could be singled out as the root, as long as the conditioning is done consistently with respect to the new root node ...
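
The tree-dependence product quoted in the snippet for page 121 can be illustrated with a minimal Python sketch. It assumes binary index terms and uses the tree from the snippet (x_1 as root, x_2 conditioned on x_1, x_3, x_4 and x_5 conditioned on x_2, x_6 conditioned on x_5). The names PARENT, P_ONE, tree_probability and all_vectors are hypothetical, and every probability value is invented purely for illustration; none of them comes from the source.

```python
from itertools import product

# Parent of each variable in the dependence tree from the snippet:
# P_t(x) = P(x_1) P(x_2|x_1) P(x_3|x_2) P(x_4|x_2) P(x_5|x_2) P(x_6|x_5)
PARENT = {1: None, 2: 1, 3: 2, 4: 2, 5: 2, 6: 5}

# Hypothetical parameters (assumption, not from the source):
# P_ONE[i][v] is the probability that x_i = 1 given that its parent has value v.
# The root has no parent, so its single entry is keyed by None.
P_ONE = {
    1: {None: 0.3},
    2: {0: 0.2, 1: 0.7},
    3: {0: 0.1, 1: 0.6},
    4: {0: 0.4, 1: 0.5},
    5: {0: 0.25, 1: 0.8},
    6: {0: 0.15, 1: 0.9},
}


def tree_probability(x, parent=PARENT, p_one=P_ONE):
    """Return P_t(x) for a dict x mapping variable index to 0 or 1."""
    prob = 1.0
    for i, par in parent.items():
        # Each factor is conditioned on at most one other variable: the parent.
        cond = None if par is None else x[par]
        p1 = p_one[i][cond]
        prob *= p1 if x[i] == 1 else 1.0 - p1
    return prob


def all_vectors(n=6):
    """Yield every binary document vector of length n as an index -> value dict."""
    for bits in product((0, 1), repeat=n):
        yield dict(zip(range(1, n + 1), bits))


# Sanity check: a product of proper conditionals sums to 1 over all 2^6 vectors.
total = sum(tree_probability(x) for x in all_vectors())
```

Each variable here needs at most two parameters (one per parent value), whereas the full chain-rule identity would condition x_6 on all five preceding variables. That difference is the practical appeal of restricting each factor to a single conditioning variable.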