Discrimination gain hypothesis

Similar concepts

Similarity Concept
Relevance
Indexing
Probability of relevance
Document clustering
Document frequency weighting
Document representative
Index term
Maximally linked document
Typical document
Inverse document frequency weighting

Pages with this concept

Similarity Page Snapshot
138 where [[rho]]...[[rho]]X,Y W 0 which implies using the expression for the partial correlation that [[rho]]X,Y [[rho]]X,W [[rho]]Y,W Since [[rho]]X,Y <1,[[rho]]X,W <1,[[rho]]Y,W <1 this in turn implies that under the hypothesis of conditional independence [[rho]]X,Y <[[rho]]X,W or [[rho]]Y,W Hence if W is a random variable representing relevance then thecorrelation between it and either index term is greater than the correlation between the index terms ...Qualitatively I shall try and generalise this to functions other than correlation coefficients,Linfott [27]defines a type of informational correlation measure by rij 1 exp 2 I xi,xj [1 2]0 <rij <1 or where I xi,xj is the now familiar expected mutual information measure ...I xi,xj <I xi,W or I xj,W,where I ...Discrimination Gain Hypothesis:Under the hypothesis ofconditional independence the statistical information contained in oneindex term about another is less than the information contained ineither index term about relevance ...
137 the different contributions made to the measure by the different cells ...Discrimination gain hypothesis In the derivation above I have made the assumption of independence or dependence in a straightforward way ...P xi,xj P xi,xj w 1 P w 1 P xi,xi w 2 P w 2 P xi P xj [P xi w 1 P w 1 P xi,w 2 P w 2][P xj w 1 P w 1 P xj,w 2 P w 2]If we assume conditional independence on both w 1 and w 2 then P xi,xj P xi,w 1 P xj,w 1 P w 1 P xi w 2 P xj w 2 P w 2 For unconditional independence as well,we must have P xi,xj P xi P xj This will only happen when P w 1 0 or P w 2 0,or P xi w 1 P xi w 2,or P xj w 1 P xj w 2,or in words,when at least one of the index terms is useless at discriminating relevant from non relevant documents ...Kendall and Stuart [26]define a partial correlation coefficient for any two distributions by