Similar concepts
Pages with this concept
Similarity | Page | Snapshot
| 42 |
nice property of being invariant under one-to-one transformations of the co-ordinates
...A function very similar to the expected mutual information measure was suggested by Jardine and Sibson [2], specifically to measure dissimilarity between two classes of objects
...Here u and v are positive weights adding to unity
...If we set P(x) = P(x|w1)P(w1) + P(x|w2)P(w2), x = 0,1, and P(x,wi) = P(x|wi)P(wi), i = 1,2, we recover the expected mutual information measure I(x,wi)
... |
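The recovery of EMIM from these definitions can be sketched numerically. A minimal sketch in Python, assuming hypothetical values for P(wi) and P(x|wi); the function and variable names here are mine, not the text's:

```python
import math

def emim(p_joint):
    """Expected mutual information I(x, w) for a binary attribute x
    and a two-class variable w, given the 2x2 joint distribution
    p_joint[x][i] = P(x, w_i)."""
    px = [sum(row) for row in p_joint]          # P(x), marginal over classes
    pw = [sum(col) for col in zip(*p_joint)]    # P(w_i), marginal over x
    total = 0.0
    for x in (0, 1):
        for i in (0, 1):
            if p_joint[x][i] > 0:
                total += p_joint[x][i] * math.log(
                    p_joint[x][i] / (px[x] * pw[i]))
    return total

# build P(x, w_i) = P(x|w_i) P(w_i) as in the text (hypothetical numbers)
p_w = [0.3, 0.7]                 # P(w_1), P(w_2)
p_x_given_w = [[0.8, 0.2],       # P(x=0|w_1), P(x=0|w_2)
               [0.2, 0.8]]       # P(x=1|w_1), P(x=1|w_2)
joint = [[p_x_given_w[x][i] * p_w[i] for i in (0, 1)] for x in (0, 1)]
print(round(emim(joint), 4))
```

Note that when x and w are independent the joint factorises and every log term vanishes, so the measure is zero, as the probabilistic reading requires.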
| 41 |
keyword is indicated by a zero or a one in the i-th position respectively
...where the summation is over the total number of different keywords in the document collection
...Salton considered document representatives as binary vectors embedded in an n-dimensional Euclidean space, where n is the total number of index terms
...can then be interpreted as the cosine of the angular separation of the two binary vectors X and Y
...where (X,Y) is the inner product and
...X = (x1, ... we get ... Some authors have attempted to base a measure of association on a probabilistic model [18] ... When xi and xj are independent, P(xi)P(xj) = P(xi,xj) and so I(xi,xj) = 0
... |
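The cosine interpretation can be sketched directly: the inner product of the two binary vectors divided by the product of their Euclidean lengths. A minimal illustration, assuming two hypothetical keyword vectors over a six-term vocabulary:

```python
import math

def cosine(x, y):
    """Cosine of the angular separation of two binary keyword
    vectors: inner product over the product of vector lengths."""
    inner = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return inner / (norm_x * norm_y)

# two hypothetical document representatives
X = [1, 1, 0, 1, 0, 0]
Y = [1, 0, 0, 1, 1, 0]
print(round(cosine(X, Y), 4))  # -> 0.6667
```

For binary vectors the inner product is simply the number of shared index terms, so the measure is the shared-term count normalised by the geometric mean of the documents' term counts.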
| 123 |
probability function P(x), and of course a better approximation than the one afforded by making Assumption A1
...The goodness of the approximation is measured by a well-known function (see, for example, Kullback [12]); if P(x) and Pa(x) are two discrete probability distributions then I(P, Pa) = Σ P(x) log [P(x)/Pa(x)] is a measure of the extent to which Pa(x) approximates P(x). That this is indeed the case is shown by Ku and Kullback [11]
...If the extent to which two index terms i and j deviate from independence is measured by the expected mutual information measure (EMIM, see Chapter 3, p. 41)
...then the best approximation Pt(x), in the sense of minimising I(P, Pt), is given by the maximum spanning tree (MST, see Chapter 3, p. ...)
...is a maximum
...One way of looking at the MST is that it incorporates the most significant of the dependences between the variables, subject to the global constraint that the sum of them should be a maximum
... |
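The construction described, retaining the most significant pairwise dependences subject to the spanning-tree constraint, can be sketched with Kruskal's algorithm run on term pairs sorted by decreasing EMIM. The weights below are hypothetical stand-ins, not values from the text:

```python
def maximum_spanning_tree(n, weights):
    """Kruskal's algorithm on edges taken in decreasing weight order.
    weights: dict mapping index-term pairs (i, j) to EMIM values."""
    parent = list(range(n))  # union-find forest over the n terms

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    tree = []
    for (i, j), w in sorted(weights.items(), key=lambda e: -e[1]):
        ri, rj = find(i), find(j)
        if ri != rj:               # edge joins two components: no cycle
            parent[ri] = rj
            tree.append((i, j, w))
    return tree

# hypothetical EMIM values between four index terms
emim_weights = {(0, 1): 0.9, (0, 2): 0.4, (1, 2): 0.7,
                (1, 3): 0.2, (2, 3): 0.6}
print(maximum_spanning_tree(4, emim_weights))
```

Greedily keeping the heaviest acyclic edges is what makes the sum of retained dependences a maximum over all spanning trees.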
| 138 |
where ρ ... ρ(X,Y | W) = 0, which implies, using the expression for the partial correlation, that ρ(X,Y) = ρ(X,W)·ρ(Y,W). Since ρ(X,Y) < 1, ρ(X,W) < 1, ρ(Y,W) < 1, this in turn implies that under the hypothesis of conditional independence ρ(X,Y) < ρ(X,W) or ρ(Y,W). Hence if W is a random variable representing relevance, then the correlation between it and either index term is greater than the correlation between the index terms
...Qualitatively, I shall try and generalise this to functions other than correlation coefficients. Linfoot [27] defines a type of informational correlation measure by rij = [1 − exp(−2 I(xi,xj))]^(1/2), 0 ≤ rij < 1, where I(xi,xj) is the now familiar expected mutual information measure
...I(xi,xj) < I(xi,W) or I(xj,W), where I
...Discrimination Gain Hypothesis: Under the hypothesis of conditional independence, the statistical information contained in one index term about another is less than the information contained in either index term about relevance
... |
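Linfoot's transformation is easy to compute directly, and since rij is a monotone increasing function of I(xi,xj), the ordering asserted by the hypothesis for the information measures carries over to the rij unchanged. A minimal sketch (the function name is mine):

```python
import math

def informational_correlation(emim_value):
    """Linfoot's r_ij = [1 - exp(-2 I)]**(1/2), mapping an EMIM
    value I >= 0 (in nats) onto the interval [0, 1)."""
    return math.sqrt(1.0 - math.exp(-2.0 * emim_value))

print(round(informational_correlation(0.5), 4))
```

Zero information gives rij = 0, and rij approaches 1 only as the information grows without bound, which is what makes it behave like a correlation coefficient.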
| 57 |
second tree is quite different from the first: the nodes, instead of representing clusters, represent the individual objects to be clustered
...The MST contains more information than the single-link hierarchy, and only indirectly information about the single-link clusters
...The representation of the single-link hierarchy through an MST has proved very useful in connecting single link with other clustering techniques [51] ... Implementation of classification methods: It is fairly difficult to talk about the implementation of an automatic classification method without at the same time referring to the file |
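The indirect route from the MST to the single-link clusters can be sketched: at dissimilarity level h, the single-link clusters are the connected components that remain after deleting every MST edge heavier than h. A minimal sketch under that reading, with hypothetical objects and edge weights:

```python
def single_link_clusters(n, mst_edges, level):
    """Connected components of the MST after removing every edge
    whose dissimilarity exceeds `level`; these components are the
    single-link clusters at that level."""
    parent = list(range(n))  # union-find forest over the n objects

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i, j, w in mst_edges:
        if w <= level:                # keep only edges at or below the level
            parent[find(i)] = find(j)
    clusters = {}
    for obj in range(n):
        clusters.setdefault(find(obj), []).append(obj)
    return sorted(clusters.values())

# hypothetical MST over five objects
mst = [(0, 1, 0.2), (1, 2, 0.5), (3, 4, 0.3), (2, 3, 0.8)]
print(single_link_clusters(5, mst, 0.5))  # -> [[0, 1, 2], [3, 4]]
```

Sweeping the level from 0 upward regenerates the whole single-link hierarchy, which is the sense in which the MST carries the hierarchy only indirectly.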
| 136 |
probability functions we can write the information radius as follows. The interesting interpretation of the information radius that I referred to above is illustrated most easily in terms of continuous probability functions
...R(u1, u2 : v) = u·I(u1 : v) + v·I(u2 : v), where I(ui : v) measures the expectation on ui of the information in favour of rejecting v for ui given by making an observation; it may be regarded as the information gained from being told to reject v in favour of ui
...thereby removing the arbitrary v
...v = u·u1 + v·u2, that is, an average of the two distributions to be discriminated
...p(x) = p(x|w1)P(w1) + p(x|w2)P(w2), defined over the entire collection without regard to relevance
...There is one technical problem associated with the use of the information radius, or any other discrimination measure based on all four cells of the contingency table, which is rather difficult to resolve
... |
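For discrete distributions the expression for R can be sketched directly. A minimal sketch, assuming two hypothetical distributions u1, u2 and positive weights u, v with u + v = 1; the directed divergence I(p : q) is the Kullback measure quoted earlier:

```python
import math

def kl(p, q):
    """Directed divergence I(p : q) = sum p log(p/q) over a
    discrete sample space."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def information_radius(u1, u2, u, v):
    """R = u*I(u1 : v_bar) + v*I(u2 : v_bar), with the arbitrary
    reference distribution replaced by the weighted average
    v_bar = u*u1 + v*u2 of the distributions being discriminated."""
    v_bar = [u * a + v * b for a, b in zip(u1, u2)]
    return u * kl(u1, v_bar) + v * kl(u2, v_bar)

print(round(information_radius([0.9, 0.1], [0.1, 0.9], 0.5, 0.5), 4))
```

With equal weights u = v = 1/2 this quantity is symmetric in u1 and u2, vanishes only when the two distributions coincide, and is bounded above by log 2, which is what makes it attractive as a discrimination measure.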
| 193 |
BELL,J
...BELNAP,N
...BELNAP,N
...BENTLEY,J
...BENTLEY,J
...BERGE,C
...BERTZISS,A
...BONNER,R
...BONO,P
...BOOKSTEIN,A
...BOOKSTEIN,A
...BOOKSTEIN,A
...BOOKSTEIN,A
...BORKO,H
...BORKO,H
...BORKO,H
...BOULTON,D
...BOULTON,D
...BOX,G
...BROOKES,B
...BURKHARD,W
...BURKHARD,W
...CARROLL,J
...CAWKELL,A
... |