Dissimilarity coefficient

Similar concepts

Similarity Concept
Document clustering
Cluster methods
Document representative
Heuristic cluster methods
Clustering
Stratified hierarchic cluster methods
Maximally linked document
Cluster representative
Hierarchic cluster methods
Graph theoretic cluster methods

Pages with this concept

Similarity Page Snapshot
54 the hierarchy one can identify a set of classes, and as one moves up the hierarchy the classes at the lower levels are nested in the classes at the higher levels ... It is now a simple matter to define single link in terms of these graphs; at any level a single link cluster is precisely the set of vertices of a connected component of the graph at that level ... corresponding clusters at those levels ...
40 pertain to documents, such as index tags, being careful of course to deal with the same number of index tags for each document ... I now return to the promised mathematical definition of dissimilarity ... If $P$ is the set of objects to be clustered, a pairwise dissimilarity coefficient $D$ is a function from $P \times P$ to the non-negative real numbers ... D1: $D(X, Y) \geq 0$ for all $X, Y \in P$; D2: $D(X, X) = 0$ for all $X \in P$; D3: $D(X, Y) = D(Y, X)$ for all $X, Y \in P$. Informally, a dissimilarity coefficient is a kind of distance function ... D4: $D(X, Y) \leq D(X, Z) + D(Y, Z)$, which may be recognised as the theorem from Euclidean geometry which states that the sum of the lengths of two sides of a triangle is always greater than the length of the third side ... An example of a dissimilarity coefficient satisfying D1 to D4 is ... where $X \Delta Y = (X \cup Y) \setminus (X \cap Y)$ is the symmetric difference of sets $X$ and $Y$ ... and is monotone with respect to Jaccard's coefficient subtracted from 1 ...
55 This description immediately leads to an inefficient algorithm for the generation of single link classes ...
45 efficiency of implementation for a particular application ... An example of an ordered classification is a hierarchy ... The discussion about classification has been purposely vague up to this point ... Let me now be more specific about current and past approaches to classification, particularly in the context of information retrieval ... The cluster hypothesis. Before describing the battery of classification methods that are now used in information retrieval, I should like to discuss the underlying hypothesis for their use in document clustering ... A basic assumption in retrieval systems is that documents relevant to a request are separated from those which are not relevant, i ... (a) both of which are relevant to a request, and (b) one of which is relevant and the other non-relevant ... Summing over a set of requests gives the relative distribution of
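
The two definitions quoted in the snapshots above (conditions D1 to D4 for a dissimilarity coefficient, and single link clusters as connected components of a graph at a given level) can be illustrated with a small Python sketch. Several things here are assumptions rather than anything taken from the source pages: documents are modelled as sets of index terms; the coefficient used is Jaccard's coefficient subtracted from 1, which the page 40 snapshot mentions only as a related quantity; and the function names, the sample documents and the threshold values are hypothetical. The pairwise scan is a naive illustration and is not claimed to be the algorithm discussed on page 55.

```python
from itertools import combinations, product


def jaccard_dissimilarity(x: frozenset, y: frozenset) -> float:
    """1 - |X & Y| / |X | Y|; taken to be 0.0 when both sets are empty (assumption)."""
    union = x | y
    if not union:
        return 0.0
    return 1.0 - len(x & y) / len(union)


def satisfies_d1_to_d4(objects, d, tol=1e-12) -> bool:
    """Brute-force check of D1-D4 over every pair and triple in `objects`."""
    for x in objects:
        if abs(d(x, x)) > tol:  # D2: D(X, X) = 0
            return False
    for x, y in product(objects, repeat=2):
        if d(x, y) < 0:  # D1: D(X, Y) >= 0
            return False
        if abs(d(x, y) - d(y, x)) > tol:  # D3: symmetry
            return False
    for x, y, z in product(objects, repeat=3):
        if d(x, y) > d(x, z) + d(y, z) + tol:  # D4: triangle inequality
            return False
    return True


def single_link_clusters(objects, d, level):
    """Connected components of the graph joining pairs with d <= level.

    Returns clusters as sorted lists of indices into `objects`.
    """
    parent = list(range(len(objects)))

    def find(i):
        # Follow parent pointers to the component root, halving paths as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(objects)), 2):
        if d(objects[i], objects[j]) <= level:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri  # merge the two components

    clusters = {}
    for i in range(len(objects)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical documents, each represented by a set of index terms.
docs = [
    frozenset({"a", "b", "c"}),
    frozenset({"b", "c", "d"}),
    frozenset({"c", "d", "e"}),
    frozenset({"x", "y"}),
]

if __name__ == "__main__":
    print(satisfies_d1_to_d4(docs, jaccard_dissimilarity))
    for level in (0.4, 0.5, 1.0):
        print(level, single_link_clusters(docs, jaccard_dissimilarity, level))
```

Running the sketch at increasing levels shows the nesting described in the page 54 snapshot: clusters found at a lower level are contained in the clusters found at a higher level. A union-find structure is used only because it keeps the connected-component step short; any graph traversal would serve equally well.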