| 40 |
pertain to documents, such as index tags, being careful of course to deal with the same number of index tags for each document
...I now return to the promised mathematical definition of dissimilarity
...If $P$ is the set of objects to be clustered, a pairwise dissimilarity coefficient $D$ is a function from $P \times P$ to the non-negative real numbers
...D1: $D(X, Y) \geq 0$ for all $X, Y \in P$; D2: $D(X, X) = 0$ for all $X \in P$; D3: $D(X, Y) = D(Y, X)$ for all $X, Y \in P$. Informally, a dissimilarity coefficient is a kind of distance function
...D4: $D(X, Y) \leq D(X, Z) + D(Y, Z)$, which may be recognised as the theorem from Euclidean geometry which states that the sum of the lengths of two sides of a triangle is always greater than the length of the third side
...An example of a dissimilarity coefficient satisfying D1-D4 is one based on the symmetric difference of sets $X$ and $Y$, where $X \Delta Y = (X \cup Y) - (X \cap Y)$
...and is monotone with respect to Jaccard's coefficient subtracted from 1
... |
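
The conditions D1 to D4 quoted above can be tested mechanically on a small collection of index-term sets. The sketch below is illustrative only: the function names (`sym_diff_size`, `jaccard_dissimilarity`, `check_axioms`) and the sample documents are assumptions, not taken from the source, and the two coefficients shown (the plain size of the symmetric difference, and Jaccard's coefficient subtracted from 1) are chosen for illustration rather than being the exact coefficient the excerpt refers to. A brute-force check over a finite sample can only refute an axiom, not prove it in general.

```python
from itertools import product


def sym_diff_size(x, y):
    """Size of the symmetric difference: elements in exactly one of x, y."""
    return len(x ^ y)


def jaccard_dissimilarity(x, y):
    """Jaccard's coefficient subtracted from 1.

    Assumption: two empty sets are treated as identical (dissimilarity 0.0).
    """
    union = x | y
    if not union:
        return 0.0
    return len(x ^ y) / len(union)


def check_axioms(d, objects, tol=1e-12):
    """Report which of D1-D4 hold for d on every combination of objects."""
    holds = {"D1": True, "D2": True, "D3": True, "D4": True}
    for x in objects:
        if abs(d(x, x)) > tol:                  # D2: D(X, X) = 0
            holds["D2"] = False
    for x, y in product(objects, repeat=2):
        if d(x, y) < -tol:                      # D1: D(X, Y) >= 0
            holds["D1"] = False
        if abs(d(x, y) - d(y, x)) > tol:        # D3: symmetry
            holds["D3"] = False
    for x, y, z in product(objects, repeat=3):
        if d(x, y) > d(x, z) + d(y, z) + tol:   # D4: triangle inequality
            holds["D4"] = False
    return holds


# Hypothetical documents, each represented by its set of index terms.
docs = [
    frozenset(),
    frozenset({"a"}),
    frozenset({"b"}),
    frozenset({"a", "b"}),
    frozenset({"a", "b", "c"}),
]

print(check_axioms(sym_diff_size, docs))
print(check_axioms(jaccard_dissimilarity, docs))
```

Passing any other two-argument function as `d` gives a quick way to see whether a candidate coefficient behaves like a distance on the sample at hand.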