Similar concepts
Pages with this concept
Similarity |
Page |
Snapshot |
| 31 |
In practice many thesauri are constructed manually
...(1) words which are deemed to be about the same topic are linked; (2) words which are deemed to be about related things are linked
...The first kind of thesaurus connects words which are intersubstitutable, that is, it puts them into equivalence classes
...The second kind of thesaurus uses semantic links between words to, for example, relate them hierarchically
...However, methods have been proposed to construct thesauri automatically
...The basic relationship underlying the automatic construction of keyword classes is as follows: if keywords a and b are substitutable for one another, in the sense that we are prepared to accept a document containing one in response to a request containing the other, this will be because they have the same meaning or refer to a common subject or topic
...It is not difficult to see that, based on this principle, a classification of keywords can be automatically constructed, of which the classes are used analogously to those of the manual thesaurus mentioned before
...(1) replace each keyword in a document and query representative by the name of the class in which it occurs; (2) replace each keyword by all the keywords occurring in the class to which it belongs
... |
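The two ways of using keyword classes listed in the snippet above can be sketched in a few lines of Python. This is only an illustration: the class names (`C_VEHICLE`, `C_RETRIEVAL`), their member keywords, and the function names are hypothetical and do not come from the source text.

```python
from typing import Dict, List

# Hypothetical keyword classes (equivalence classes of substitutable keywords).
KEYWORD_CLASSES: Dict[str, List[str]] = {
    "C_VEHICLE": ["car", "automobile", "vehicle"],
    "C_RETRIEVAL": ["retrieval", "search"],
}

# Reverse index: keyword -> name of the class it belongs to.
CLASS_OF: Dict[str, str] = {
    kw: name for name, members in KEYWORD_CLASSES.items() for kw in members
}


def replace_by_class_name(keywords: List[str]) -> List[str]:
    """Option (1): replace each keyword by the name of its class.

    Keywords that belong to no class are kept unchanged (an assumption;
    the source does not say how unclassified keywords are handled).
    """
    result: List[str] = []
    for kw in keywords:
        label = CLASS_OF.get(kw, kw)
        if label not in result:  # keep first occurrence, drop duplicates
            result.append(label)
    return result


def expand_by_class_members(keywords: List[str]) -> List[str]:
    """Option (2): replace each keyword by all keywords in its class."""
    result: List[str] = []
    for kw in keywords:
        members = KEYWORD_CLASSES[CLASS_OF[kw]] if kw in CLASS_OF else [kw]
        for member in members:
            if member not in result:
                result.append(member)
    return result
```

Under option (1), a query containing `car` and a document containing `automobile` both reduce to the same class name and therefore match; under option (2) the match arises because each representative is expanded to include the other keyword.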
| 48 |
The second criterion for choice is the efficiency of the clustering process in terms of speed and storage requirements
...Efficiency is really a property of the algorithm implementing the cluster method
...In the main, two distinct approaches to clustering can be identified: (1) the clustering is based on a measure of similarity between the objects to be clustered; (2) the cluster method proceeds directly from the object descriptions
...The most obvious examples of the first approach are the graph-theoretic methods, which define clusters in terms of a graph derived from the measure of similarity
...A string is a connected sequence of objects from some starting point
...A connected component is a set of objects such that each object is connected to at least one other member of the set and the set is maximal with respect to this property
...A maximal complete subgraph is a subgraph such that each node is connected to every other node in the subgraph and the set is maximal with respect to this property, i.e.
...node were included anywhere the completeness condition would be violated
...A large class of hierarchic cluster methods is based on the initial measurement of similarity
... |
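The graph-theoretic definitions quoted above (connected component, maximal complete subgraph) can be made concrete with a small Python sketch. It assumes the graph is derived by linking two objects whenever their similarity reaches a threshold; the snippet only says the graph is "derived from the measure of similarity", so the thresholding rule, the function names, and the use of a basic Bron-Kerbosch recursion are illustrative choices rather than the source's method.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def threshold_graph(objects: List[str],
                    sim: Dict[FrozenSet[str], float],
                    threshold: float) -> Dict[str, Set[str]]:
    """Link two objects when their similarity is >= threshold (assumed rule).

    Pairs missing from `sim` are treated as having similarity 0.0.
    """
    adj: Dict[str, Set[str]] = {o: set() for o in objects}
    for a, b in combinations(objects, 2):
        if sim.get(frozenset((a, b)), 0.0) >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def connected_components(adj: Dict[str, Set[str]]) -> List[Set[str]]:
    """Maximal sets of mutually reachable nodes, found by depth-first search.

    Isolated nodes are returned as singleton components here.
    """
    seen: Set[str] = set()
    components: List[Set[str]] = []
    for start in adj:
        if start in seen:
            continue
        component: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node] - component)
        seen |= component
        components.append(component)
    return components


def maximal_complete_subgraphs(adj: Dict[str, Set[str]]) -> List[Set[str]]:
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques: List[Set[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        if not p and not x:
            cliques.append(set(r))  # no node can be added: r is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}  # v has been fully explored
            x = x | {v}  # exclude v from later cliques at this level

    expand(set(), set(adj), set())
    return cliques
```

For instance, with objects A to E, links A-B, A-C, B-C and C-D above the threshold, and E linked to nothing, the connected components are {A, B, C, D} and {E}, while the maximal complete subgraphs are {A, B, C}, {C, D} and {E}. The two definitions therefore give different clusters from the same similarity graph.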