Concept: Query expansion

Similar concepts

Similarity

Concept

Retrieval effectiveness

Information retrieval system

Data retrieval systems

Operational information retrieval

Information retrieval definition

Cluster based retrieval

Pages with this concept

Similarity

Page

Snapshot

If we think of a simple retrieval strategy as operating by matching on the descriptors, whether they be keyword names or class names, then expanding representatives in either of these ways will have the effect of increasing the number of matches between document and query, and hence tends to improve recall ... Recall is defined in the introduction ... Jones [41] has reported a large number of experiments using automatic keyword classifications and found that in general one obtained a better retrieval performance with the aid of automatic keyword classification than with the unclassified keywords alone ... Unfortunately, even here the evidence has not been conclusive ... The discussion of keyword classifications has by necessity been rather sketchy ... Normalisation: It is probably useful at this stage to recapitulate and show how a number of levels of normalisation of text are involved in generating document representatives ... Index term weighting can also be thought of as a process of normalisation, if the weighting scheme takes into account the number of different index terms per document ...

134

which from a computational point of view would simplify things enormously ... An alternative way of using the dependence tree: Association Hypothesis. Some of the arguments advanced in the previous section can be construed as implying that the only dependence tree we have enough information to construct is the one on the entire document collection ... The basic idea underlying term clustering was explained in Chapter 2 ... If an index term is good at discriminating relevant from non-relevant documents then any closely associated index term is also likely to be good at this ...
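The association idea in the snapshot above can be illustrated with a minimal sketch. It assumes a toy collection and uses the Dice coefficient over document co-occurrence as a stand-in association measure; the data, the function names `dice` and `expand_query`, and the threshold are all hypothetical and are not taken from the source, which builds its argument on a dependence tree rather than on this measure.

```python
# Toy collection: each document is a set of index terms (hypothetical data).
DOCS = [
    {"query", "expansion", "thesaurus"},
    {"query", "expansion", "recall"},
    {"thesaurus", "keyword", "classification"},
    {"query", "recall"},
]


def dice(term_a, term_b, docs):
    """Dice coefficient between the sets of documents containing each term."""
    with_a = {i for i, d in enumerate(docs) if term_a in d}
    with_b = {i for i, d in enumerate(docs) if term_b in d}
    if not with_a or not with_b:
        return 0.0
    return 2 * len(with_a & with_b) / (len(with_a) + len(with_b))


def expand_query(query, docs, threshold=0.6):
    """Add every term whose association with some query term meets the threshold."""
    vocabulary = set().union(*docs)
    expanded = set(query)
    for q in query:
        for t in vocabulary - set(query):
            if dice(q, t, docs) >= threshold:
                expanded.add(t)
    return expanded


expanded = expand_query({"query"}, DOCS)
```

With this toy data, "expansion" and "recall" each co-occur with "query" in two of its three documents, so they pass the assumed threshold and join the query, while "thesaurus" does not.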

140

derives from the work of Yu and his collaborators [28, 29] ... According to Doyle [32] p ... The model in this chapter also connects with two other ideas in earlier research ... or in words, for any document the probability of relevance is inversely proportional to the probability with which it will occur on a random basis ...

In practice many thesauri are constructed manually ... (1) words which are deemed to be about the same topic are linked; (2) words which are deemed to be about related things are linked ... The first kind of thesaurus connects words which are intersubstitutable, that is, it puts them into equivalence classes ... The second kind of thesaurus uses semantic links between words to, for example, relate them hierarchically ... However, methods have been proposed to construct thesauri automatically ... The basic relationship underlying the automatic construction of keyword classes is as follows: If keywords a and b are substitutable for one another in the sense that we are prepared to accept a document containing one in response to a request containing the other, this will be because they have the same meaning or refer to a common subject or topic ... It is not difficult to see that, based on this principle, a classification of keywords can be automatically constructed, of which the classes are used analogously to those of the manual thesaurus mentioned before ... (1) replace each keyword in a document and query representative by the name of the class in which it occurs; (2) replace each keyword by all the keywords occurring in the class to which it belongs ...
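The two ways of using keyword classes listed at the end of the snapshot above can be sketched as follows. This is an illustration under stated assumptions, not the book's implementation: the classes in `KEYWORD_CLASSES`, the class names, and the sample query and document are invented for the example.

```python
# Hypothetical keyword classes (a toy thesaurus); not taken from the source.
KEYWORD_CLASSES = {
    "C_VEHICLE": {"car", "automobile", "vehicle"},
    "C_RETRIEVAL": {"retrieval", "search"},
}


def class_of(keyword):
    """Return the name of the class containing keyword, or None if unclassified."""
    for name, members in KEYWORD_CLASSES.items():
        if keyword in members:
            return name
    return None


def replace_by_class_name(representative):
    """Method 1: replace each keyword by the name of the class in which it occurs."""
    return {class_of(k) or k for k in representative}


def expand_by_class_members(representative):
    """Method 2: replace each keyword by all keywords in the class it belongs to."""
    expanded = set()
    for k in representative:
        name = class_of(k)
        expanded |= KEYWORD_CLASSES[name] if name else {k}
    return expanded


query = {"car", "retrieval"}
document = {"automobile", "search", "index"}

# Matching on the raw keywords finds nothing in common.
plain_matches = query & document
# Matching on class names, or on expanded keyword sets, finds matches.
class_matches = replace_by_class_name(query) & replace_by_class_name(document)
member_matches = expand_by_class_members(query) & expand_by_class_members(document)
```

In this toy case the unexpanded query and document share no keywords, while both expansion methods produce matches, which is the recall-oriented effect the first snapshot describes.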