
If the summations, instead of being over $A$ and $\bar{A}$, are now made over $A \cap B_i$ and $\bar{A} \cap B_i$, where $B_i$ is the set of retrieved documents on the $i$th iteration, then we have a query formulation which is optimal for $B_i$, a subset of the document collection. By analogy to the linear classifier used before, we now add this vector to the query formulation on the $i$th step to get:

where $w_1$ and $w_2$ are weighting coefficients. Salton[2] in fact used a slightly modified version, the most important difference being that there is an option to generate $Q_{i+1}$ either from $Q_i$ or from $Q$, the original query. The effect of all these adjustments may be summarised by saying that the query is automatically modified so that index terms in relevant retrieved documents are given more weight (promoted) and index terms in non-relevant documents are given less weight (demoted).
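The promote/demote behaviour can be illustrated with a minimal sketch. It assumes the update has the form "new query = w1 times the current query + w2 times (mean of the length-normalised relevant retrieved vectors minus mean of the length-normalised non-relevant retrieved vectors)". The normalisation by set size and vector length, the function names, and the toy vectors are all assumptions made for illustration, not a transcription of Salton's procedure.

```python
import math

def normalise(vec):
    """Scale a term-weight vector to unit Euclidean length (zero vectors unchanged)."""
    length = math.sqrt(sum(x * x for x in vec))
    return [x / length for x in vec] if length else list(vec)

def centroid(vectors, dim):
    """Mean of the length-normalised vectors; the zero vector if the set is empty."""
    if not vectors:
        return [0.0] * dim
    total = [0.0] * dim
    for v in vectors:
        for k, x in enumerate(normalise(v)):
            total[k] += x
    return [t / len(vectors) for t in total]

def feedback_update(query, retrieved, relevant_ids, w1=1.0, w2=1.0):
    """One feedback step under the assumed form described above.

    query        -- current query vector (hypothetical term weights)
    retrieved    -- dict doc_id -> term-weight vector, the retrieved set B_i
    relevant_ids -- ids judged relevant; only those inside B_i are used
    """
    dim = len(query)
    rel = [v for d, v in retrieved.items() if d in relevant_ids]
    nonrel = [v for d, v in retrieved.items() if d not in relevant_ids]
    pos = centroid(rel, dim)      # relevant retrieved documents
    neg = centroid(nonrel, dim)   # non-relevant retrieved documents
    return [w1 * q + w2 * (p - n) for q, p, n in zip(query, pos, neg)]

# Toy data: three index terms, two retrieved documents, d1 judged relevant.
query = [1.0, 0.0, 0.0]
retrieved = {"d1": [1.0, 1.0, 0.0], "d2": [1.0, 0.0, 1.0]}
new_query = feedback_update(query, retrieved, relevant_ids={"d1"})
```

In this toy case the term that occurs only in the relevant document gains weight and the term that occurs only in the non-relevant document loses weight, while the shared term is unchanged. Passing the original query as `query` on every iteration, rather than the previous result, corresponds to the option of generating the new query from the original one.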

Experiments have shown that relevance feedback can be very effective. Unfortunately the extent of the effectiveness is hard to gauge, since it is difficult to separate the contribution to increased retrieval effectiveness produced when individual documents move up in rank from the contribution produced when new documents are retrieved. The latter, of course, is what the user most cares about.
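The two contributions can at least be told apart by simple bookkeeping. The sketch below is only an illustration under stated assumptions: the function name, the fixed rank cutoff, and the toy rankings are hypothetical, and it is not an evaluation method taken from the experiments referred to above. It splits the relevant documents in the post-feedback output into those the user had already been shown and those retrieved for the first time.

```python
def split_gain(first_ranking, second_ranking, relevant_ids, cutoff):
    """Partition the relevant documents in the top `cutoff` of the second ranking.

    Returns (already_seen, newly_retrieved):
      already_seen    -- relevant documents that were also in the top `cutoff`
                         of the first ranking (they can only have changed rank)
      newly_retrieved -- relevant documents that were not in the first top `cutoff`
    """
    seen_before = set(first_ranking[:cutoff])
    top_after = second_ranking[:cutoff]
    already_seen = [d for d in top_after
                    if d in relevant_ids and d in seen_before]
    newly_retrieved = [d for d in top_after
                       if d in relevant_ids and d not in seen_before]
    return already_seen, newly_retrieved

# Toy rankings before and after one feedback iteration (hypothetical ids).
first = ["d1", "d2", "d3", "d4", "d5", "d6"]
second = ["d3", "d5", "d1", "d6", "d2", "d4"]
relevant = {"d3", "d5"}
already_seen, newly_retrieved = split_gain(first, second, relevant, cutoff=3)
```

Here both relevant documents appear in the top three after feedback, but only one of them is new to the user; a measure computed over the whole output would credit the feedback step with both.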

Finally, a few comments about the technique of relevance feedback in general. It appears to me that its implementation on an operational basis may be more problematic. It is not clear how users are to assess the relevance or non-relevance of a document from such scanty evidence as citations. In an operational system it is easy to arrange for abstracts to be output, but it is likely that a user will need to browse through the retrieved documents themselves to determine their relevance, after which he is probably in a much better position to restate his query himself.

Bibliographic remarks

The book by Lancaster and Fayen[16] contains details of many operational on-line systems. Barraclough[17] has written an interesting survey article about on-line searching. Discussions on search strategies are usually found embedded in more general papers on information
