that this principle works so well is not yet clear (but see Yu and Salton's recent theoretical paper[39]).
The connection with term clustering was made earlier in the chapter.
The spanning tree can be looked upon as a classification of the index terms.
One of the important consequences of the model described in this chapter is that it lays down precisely how the tree should be used in retrieval.
Earlier work in this area was rather ad hoc and did not lead to conclusive results[40].
It should be clear now that the quantitative model embodies within one theory such diverse topics as term clustering, early association analysis, document frequency weighting, and relevance weighting.
References
1. ROBERTSON, S.E. and SPARCK JONES, K., 'Relevance weighting of search terms', Journal of the American Society for Information Science, 27, 129-146 (1976).
2. van RIJSBERGEN, C.J., 'A theoretical basis for the use of co-occurrence data in information retrieval', Journal of Documentation, 33, 106-119 (1977).
3. BOOKSTEIN, A. and KRAFT, D., 'Operations research applied to document indexing and retrieval decisions', Journal of the ACM, 24, 410-427 (1977).
4. MARON, M.E., 'Mechanized documentation: The logic behind a probabilistic interpretation', In: Statistical Association Methods for Mechanized Documentation (Edited by Stevens et al.) National Bureau of Standards, Washington, 9-13 (1965).
5. OSBORNE, M.L., 'A Modification of Veto Logic for a Committee of Threshold Logic Units and the Use of 2-class Classifiers for Function Estimation', Ph.D. Thesis, Oregon State University (1975).
6. GOOD, I.J., Probability and the Weighting of Evidence, Charles Griffin and Co. Ltd., London (1950).
7. ROBERTSON, S.E., 'The probability ranking principle in IR', Journal of Documentation, 33, 294-304 (1977).
8. GOFFMAN, W., 'A searching procedure for information retrieval', Information Storage and Retrieval, 2, 73-78 (1964).
9. WILLIAMS, J.H., 'Results of classifying documents with multiple discriminant functions', In: Statistical Association Methods for Mechanized Documentation (Edited by Stevens et al.) National Bureau of Standards, Washington, 217-224 (1965).
10. DE FINETTI, B., Theory of Probability, Vol. 1, 146-161, Wiley, London (1974).
11. KU, H.H. and KULLBACK, S., 'Approximating discrete probability distributions', IEEE Transactions on Information Theory, IT-15, 444-447 (1969).
12. KULLBACK, S., Information Theory and Statistics, Dover, New York (1968).
13. CHOW, C.K. and LIU, C.N., 'Approximating discrete probability distributions with dependence trees', IEEE Transactions on Information Theory, IT-14, 462-467 (1968).