| 27 |
Probabilistic indexing. In the past few years, a detailed quantitative model for automatic indexing based on some statistical assumptions about the distribution of words in text has been worked out by Bookstein, Swanson, and Harter [29, 30, 31] ... In their model they consider the difference in the distributional behaviour of words as a guide to whether a word should be assigned as an index term
... In general the parameter x will vary from word to word, and for a given word should be proportional to the length of the text
... The Bookstein-Swanson-Harter model assumes that specialty words are content-bearing whereas function words are not
... |