Zipf's Law

Similar concepts

Similarity Concept
Low frequency words
High frequency words
Theory of measurement
Word type
Automatic content analysis
Fallout
Automatic text analysis
Conflation algorithm
Zipfian distribution
Document frequency weighting

Pages with this concept

Similarity Page Snapshot
188 In basing a theory of evaluation on the theory of measurement, is it possible to devise a measure of effectiveness not starting with precision and recall but simply with the set of relevant documents and the set of retrieved documents? If so, can we generalise such a measure to take account of degree of relevance? An alternative derivation of an E type measure could be done in terms of recall and fallout ... Up to now the measurement of effectiveness has proved fairly intractable to statistical analysis ... I think the Robertson model described in Chapter 7 goes some way to being considered as a reasonable statistical model ... There may be laws of retrieval, such as the well-known trade-off between precision and recall, that are worth establishing either empirically or by theoretical argument ... 6 ... There is a need for more intensive research into the problems of what to use to represent the content of documents in a computer ... Information retrieval systems, both operational and experimental, have been keyword based ... The major reason for this rather simple-minded approach to document retrieval is a very good one ...
15 linguistics in information science ... The chapter therefore starts with the original ideas of Luhn, on which much of automatic text analysis has been built, and then goes on to describe a concrete way of generating document representatives ... Luhn's ideas In one of Luhn's [6] early papers he states: It is here proposed that the frequency of word occurrence in an article furnishes a useful measurement of word significance ... I think this quote fairly summarises Luhn's contribution to automatic text analysis ... Let f be the frequency of occurrence of various word types in a given position of text and r their rank order, that is, the order of their frequency of occurrence, then a plot relating f and r yields a curve similar to the hyperbolic curve in Figure 2 ...
33 Bibliographic remarks The early work of H ... References ...
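
The rank-frequency relationship quoted in the snapshot for page 15 can be illustrated with a small sketch. Under Zipf's law the product of frequency f and rank r is roughly constant, which is what gives the plot of f against r its hyperbolic shape. The function name rank_frequency, the tokenising rule, and the sample sentence below are assumptions made for illustration only; they do not come from the source, and a toy sample this small is far too short to show the law itself.

```python
from collections import Counter
import re


def rank_frequency(text):
    """Return (rank, word, frequency) triples, most frequent word first."""
    # Assumed tokenisation: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, freq)
            for rank, (word, freq) in enumerate(ordered, start=1)]


# Hypothetical sample text, used only to exercise the function.
sample = "the cat and the dog and the bird saw the fish"
table = rank_frequency(sample)

for rank, word, freq in table:
    # For a large corpus that follows Zipf's law, rank * freq stays
    # roughly constant across the ranking.
    print(rank, word, freq, rank * freq)
```

To look for the hyperbolic curve, run the same function over a long text and compare the rank * freq column across ranks, or plot log f against log r and check for an approximately straight line.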