Concept: Two Poisson model

Similar concepts

Information retrieval system

Information retrieval definition

Operational information retrieval

Data retrieval systems

Retrieval effectiveness

Experimental information retrieval

Probabilistic retrieval

Information measure

Index term

Theory of measurement

Pages with this concept

Snapshot

subsets differing in the extent to which they are about a word w then the distribution of w can be described by a mixture of two Poisson distributions ... here p1 is the probability of a random document belonging to one of the subsets and x1 and x2 are the mean occurrences in the two classes ... Although Harter [31] uses 'function' in his wording of this assumption, I think 'measure' would have been more appropriate ... assumption 1 we can calculate the probability of relevance for any document from one of these classes ... that is used to make the decision whether to assign an index term w that occurs k times in a document ... Finally, although tests have shown that this model assigns sensible index terms, it has not been tested from the point of view of its effectiveness in retrieval ... Discrimination and/or representation: There are two conflicting ways of looking at the problem of characterising documents for retrieval ...
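The excerpt above elides the formulas themselves, so the following is only a minimal sketch of the idea, not the book's exact expressions. It assumes the usual mixture form, with p1 weighting a Poisson of mean x1 and (1 - p1) weighting a Poisson of mean x2, and uses Bayes' rule to get the probability that a document in which w occurs k times belongs to the first class. The function names are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k occurrences under a Poisson with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def two_poisson_pmf(k, p1, x1, x2):
    """Mixture of two Poissons: class 1 with weight p1, class 2 with weight 1 - p1."""
    return p1 * poisson_pmf(k, x1) + (1 - p1) * poisson_pmf(k, x2)


def posterior_class1(k, p1, x1, x2):
    """Bayes-rule probability that a document with k occurrences is in class 1.

    Assumption: this is one plausible quantity on which to base the decision
    to assign w as an index term; the excerpt does not show the actual rule.
    """
    return p1 * poisson_pmf(k, x1) / two_poisson_pmf(k, p1, x1, x2)
```

With x1 larger than x2, the posterior grows with k, so a threshold on it would assign w only to documents in which it occurs often enough. Where that threshold should sit is not something the excerpt specifies.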

Table 1 ...

                       Data Retrieval (DR)    Information Retrieval (IR)
Matching               Exact match            Partial match, best match
Inference              Deduction              Induction
Model                  Deterministic          Probabilistic
Classification         Monothetic             Polythetic
Query language         Artificial             Natural
Query specification    Complete               Incomplete
Items wanted           Matching               Relevant
Error response         Sensitive              Insensitive

... between the two is a vague one ... Let us now take each item in the table in turn and look at it more closely ... The inference used in data retrieval is of the simple deductive kind, that is, a R b and b R c then a R c ... Another distinction can be made in terms of classifications that are likely to be useful ... The query language for DR will generally be of the artificial kind, one with restricted syntax and vocabulary; in IR we prefer to use natural language, although there are some notable exceptions ...
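The "Matching" row of the table can be illustrated with a toy sketch. It assumes documents are plain lists of terms and that "best match" means ranking by the number of shared query terms. Both are simplifications made for illustration; the function names and sample data are hypothetical.

```python
def exact_match(query_terms, docs):
    """DR-style matching: keep only documents containing every query term."""
    query = set(query_terms)
    return [doc_id for doc_id, terms in docs.items() if query <= set(terms)]


def best_match(query_terms, docs):
    """IR-style matching: rank documents by how many query terms they share."""
    query = set(query_terms)
    scored = [(len(query & set(terms)), doc_id) for doc_id, terms in docs.items()]
    # Drop documents with no overlap, then sort by score (descending) and id.
    scored = [(score, doc_id) for score, doc_id in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Hypothetical toy collection.
docs = {
    "d1": ["poisson", "model", "index"],
    "d2": ["poisson", "retrieval"],
    "d3": ["measurement"],
}
```

For the query `["poisson", "model"]`, exact matching keeps only `d1`, while best matching also returns `d2` at a lower rank. A partially matching item is still retrieved, just ranked lower.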